
A Framework To Assess Vulnerability Of A Neural Network And A Method Thereof

Abstract: The present disclosure proposes a framework (100) to assess vulnerability of a neural network (M) using method steps (200). The framework (100) comprises a processor (20) in communication with the neural network (M). The processor (20) is configured to select a set of neural network segments basket from the plurality of neural network segments basket comprising the neural network (M). A set of attack vectors is then fed to the neural network (M) and to the selected set of neural network segments basket to obtain a first and a second output respectively. The first and the second output are compared to compute a loss function. An alternate set of neural network segments basket is selected until the loss function is below a pre-defined threshold, and the corresponding set is designated as a student model (30). The student model (30) is fed with an input and its output is recorded to assess the vulnerability of the neural network (M). Figure 1.


Patent Information

Application #: 202341037510
Filing Date: 31 May 2023
Publication Number: 49/2024
Publication Type: INA
Invention Field: COMPUTER SCIENCE

Applicants

Bosch Global Software Technologies Private Limited
123, Industrial Layout, Hosur Road, Koramangala, Bangalore – 560095, Karnataka, India
Robert Bosch GmbH
Postfach 30 02 20, D-70442, Stuttgart, Germany

Inventors

1. Manojkumar Somabhai Parmar
#202, Nisarg, Apartment, Nr - L G Corner, Maninagar, Ahmedabad, Gujarat 380008, India
2. Yuvaraj Govindarajulu
#816, 16th A Main, 23rd B Cross, Sector-3, HSR Layout, Bengaluru, Karnataka 560102, India
3. Pavan Kulkarni
#33,"KALPAVRUKSHA",2nd cross, Shreya Estate, Gokul road,Hubli - 580030,Dharwad District, Karnataka, India

Specification

Description: Complete Specification:
The following specification describes and ascertains the nature of this invention and the manner in which it is to be performed

Field of the invention
[0001] The present disclosure relates to the field of Artificial Intelligence security. In particular, it proposes a framework to assess vulnerability of a neural network and a method thereof.

Background of the invention
[0002] With the advent of data science, data processing and decision-making systems are implemented using artificial intelligence modules. The artificial intelligence modules use different techniques like machine learning, neural networks, deep learning, etc. Most AI-based systems receive large amounts of data and process the data to train AI models. Trained AI models generate output based on the use cases requested by the user. Typically, AI systems are used in the fields of computer vision, speech recognition, natural language processing, audio recognition, healthcare, autonomous driving, manufacturing, robotics, etc., where they process data to generate the required output based on certain rules/intelligence acquired through training.

[0003] Neural networks in particular are computing systems inspired by the biological neural networks that constitute animal brains. A neural network is a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons. An artificial neuron receives signals, processes them, and can signal the neurons connected to it. A neural network may comprise a combination of several segments of neural networks (as in the big-little architecture), called the Neural Network Segments Basket (NNSB).
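To make the notion of a segments basket concrete, the following is a minimal sketch (not taken from the specification) in PyTorch, assuming a small fully-connected backbone; the segment boundaries and shapes are purely illustrative.

```python
# Minimal sketch (illustrative only): build a "segments basket" as the set of
# contiguous layer slices of a small backbone; each slice carries part of the
# structure and functionality of the full network.
import torch.nn as nn

backbone = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(),   # candidate segment boundary
    nn.Linear(64, 64), nn.ReLU(),   # candidate segment boundary
    nn.Linear(64, 10),              # classification head
)

layers = list(backbone.children())
segments_basket = [
    nn.Sequential(*layers[i:j])
    for i in range(len(layers))
    for j in range(i + 1, len(layers) + 1)
]
print(len(segments_basket), "candidate segments")
```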

[0004] To process the inputs and give a desired output, AI systems use various models/algorithms which are trained using training data. Once the AI system is trained using the training data, it uses the models to analyze real-time data and generate appropriate results. The models may be fine-tuned in real time based on the results. The models in the AI systems form the core of the system. A lot of effort, resources (tangible and intangible), and knowledge go into developing these models.

[0005] It is possible that some adversary may try to tamper with, manipulate or evade the model in the neural network to create incorrect outputs. The adversary may use different techniques to manipulate the output of the model. One of the simplest techniques is where the adversary sends queries to the neural network using his own test data to compute or approximate the architecture of the neural network. Another technique is where the adversary manipulates the input data to force an incorrect output. This causes hardship to the original developer of the AI in the form of business disadvantages, loss of confidential information, loss of lead time spent in development, loss of intellectual property, loss of future revenues, etc. Hence there is a need to assess the vulnerability of the neural network.
[0006] Methods of defending a neural network are known in the prior art. Patent application US2019156183 AA, titled “Defending neural networks by randomizing model weights”, discloses systems and methods for the selective introduction of low-level pseudo-random noise into at least a portion of the weights used in a neural network model to increase the robustness of the neural network and provide a stochastic transformation defense against perturbation type attacks. Random number generation circuitry provides a plurality of pseudo-random values. Combiner circuitry combines the pseudo-random values with a defined number of least significant bits/digits in at least some of the weights used to provide a neural network model implemented by neural network circuitry. In some instances, selection circuitry selects pseudo-random values for combination with the network weights based on a defined pseudo-random value probability distribution.
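For orientation only, the sketch below illustrates the general idea of such a weight-randomization defense in PyTorch; the noise magnitude, the selection probability and the layer used are assumptions and do not reproduce the circuitry described in the cited application.

```python
# Illustrative sketch of a weight-randomization defense: blend small
# pseudo-random noise into a randomly selected subset of the weights.
# The 1e-4 noise scale and the 50% selection rate are assumed values.
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(32, 10)

with torch.no_grad():
    for p in layer.parameters():
        mask = (torch.rand_like(p) < 0.5).float()          # pseudo-random selection
        noise = torch.empty_like(p).uniform_(-1e-4, 1e-4)  # low-level noise
        p.add_(noise * mask)                                # perturb selected weights
```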

Brief description of the accompanying drawings
[0007] An embodiment of the invention is described with reference to the following accompanying drawings:
[0008] Figure 1 depicts a framework for assessing vulnerability of a neural network (M);
[0009] Figure 2 illustrates method steps of assessing vulnerability of the neural network (M).

Detailed description of the drawings
[0010] A person skilled in the art would be aware of the different types of AI models such as linear regression, naïve Bayes classifier, support vector machine, neural networks and the like. An AI model, with reference to this disclosure, can be defined as a reference or inference set of data which uses different forms of correlation matrices. Using these models and the data from these models, correlations can be established between different types of data to arrive at some logical understanding of the data. It must be understood that this disclosure is specific to the type of model being executed, i.e. neural networks.

[0011] Some of the typical tasks performed by neural networks are classification, clustering, regression, etc. The majority of classification tasks depend upon labeled datasets; that is, the datasets are labelled manually in order for a neural network to learn the correlation between labels and data. This is known as supervised learning. Some of the typical applications of classification are face recognition, object identification, gesture recognition, voice recognition, etc. In a regression task, the neural network is trained on labeled datasets where the target labels are numeric values. Some of the typical applications of regression are weather forecasting, stock price prediction, house price estimation, energy consumption forecasting, etc. Clustering or grouping is the detection of similarities in the inputs. Clustering techniques do not require labels to detect similarities; learning without labels is called unsupervised learning. The majority of data in the world is unlabeled.

[0012] As an AI model such as a neural network forms the core of the AI system, the model needs to be protected against attacks. AI adversarial threats can be largely categorized into model extraction attacks, inference attacks, evasion attacks, and data poisoning attacks. In poisoning attacks, the adversary injects carefully crafted data to contaminate the training data, which eventually affects the functionality of the AI system. Inference attacks attempt to infer the training data from the corresponding output or other information leaked by the target model. Studies have shown that it is possible to recover training data associated with arbitrary model output; the ability to extract this data further poses data privacy issues. Evasion attacks are the most prevalent kind of attack that may occur during AI system operations. In this method, the attacker works on the AI algorithm's inputs to find small perturbations leading to large modifications of its outputs (e.g., decision errors), which leads to evasion of the AI model.

[0013] In Model Extraction Attacks (MEA), the attacker gains information about the model internals through analysis of inputs, outputs, and other external information. Stealing such a model reveals important intellectual property of the organization and enables the attacker to craft other adversarial attacks such as evasion attacks. This attack is initiated through an attack vector. In computing technology, a vector may be defined as a method by which malicious code or a virus propagates itself, such as to infect a computer, a computer system or a computer network. Similarly, an attack vector is defined as a path or means by which a hacker can gain access to a computer or a network in order to deliver a payload or a malicious outcome. A model stealing attack uses a kind of attack vector that can make a digital twin/replica/copy of an AI module.

[0014] The attacker typically generates random queries of the size and shape of the input specification and starts querying the model with these arbitrary queries. This querying produces input-output pairs for the random queries and generates a secondary dataset that is inferred from the pre-trained model. The attacker then takes these I/O pairs and trains a new model from scratch using this secondary dataset. This is a black-box attack vector where no prior knowledge of the original model is required. As more prior information regarding the model becomes available, the attacker moves towards more intelligent attacks.
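The following is a hedged sketch of this black-box extraction flow in PyTorch; the input dimension, the class count, the victim architecture and the training budget are all assumed for illustration and stand in for whatever the deployed model actually exposes.

```python
# Illustrative sketch: query a victim model as a black box with random
# inputs, collect the resulting I/O pairs as a secondary dataset, and
# train a substitute model from scratch on that dataset.
import torch
import torch.nn as nn
import torch.nn.functional as F

victim = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))  # stands in for M

queries = torch.randn(2048, 32)                 # random queries matching the input shape
with torch.no_grad():
    labels = victim(queries).argmax(dim=1)      # observed outputs -> secondary dataset

substitute = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))
opt = torch.optim.Adam(substitute.parameters(), lr=1e-3)
for _ in range(200):                            # train the substitute from scratch
    opt.zero_grad()
    loss = F.cross_entropy(substitute(queries), labels)
    loss.backward()
    opt.step()
```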

[0015] Our aim through this disclosure is to identify segments of the neural network that give the best input/output pairs needed to extract the functionality of the trained neural network.

[0016] Figure 1 depicts a framework (100) for assessing vulnerability of a neural network (M). The framework (100) comprises a processor (20) in communication with the neural network (M). The neural network (M) is incorporated into specialized silicon chips which embed AI technology and are used for machine learning. The neural network (M) comprises a plurality of neural network segments basket. A neural network segment could be one layer, a plurality of layers or a segment of any neural network backbone. A segment possesses a part of the structure and functionality of the neural network (M). The neural network (M) could be in a system comprising other components such as an input-output interface and a blocker module, amongst other components known to a person skilled in the art. The blocker module blocks a user or modifies the output of the neural network (M) when a batch of input queries is determined to be an attack vector. For simplicity, only components having a bearing on the methodology disclosed in the present invention have been elucidated.

[0017] Generally, the processor (20) may be implemented as any or a combination of: one or more microchips or integrated circuits interconnected using a parent board, hardwired logic, software stored by a memory device and executed by a microprocessor (20), firmware, an application specific integrated circuit (ASIC), and/or a field programmable gate array (FPGA).

[0018] The processor (20) is configured to select a set of neural network segments basket from the plurality of neural network segments basket; feed a set of attack vectors to the neural network (M) and the selected set of neural network segments basket to get a first and a second output respectively; compare the first and the second output to compute a loss function; select an alternate set of neural network segments basket when the value of loss function exceeds a pre-determined threshold; designate a student model (30) when the loss function is below a pre-defined threshold based on the selected set of neural network segments basket; feed an input to the student model (30); record the behavior of the student model (30) to assess the vulnerability of the neural network (M).
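A minimal sketch of these configured operations is given below, assuming the outputs are tensors of logits, using a plain MSE as the loss and random selection over the basket; the helper names, the threshold value and the round limit are illustrative, not part of the specification.

```python
# Hedged end-to-end sketch of the configured steps: select a candidate set of
# segments, feed the attack vectors to the victim (M) and to the candidate,
# compare the two outputs via a loss, re-select until the loss is below the
# threshold, designate the student, then probe the student and record its output.
import random
import torch
import torch.nn.functional as F

def assess_vulnerability(victim, segments_baskets, attack_vectors, probe_inputs,
                         loss_threshold=0.05, max_rounds=100):
    student = None
    for _ in range(max_rounds):
        candidate = random.choice(segments_baskets)     # select a set of segments
        with torch.no_grad():
            first = victim(attack_vectors)              # first output (from M)
            second = candidate(attack_vectors)          # second output (from candidate)
        if F.mse_loss(second, first).item() < loss_threshold:
            student = candidate                         # designate the student model (30)
            break
    if student is None:
        return None                                     # no candidate met the threshold
    with torch.no_grad():
        return student(probe_inputs)                    # feed input, record behaviour
```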

[0019] In an embodiment of the present invention, the processor (20) is adapted to segregate the neural network (M) into the plurality of neural network segments basket. In another embodiment, the processor (20) has access to a dynamic range of neural network segments baskets stored in a proprietary database (NNSB). Further, the processor (20) has access to a database of attack vectors (DBA). The processor (20) analyzes the output of the student model (30) relative to the neural network (M) for the said input to assess vulnerability.

[0020] As used in this application, the terms "component," "module," and "interface" are intended to refer to a computer-related entity or an entity related to, or that is part of, an operational apparatus with one or more specific functionalities, wherein such entities can be either hardware, a combination of hardware and software, software, or software in execution. As yet another example, interface(s) can include input/output (I/O) components as well as associated processor (20), application, or Application Programming Interface (API) components. These various modules can either be software embedded in a single chip or a combination of software and hardware where each module and its functionality is executed by separate independent chips connected to each other to function as a system.

[0021] Figure 2 illustrates method steps for assessing vulnerability of a neural network (M). The framework (100) and its components such as the processor (20) and the neural network (M) have been explained in accordance with figure 1.

[0022] Method step 201 comprises selecting a set of neural network segments basket from the plurality of neural network segments basket by means of the processor (20). In an embodiment of the present invention, the processor (20) is adapted to segregate the neural network (M) into the plurality of neural network segments basket. In another embodiment, the processor (20) has access to a dynamic range of neural network segments baskets stored in a proprietary database (NNSB). The selection of the set of neural network segments basket can either be random, based on hit and trial, or be based on the performance of previously tried segments. The NNSB also contains the relative characteristics of the neural network segments, and the selection of segments is based on these characteristics. The database of neural network segments baskets is constantly updated through exploratory research. Segments that perform similarly for several victim models are discarded to reduce redundancy.
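The snippet below sketches what such a selection might look like, assuming the NNSB stores per-basket characteristics such as the average loss observed on previous victim models; the record layout and field names are hypothetical.

```python
# Hypothetical sketch of step 201: either pick a basket at random (hit and
# trial) or pick the basket whose recorded characteristics suggest the best
# performance on earlier victims. The NNSB entries below are made up.
import random

nnsb = [
    {"basket_id": "conv_stem_small", "avg_prev_loss": 0.42},
    {"basket_id": "mlp_wide_head",   "avg_prev_loss": 0.18},
    {"basket_id": "resnet_block_x2", "avg_prev_loss": 0.31},
]

def select_segments(nnsb, explore=False):
    if explore:
        return random.choice(nnsb)                              # hit-and-trial selection
    return min(nnsb, key=lambda entry: entry["avg_prev_loss"])  # characteristic-driven

print(select_segments(nnsb)["basket_id"])   # -> "mlp_wide_head"
```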

[0023] Method step 202 comprises feeding a set of attack vectors to the neural network (M) and the selected set of neural network segments basket to get a first and a second output respectively. The processor (20) has access to a database of attack vectors (DBA).

[0024] Method step 203 comprises comparing the first and the second output to compute a loss function by means of the processor (20). The loss function here is an indication of how distant the outputs (first and second) are from each other. The idea is to minimize this distance in order to extract the architecture of the neural network (M).
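As one possible realization, assuming both outputs are logits over the same set of classes, the distance can be measured with a KL divergence between the output distributions (a plain MSE over the raw outputs would also do); the choice of measure is an assumption, as the specification only requires some notion of distance.

```python
# Sketch of step 203: compute a distance between the victim's output (first)
# and the candidate basket's output (second); smaller means a closer mimic.
import torch
import torch.nn.functional as F

def output_distance(first, second):
    return F.kl_div(F.log_softmax(second, dim=1),   # candidate distribution (log-probs)
                    F.softmax(first, dim=1),        # victim distribution (probs)
                    reduction="batchmean")

first = torch.randn(8, 10)    # victim logits for 8 attack vectors (dummy values)
second = torch.randn(8, 10)   # candidate logits for the same attack vectors
print(output_distance(first, second).item())
```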

[0025] Method step 204 comprises selecting an alternate set of neural network segments basket when the value of the loss function exceeds a pre-determined threshold by means of the processor (20). The selection of the alternate set is random, based on hit and trial. Method steps 203 and 204 are repeated continuously until the loss function is below a pre-defined threshold. Method step 205 comprises designating a student model (30) when the loss function is below a pre-defined threshold, based on the selected set of neural network segments basket. The set of neural network segments basket which gives the least value of the loss function is ideally the closest to the architecture of the neural network (M).

[0026] Method step 206 comprises feeding an input to the student model (30). This input can be a random input or an input from the attack database (DBA), and may or may not be adversarial. Method step 207 comprises recording the behavior of the student model (30) to assess the vulnerability of the neural network (M). Assessing vulnerability comprises analyzing the output of the student model (30) relative to the neural network (M) for the said input.
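One simple, hedged way to analyze the student relative to the victim is the agreement rate of their predictions on the probe inputs, as sketched below; the models, probe shapes and the use of argmax agreement as the metric are assumptions made for illustration. A high agreement rate indicates that the victim's functionality can be replicated through queries, i.e. that it is vulnerable to extraction.

```python
# Sketch of steps 206-207: feed probe inputs to the designated student, record
# its behaviour, and compare it with the victim's behaviour on the same inputs.
import torch
import torch.nn as nn

def agreement_rate(victim, student, probes):
    with torch.no_grad():
        v = victim(probes).argmax(dim=1)    # victim predictions
        s = student(probes).argmax(dim=1)   # recorded student predictions
    return (v == s).float().mean().item()

victim = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))   # stands in for M
student = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))  # designated student
probes = torch.randn(512, 32)                                             # random probe inputs
print(f"prediction agreement: {agreement_rate(victim, student, probes):.2%}")
```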

[0027] The idea behind this invention is to identify the portions of the neural network (M) having the maximum weightage in the output and hence identify the portion of the neural network (M) most vulnerable to adversarial attacks. The method simulates the attack methods which a typical attacker would perform through trial and error. This automated framework explores the potential steps performed by the attacker through a clever choice of the segments. The resulting final set of segments represents the stolen model architecture at the end of extraction. The characteristics of the stolen model (student model) reveal the vulnerabilities of the model under assessment.

[0028] It must be understood that the embodiments explained in the above detailed description are only illustrative and do not limit the scope of this invention. Any modification of the framework (100) and adaptation of the method for assessing vulnerability of the neural network (M) are envisaged and form a part of this invention. The scope of this invention is limited only by the claims.
Claims:

We Claim:
1. A framework (100) to assess vulnerability of a neural network (M), the neural network (M) comprising a plurality of neural network segments basket, said framework (100) comprising at least a processor (20) in communication with the neural network (M), said processor (20) configured to:
select a set of neural network segments basket from the plurality of neural network segments basket;
feed a set of attack vectors to the neural network (M) and the selected set of neural network segments basket to get a first and a second output respectively;
compare the first and the second output to compute a loss function;
select an alternate set of neural network segments basket when the value of loss function exceeds a pre-determined threshold;
designate a student model (30) when the loss function is below a pre-defined threshold based on the selected set of neural network segments basket;
feed an input to the student model (30);
record the behavior of the student model (30) to assess the vulnerability of the neural network (M).

2. The framework (100) to assess vulnerability of a neural network (M) as claimed in claim 1, wherein the processor (20) is adapted to segregate the neural network (M) into the plurality of neural network segments basket.

3. The framework (100) to assess vulnerability of a neural network (M) as claimed in claim 1, wherein the processor (20) has access to a database of attack vectors.

4. The framework (100) to assess vulnerability of a neural network (M) as claimed in claim 1, wherein the processor (20) analyzes the output of the student model (30) relative to the neural network (M) for the said input to assess vulnerability.

5. A method (200) to assess vulnerability of a neural network (M), the neural network (M) comprising a plurality of neural network segments basket, said neural network (M) in communication with a processor (20), the method comprising:
selecting (201) a set of neural network segments basket from the plurality of neural network segments basket by means of the processor (20);
feeding (202) a set of attack vectors to the neural network (M) and the selected set of neural network segments basket to get a first and a second output respectively;
comparing (203) the first and the second output to compute a loss function by means of the processor (20);
selecting (204) an alternate set of neural network segments basket when the value of loss function exceeds a pre-determined threshold by means of the processor (20);
designating (205) a student model (30) when the loss function is below a pre-defined threshold based on the selected set of neural network segments basket;
feeding (206) an input to the student model (30);
recording (207) the behavior of the student model (30) to assess the vulnerability of the neural network (M).

6. The method (200) to assess vulnerability of a neural network (M) as claimed in claim 5, wherein a neural network (M) is segregated into the plurality of neural network segments basket by means of the processor (20).

7. The method (200) to assess vulnerability of a neural network (M) as claimed in claim 5, wherein the processor (20) has access to a database of attack vectors.

8. The method (200) to assess vulnerability of a neural network (M) as claimed in claim 5, wherein assessing vulnerability comprises analyzing output of the student model (30) relative to the neural network (M) for the said input.

Documents

Application Documents

# Name Date
1 202341037510-POWER OF AUTHORITY [31-05-2023(online)].pdf 2023-05-31
2 202341037510-FORM 1 [31-05-2023(online)].pdf 2023-05-31
3 202341037510-DRAWINGS [31-05-2023(online)].pdf 2023-05-31
4 202341037510-DECLARATION OF INVENTORSHIP (FORM 5) [31-05-2023(online)].pdf 2023-05-31
5 202341037510-COMPLETE SPECIFICATION [31-05-2023(online)].pdf 2023-05-31
6 202341037510-Power of Attorney [14-04-2024(online)].pdf 2024-04-14
7 202341037510-Covering Letter [14-04-2024(online)].pdf 2024-04-14