
A Processor Adapted To Detect A Poisoned Input And A Training Method Thereof

Abstract: The present invention proposes a processor (11) adapted to detect a poisoned input from an input to be fed to an AI model (M0), and a training method (200) thereof. The processor (11) is configured to record the behavior of the AI model (M0) in response to a set of pre-determined poisoned samples (DS0). The processor (11) performs a vulnerability assessment based on the recorded behavior, which is used to train a classifier model (12) within the processor (11). The vulnerability assessment comprises determining a threshold of poisoning for which the AI model (M0) misclassifies an input. During training, the classifier is trained to identify a poisoned input based on feature values, feature distributions, and meta-data of the set of pre-determined poisoned inputs.


Patent Information

Application #
202341030698
Filing Date
28 April 2023
Publication Number
44/2024
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Parent Application

Applicants

Bosch Global Software Technologies Private Limited
123, Industrial Layout, Hosur Road, Koramangala, Bangalore – 560095, Karnataka, India
Robert Bosch GmbH
Postfach 30 02 20, D-70442 Stuttgart, Germany

Inventors

1. Manojkumar Somabhai Parmar
#202, Nisarg Apartment, Nr. L G Corner, Maninagar, Ahmedabad, Gujarat 380008, India
2. Yuvaraj Govindarajulu
#816, 16th A Main, 23rd B Cross, Sector-3, HSR Layout, Bengaluru, Karnataka 560102, India
3. Pavan Kulkarni
#33, "Kalpavruksha", 2nd Cross, Shreya Estate, Gokul Road, Hubli, Dharwad Dist., Karnataka, 580030, India

Specification

Description: Complete Specification
The following specification describes and ascertains the nature of this invention and the manner in which it is to be performed

Field of the invention
[001] The present disclosure relates to the field of Artificial Intelligence security. In particular, it proposes a processor adapted to detect a poisoned input from an input to be fed to an AI model, and a training method thereof.

Background of the invention
[002] With the advent of data science, data processing and decision-making systems are implemented using artificial intelligence modules. The artificial intelligence modules use different techniques like machine learning, neural networks, deep learning, etc. Most AI-based systems receive large amounts of data and process the data to train AI models. Trained AI models generate output based on the use cases requested by the user. Typically, AI systems are used in the fields of computer vision, speech recognition, natural language processing, audio recognition, healthcare, autonomous driving, manufacturing, robotics, etc., where they process data to generate the required output based on certain rules/intelligence acquired through training.

[003] To process the inputs and give a desired output, the AI systems use various models/algorithms which are trained using the training data. Once the AI system is trained using the training data, it uses the models to analyze real-time data and generate appropriate results. The models may be fine-tuned in real time based on the results. The models in the AI systems form the core of the system. A great deal of effort, resources (tangible and intangible), and knowledge goes into developing these models.

[004] It is possible that an adversary may try to tamper with, manipulate, or evade the model in AI systems to create incorrect outputs. The adversary may use different techniques to manipulate the output of the model. One of the simplest techniques is where the adversary sends queries to the AI system using his own test data to compute or approximate the gradients through the model. Based on these gradients, the adversary can then manipulate the input in order to manipulate the output of the model. Another technique is where the adversary manipulates the input data to bring about an artificial output. This invention focuses on poisoning. Adversarial data poisoning is an effective attack against machine learning that threatens model integrity by introducing poisoned data into the training dataset. Since the model learns on a poisoned dataset, it is bound to give incorrect results.

[005] This causes hardships to the original developer of the AI in the form of business disadvantages, loss of confidential information, loss of lead time spent in development, loss of intellectual property, loss of future revenues, etc. Data poisoning can render machine learning models inaccurate, possibly resulting in poor decisions based on faulty outputs. Hence there is a need for a method of detecting AI poisoning.

Brief description of the accompanying drawings
[006] An embodiment of the invention is described with reference to the following accompanying drawings:
[007] Figure 1 depicts a processor (11) adapted to detect a poisoned input from an input to be fed to an AI model (M0);
[008] Figure 2 illustrates method steps of training a processor (11) to detect a poisoned input from an input to be fed to the AI model (M0).

Detailed description of the drawings
[009] It is important to understand some aspects of artificial intelligence (AI) technology and AI-based systems. Some important aspects of the AI technology and AI systems can be explained as follows. Depending on the architecture of the implementation, AI systems may include many components. One such component is an AI module. An AI module, with reference to this disclosure, can be explained as a component which runs a model.

[0010] A model can be defined as a reference or an inference set of data, which uses different forms of correlation matrices. Using these models and the data from these models, correlations can be established between different types of data to arrive at some logical understanding of the data. A person skilled in the art would be aware of the different types of AI models (M0), such as linear regression, naïve Bayes classifier, support vector machine, neural networks and the like. It must be understood that this disclosure is not specific to the type of model being executed in the AI module and can be applied to any AI module irrespective of the AI model (M0) being executed. A person skilled in the art will also appreciate that the AI module may be implemented as a set of software instructions, a combination of software and hardware, or any combination of the same.

[0011] Some of the typical tasks performed by AI systems are classification, clustering, regression, etc. The majority of classification tasks depend upon labelled datasets; that is, the datasets are labelled manually in order for a neural network to learn the correlation between labels and data. This is known as supervised learning. Some of the typical applications of classification are face recognition, object identification, gesture recognition, voice recognition, etc. In a regression task, the model is trained based on labelled datasets where the target labels are numeric values. Some of the typical applications of regression are weather forecasting, stock price prediction, house price estimation, energy consumption forecasting, etc. Clustering or grouping is the detection of similarities in the inputs. The clustering techniques do not require labels to detect similarities. Learning without labels is called unsupervised learning. Unlabelled data constitutes the majority of data in the world.

[0012] As the AI module forms the core of the AI system, the module needs to be protected against attacks. AI adversarial threats can be largely categorized into model extraction attacks, inference attacks, evasion attacks, and data poisoning attacks. Inference attacks attempt to infer the training data from the corresponding output or other information leaked by the target model. Studies have shown that it is possible to recover training data associated with arbitrary model outputs. The ability to extract this data further poses data privacy issues. Evasion attacks are the most prevalent kind of attack that may occur during AI system operations. In this method, the attacker works on the AI algorithm's inputs to find small perturbations leading to large modifications of its outputs (e.g., decision errors), which leads to evasion of the AI model (M0). In poisoning attacks, the adversary carefully injects crafted data to contaminate the training data, which eventually affects the functionality of the AI system.

[0013] In Model Extraction Attacks (MEA), the attacker gains information about the model internals through analysis of inputs, outputs, and other external information. Stealing such a model reveals important intellectual property of the organization and enables the attacker to craft other adversarial attacks such as evasion attacks. This attack is initiated through an attack vector. In computing technology, a vector may be defined as a method by which malicious code or a virus propagates itself, such as to infect a computer, a computer system or a computer network. Similarly, an attack vector is defined as a path or means by which a hacker can gain access to a computer or a network in order to deliver a payload or a malicious outcome. A model stealing attack uses a kind of attack vector that can make a digital twin/replica/copy of an AI module.

[0014] This invention primarily focuses on detecting whether or not an input query that is to be fed to an AI model (M0) is poisoned. In poisoning, the attacker's goal is to get their poisoned inputs accepted as training data. AI models (M0) are retrained with newly collected data at certain intervals, depending on their intended use. Since poisoning usually happens over time, and over some number of training cycles, it can be hard to tell when prediction accuracy starts to shift. Hence the aim of this invention is to design a module/hardware that detects and filters poisoned input before it is fed to an AI model (M0) at any stage.

[0015] Figure 1 depicts a processor (11) adapted to detect a poisoned input from an input to be fed to an AI model (M0). The AI model (M0) is deployed in a real-world application, for example voice/face recognition, stock prediction, disease detection. The processor (11) comprises at least a classifier model (12) and is in communication with the AI model (M0). The classifier model (12) is a type of artificial intelligence model configured to classify an input into two or more classes.

[0016] The processor (11) comprises other components known to a person skilled in the art, such as receivers and transmitters. It may be implemented as a function of microcontrollers, firmware, an application-specific integrated circuit (ASIC), and/or a field programmable gate array (FPGA). These various modules/components can either be software embedded in a single chip or a combination of software and hardware where each module and its functionality is executed by separate independent chips connected to each other to function as the system. For example, the classifier model (12) mentioned hereinafter can be software residing in the system or the cloud, or can be embodied within an electronic chip. Alternatively, neural network chips, which are specialized silicon chips that incorporate AI technology and are used for machine learning, may be used.

[0017] As used in this application, the terms "component," "model," and "interface" are intended to refer to a computer-related entity or an entity related to, or that is part of, an operational apparatus with one or more specific functionalities, wherein such entities can be either hardware, a combination of hardware and software, software, or software in execution. As yet another example, interface(s) can include input/output (I/O) components as well as associated processor (11), application, or Application Programming Interface (API) components. Similarly, the classifier model (12) could be hardware, software, a combination of these modules, or could be deployed remotely on a cloud or server. These various modules can either be software embedded in a single chip or a combination of software and hardware where each module and its functionality is executed by separate independent chips connected to each other to function as a system.

[0018] The processor (11) is characterized by its functionality and configuration. The processor (11) is configured to: feed a set of pre-determined poisoned samples (DS0) to the AI model (M0); record the behavior of the AI model (M0) in response to the set of pre-determined poisoned samples (DS0); perform a vulnerability assessment based on the recorded behavior; train a classifier model (12) based on the vulnerability assessment; receive the input to be fed to the AI model (M0); and execute the classifier model (12) to detect poisoned input from the input to be fed to the AI model (M0).
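By way of a non-limiting software illustration only, the following Python sketch mirrors this configuration. The class and function names (PoisonDetectingProcessor, extract_features), the simple summary-statistic features, and the choice of a logistic-regression classifier are assumptions made purely for illustration and do not form part of the claimed processor (11).

import numpy as np
from sklearn.linear_model import LogisticRegression

def extract_features(x):
    # Raw feature values plus simple distribution statistics of one sample.
    x = np.asarray(x, dtype=float).ravel()
    return np.concatenate([x, [x.mean(), x.std(), x.min(), x.max()]])

class PoisonDetectingProcessor:
    # Software analogue of the processor (11); ai_model stands in for the AI model (M0).
    def __init__(self, ai_model):
        self.ai_model = ai_model
        self.classifier = LogisticRegression(max_iter=1000)   # classifier model (12)
        self.recorded_behavior = []

    def feed_poisoned_samples(self, poisoned_samples):
        # Feed the set of pre-determined poisoned samples (DS0) to the AI model (M0).
        return [self.ai_model.predict(sample) for sample in poisoned_samples]

    def record_behavior(self, poisoned_samples, expected_labels):
        # Record whether each poisoned sample makes the AI model (M0) misclassify.
        outputs = self.feed_poisoned_samples(poisoned_samples)
        self.recorded_behavior = [out != exp for out, exp in zip(outputs, expected_labels)]
        return self.recorded_behavior

    def assess_vulnerability(self, poisoning_levels):
        # Vulnerability assessment: the lowest poisoning level that causes misclassification.
        for level, misclassified in sorted(zip(poisoning_levels, self.recorded_behavior)):
            if misclassified:
                return level
        return None   # the model was robust to all tested poisoning levels

    def train_classifier(self, clean_samples, poisoned_samples):
        # Supervised training of the classifier model (12): clean = 0, poisoned = 1.
        X = np.array([extract_features(s) for s in clean_samples + poisoned_samples])
        y = np.array([0] * len(clean_samples) + [1] * len(poisoned_samples))
        self.classifier.fit(X, y)

    def is_poisoned(self, x):
        # Execute the classifier model (12) on an input meant for the AI model (M0).
        return bool(self.classifier.predict(extract_features(x).reshape(1, -1))[0])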

[0019] The vulnerability assessment comprises determining a threshold of poisoning for which the AI model (M0) misclassifies an input. During training, the classifier is trained to identify a poisoned input based on feature values, feature distributions, and meta-data of the set of pre-determined poisoned inputs. Once the classifier is trained, it receives the input to be fed to the AI model (M0) and detects a poisoned input. It filters out the poisoned input, allowing only the clean input to be fed to the AI model (M0).
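Continuing the illustrative sketch above, and assuming hypothetical objects M0, poisoned_set, clean_set, expected_labels and incoming_inputs exist, the trained classifier may be used as a filter so that only clean inputs reach the AI model (M0):

def filter_inputs(processor, incoming_inputs):
    # Only inputs judged clean by the classifier model (12) are forwarded to the AI model (M0).
    return [x for x in incoming_inputs if not processor.is_poisoned(x)]

# Example flow (all objects below are assumed to exist for illustration):
# processor = PoisonDetectingProcessor(ai_model=M0)
# processor.record_behavior(poisoned_set, expected_labels)
# processor.train_classifier(clean_set, poisoned_set)
# clean_only = filter_inputs(processor, incoming_inputs)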

[0020] It should be understood at the outset that, although exemplary embodiments are illustrated in the figures and described below, the present disclosure should in no way be limited to the exemplary implementations and techniques illustrated in the drawings and described below.

[0021] Figure 2 illustrates method steps of training a processor (11) to detect a poisoned input from an input to be fed to the AI model (M0). The processor (11) and the AI model (M0) have been explained in accordance with figure 1. For clarity it is reiterated that said AI model (M0) is in communication with the processor (11) and said processor (11) comprises at least a classifier model (12).

[0022] Method step 201 comprises feeding a set of pre-determined poisoned samples (DS0) to the AI model (M0) by means of the processor (11). The set of pre-determined poisoned samples (DS0) comprises poisoned input queries created through techniques known to a person skilled in the art, such as gradient-based attacks and pixel or patch/pattern attacks. An example of a patch attack is an image of a cat with patches at specific places that induce the AI model (M0) to misclassify it as a dog.
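A toy illustration of how such a set of pre-determined poisoned samples (DS0) might be produced with a patch attack is given below; the patch position, size and pixel value, and the use of random arrays in place of real cat images, are assumptions made purely for illustration:

import numpy as np

def add_patch(image, patch_size=4, value=1.0, top_left=(0, 0)):
    # Return a copy of the image with a square patch stamped at the given position.
    poisoned = np.array(image, dtype=float, copy=True)
    r, c = top_left
    poisoned[r:r + patch_size, c:c + patch_size] = value
    return poisoned

# DS0: patched versions of a few clean images (random arrays stand in for cat images here).
clean_images = [np.random.rand(32, 32) for _ in range(10)]
ds0 = [add_patch(img, patch_size=4) for img in clean_images]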

[0023] Method step 202 comprises recording the behavior of the AI model (M0) in response to the set of pre-determined poisoned samples (DS0); how the AI model (M0) responds to these poisoned queries is analyzed. Method step 203 comprises performing a vulnerability assessment by means of the processor (11) based on the recorded responses. Performing a vulnerability assessment comprises determining a threshold of poisoning for which the AI model (M0) misclassifies an input. Taking a cue from the previous example, this amounts to determining the amount of patching on the image of a cat for which the AI model (M0) misclassifies it as a dog.
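A minimal sketch of such a threshold determination is given below, assuming a hypothetical model object with a predict method; the range of patch sizes swept over is an arbitrary illustrative choice:

import numpy as np

def poisoning_threshold(model, image, true_label, patch_sizes=range(1, 17)):
    # Record the model's response to increasingly large patches and return the smallest
    # patch size at which the model misclassifies the image (the threshold of poisoning).
    for size in patch_sizes:
        poisoned = np.array(image, dtype=float, copy=True)
        poisoned[:size, :size] = 1.0
        prediction = model.predict(poisoned[np.newaxis, ...])[0]
        if prediction != true_label:
            return size
    return None   # the model was robust to all tested patch sizes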

[0024] Method step 204 comprises training the classifier model (12) based on the vulnerability assessment by means of the processor (11). During training, the classifier model (12) is fed with both clean samples and poisoned samples. The poisoned samples are labelled in a supervised manner. This enables the classifier model (12), during training, to learn to identify a poisoned input based on feature values, feature distributions, and meta-data of the set of pre-determined poisoned inputs.
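A minimal sketch of such supervised training is given below; the particular features (summary statistics, an 8-bin histogram standing in for a feature distribution, and a single numeric meta-data field), the toy poisoning pattern, and the choice of a random-forest classifier are assumptions made purely for illustration:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def sample_features(image, metadata):
    # Feature values, a simple feature distribution (histogram) and numeric meta-data.
    flat = np.asarray(image, dtype=float).ravel()
    hist, _ = np.histogram(flat, bins=8, range=(0.0, 1.0), density=True)
    return np.concatenate([[flat.mean(), flat.std(), flat.min(), flat.max()], hist, metadata])

# Clean samples labelled 0, pre-determined poisoned samples labelled 1 (supervised labels).
clean = [np.random.rand(32, 32) for _ in range(50)]
poisoned = [np.clip(img + np.eye(32) * 0.5, 0.0, 1.0) for img in clean]   # toy poisoning pattern
meta = [0.0]                                                               # e.g. an acquisition source id
X = np.array([sample_features(s, meta) for s in clean + poisoned])
y = np.array([0] * len(clean) + [1] * len(poisoned))
classifier_12 = RandomForestClassifier(n_estimators=100).fit(X, y)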

[0025] The classifier model (12) has been trained in accordance with the above method steps. Figure 3 depicts three stages of deployment of the processor (11). In the first stage the processor (11) is configured, i.e. the classifier model (12) within the processor (11) is trained based on the vulnerability assessment. In the second stage the processor (11) is deployed in the training/re-training stage of the AI model (M0). Here it ensures that the AI model (M0) is trained only on a clean dataset (DS0-tested). In the final stage, wherein the AI model (M0) is deployed in a real-world application, the processor (11) acts like a poisoning sniffer/filter that allows only non-poisoned queries to reach the AI model (M0).
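Under the same illustrative assumptions as the earlier sketches, the second and third stages can be pictured as two small filtering helpers wrapped around the trained processor:

def filter_training_set(processor, dataset):
    # Stage 2: keep only samples the processor judges clean, yielding the tested dataset (DS0-tested).
    return [sample for sample in dataset if not processor.is_poisoned(sample)]

def guarded_predict(processor, ai_model, query):
    # Stage 3: poisoning sniffer in front of the deployed AI model (M0);
    # a poisoned query is blocked and never reaches the model.
    if processor.is_poisoned(query):
        return None
    return ai_model.predict(query)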

[0026] A real-world example of the AI model (M0) could be classification, as in facial recognition, or regression, as in stock price prediction. When an attacker gains access to a subset of the training dataset, he could poison the training samples (as little as 1-2%) to create adversarial impacts on the model's performance. The consequences could be deteriorated model performance after training, or hidden backdoors that the attacker could leverage for targeted attacks. These could directly or indirectly affect the performance of the system and even jeopardize the system completely, which could be fatal in mission-critical applications. In facial recognition, the associated system could be security access control for a restricted facility. The poisoned samples are usually secret, visually imperceptible patches embedded in the legitimate samples, whereby person-1, who ideally has no access, could be mapped to full access through poisoned samples. The poisoned samples here would be images of person-1 with embedded patches. The proposed invention attempts to provide a technical solution to such real-world problems.

[0027] It must be understood that while these methodologies describe only a series of steps to accomplish the objectives, they are implemented in the processor (11) and the AI model (M0), which may be hardware or software or a combination thereof. Further, the embodiments explained in the above detailed description are only illustrative and do not limit the scope of this invention. Any modification to the processor (11) adapted to detect a poisoned input and the training method thereof forms a part of this invention. The scope of this invention is limited only by the claims.

Claims: We Claim:
1. A processor (11) adapted to detect a poisoned input from an input to be fed to an AI model (M0), the processor (11) comprising at least a classifier model (12), said processor (11) in communication with the AI model (M0), said processor (11) configured to:
feed a set of pre-determined poisoned samples (DS0) to the AI model (M0);
record the behavior of the AI model (M0) in response to the set of pre-determined poisoned samples (DS0);
perform a vulnerability assessment based on the recorded behavior;
train a classifier model (12) based on the vulnerability assessment;
receive the input to be fed to the AI model (M0);
execute the classifier model (12) to detect poisoned input from the input to be fed to the AI model (M0).

2. The processor (11) adapted to detect a poisoned input as claimed in claim 1, wherein vulnerability assessment comprises determining a threshold of poisoning for which the AI model (M0) misclassifies an input.

3. The processor (11) adapted to detect a poisoned input as claimed in claim 1, wherein the classifier is trained to identify a poisoned input based on feature values, feature distributions, and meta-data of the set of pre-determined poisoned input.

4. A method (200) of training a processor (11) to detect a poisoned input from an input to be fed to an AI model (M0), said AI model (M0) in communication with the processor (11), said processor (11) comprising at least a classifier model (12), the method steps comprising:
feeding (201) a set of pre-determined poisoned samples (DS0) to the AI model (M0) by means of the processor (11);
recording (202) the behavior of the AI model (M0) in response to the set of pre-determined poisoned samples (DS0);
performing (203) a vulnerability assessment by means of the processor (11) based on the recorded responses;
training (204) the classifier model (12) based on the vulnerability assessment by means of the processor (11).

5. The method (200) of training a processor (11) to detect a poisoned input as claimed in claim 4, wherein performing a vulnerability assessment comprises determining a threshold of poisoning for which the AI model (M0) misclassifies an input.

6. The method (200) of training a processor (11) to detect a poisoned input as claimed in claim 4, wherein training the classifier comprises identifying a poisoned input based on feature values, feature distributions, and meta-data of the set of pre-determined poisoned input.

Documents

Application Documents

# Name Date
1 202341030698-POWER OF AUTHORITY [28-04-2023(online)].pdf 2023-04-28
2 202341030698-FORM 1 [28-04-2023(online)].pdf 2023-04-28
3 202341030698-DRAWINGS [28-04-2023(online)].pdf 2023-04-28
4 202341030698-DECLARATION OF INVENTORSHIP (FORM 5) [28-04-2023(online)].pdf 2023-04-28
5 202341030698-COMPLETE SPECIFICATION [28-04-2023(online)].pdf 2023-04-28
6 202341030698-Power of Attorney [30-04-2024(online)].pdf 2024-04-30
7 202341030698-Covering Letter [30-04-2024(online)].pdf 2024-04-30