Abstract: The present disclosure provides a method for defending neural networks against white-box attacks by obfuscating gradients. The method comprises identifying vulnerable components of a neural network based on a susceptibility analysis involving at least one of layer sensitivity, parameter importance, and adversarial perturbation patterns. A targeted gradient masking technique is applied specifically to the identified vulnerable components by selectively suppressing or altering gradient values according to a predetermined masking policy. The masking policy involves conditional rules based on real-time threat analysis. A feedback mechanism is utilized to preserve the training dynamics and convergence properties of the neural network by monitoring and adjusting the learning rate and weight updates in response to the applied gradient masking.
Description
Field of the Invention
Generally, the present disclosure relates to cybersecurity measures for neural networks. Particularly, the present disclosure relates to defending neural networks against white-box attacks by obfuscating gradients.
Background
The background description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
In recent years, neural networks have become pivotal in advancing artificial intelligence, finding applications across diverse fields such as image recognition, natural language processing, and cybersecurity. These sophisticated computational models mimic the human brain's neural structure, enabling machines to learn from vast amounts of data. As reliance on neural networks has increased, so has the importance of securing them against various types of attacks. Among these, white-box attacks, wherein attackers have complete access to the model's architecture and parameters, pose a significant threat. Such attacks can exploit the model's vulnerabilities, leading to unauthorized access, data leakage, or model corruption.
Vulnerability analysis forms the cornerstone of defending neural networks against these threats. It involves a comprehensive assessment of a model to identify components most susceptible to attacks. This assessment may include analyzing layer sensitivity, parameter importance, and patterns in adversarial perturbations. Layer sensitivity refers to the extent to which changes in inputs affect the outputs of specific layers in the network. Parameter importance evaluates which weights and biases in the model significantly impact the network's performance. Adversarial perturbation patterns involve studying how slight, often imperceptible, modifications to input data can deceive the model into making incorrect predictions or classifications.
Gradient masking has emerged as a robust technique to safeguard neural networks by obfuscating the information that attackers can exploit. This strategy involves selectively suppressing or altering the gradient values—derivatives of the loss function with respect to the model parameters—during the training process. By implementing a targeted gradient masking approach, based on a predefined masking policy that incorporates conditional rules derived from real-time threat analysis, the method ensures that the manipulation of gradient information does not compromise the model's integrity.
Moreover, maintaining the integrity of the neural network's learning process is paramount. The application of gradient masking could potentially disrupt the training dynamics and convergence properties of the model. To mitigate this, the deployment of a feedback mechanism is crucial. This mechanism monitors the learning rate and weight updates, adjusting them as necessary to counterbalance the effects of gradient masking. Such adjustments ensure that the network continues to learn effectively, despite the obfuscation of gradient information, thereby preserving the model's accuracy and performance.
Despite these advancements, existing solutions often fall short in providing comprehensive protection. They may not fully account for the complexity of white-box attacks or may degrade the model's performance due to the indiscriminate application of security measures. Furthermore, many strategies do not dynamically adapt to evolving threats, rendering them less effective over time.
In light of the above discussion, there exists an urgent need for solutions that overcome the problems associated with conventional systems and techniques for defending neural networks against white-box attacks by obfuscating gradients.
Summary
The following presents a simplified summary of various aspects of this disclosure in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended to neither identify key or critical elements nor delineate the scope of such aspects. Its purpose is to present some concepts of this disclosure in a simplified form as a prelude to the more detailed description that is presented later.
The following paragraphs provide additional support for the claims of the subject application.
In an aspect, the present disclosure aims to provide a method and system for defending neural networks against white-box attacks through gradient obfuscation. The disclosed method involves identifying vulnerable components of a neural network via susceptibility analysis, which considers layer sensitivity, parameter importance, and adversarial perturbation patterns. A targeted gradient masking technique is applied to these components, where gradient values are selectively suppressed or altered according to a predetermined masking policy based on conditional rules from real-time threat analysis. To maintain the training dynamics and convergence properties of the neural network, a feedback mechanism adjusts the learning rate and weight updates in response to gradient masking.
Furthermore, the method includes enhancing the precision of gradient masking by selectively zeroing out gradient components for parameters within certain vulnerability distribution quantiles, based on historical attack data. Another refinement involves applying a non-linear sigmoid function to gradients to control distortion, with adjustable sigmoid function parameters based on adversarial sample error rates. The method also encompasses dynamically modulating the gradient obfuscation degree based on continuous adversarial pattern assessment, using an adversarial pattern recognition module trained on known attack vectors, and an escalation protocol that intensifies gradient masking in relation to the sophistication level of detected attacks.
An adaptive gradient masking algorithm applies variable masking intensities based on real-time analysis of adversarial perturbation sensitivity, assessing gradient sensitivity through differential geometric measures. This adaptive approach, alongside the strategic implementation of noise injection into gradients via a calibrated stochastic process, aims to enhance model resilience without compromising performance.
The system for implementing these defenses comprises a processor executing instructions for performing susceptibility analysis, applying targeted gradient masking, and adjusting model training dynamics. It includes a memory storing the necessary instructions, a feedback mechanism, an adversarial pattern recognition module for dynamic modulation of obfuscation, and an escalation protocol for response to sophisticated attacks.
Brief Description of the Drawings
The features and advantages of the present disclosure would be more clearly understood from the following description taken in conjunction with the accompanying drawings in which:
FIG. 1 illustrates a method (100) for defending neural networks against white-box attacks, in accordance with the embodiments of the present disclosure.
FIG. 2 illustrates a block diagram of a system (200) for defending neural networks against white-box attacks, in accordance with the embodiments of the present disclosure.
FIG. 3 illustrates an architecture of gradient obfuscation techniques for defending neural networks against white-box attacks, in accordance with the embodiments of the present disclosure.
FIG. 4 illustrates a flow diagram that presents a systematic approach for implementing gradient obfuscation techniques, in accordance with the embodiments of the present disclosure.
Detailed Description
In the following detailed description of the invention, reference is made to the accompanying drawings that form a part hereof, and in which is shown, by way of illustration, specific embodiments in which the invention may be practiced. In the drawings, like numerals describe substantially similar components throughout the several views. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. Other embodiments may be utilized and structural, logical, and electrical changes may be made without departing from the scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims and equivalents thereof.
The use of the terms “a” and “an” and “the” and “at least one” and similar referents in the context of describing the invention (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The use of the term “at least one” followed by a list of one or more items (for example, “at least one of A and B”) is to be construed to mean one item selected from the listed items (A or B) or any combination of two or more of the listed items (A and B), unless otherwise indicated herein or clearly contradicted by context. The terms “comprising,” “having,” “including,” and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to,”) unless otherwise noted. Recitation of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.
Pursuant to the "Detailed Description" section herein, whenever an element is explicitly associated with a specific numeral for the first time, such association shall be deemed consistent and applicable throughout the entirety of the "Detailed Description" section, unless otherwise expressly stated or contradicted by the context.
FIG. 1 illustrates a method (100) for defending neural networks against white-box attacks, in accordance with the embodiments of the present disclosure. This defense mechanism involves a multifaceted approach that includes identifying vulnerable components, applying targeted gradient masking, and utilizing a feedback mechanism. In step (102), identifying vulnerable components of a neural network is based on a susceptibility analysis. The susceptibility analysis encompasses at least one of layer sensitivity, parameter importance, and adversarial perturbation patterns. The term "susceptibility analysis" relates to evaluating various aspects of the neural network to ascertain its vulnerabilities. This analysis aids in pinpointing the components that are most susceptible to white-box attacks, thereby enabling a more focused and effective defense strategy. In step (104), applying a targeted gradient masking technique to the identified vulnerable components involves selectively suppressing or altering gradient values. This approach is guided by a predetermined masking policy that includes conditional rules based on real-time threat analysis. The term "targeted gradient masking technique" refers to a method of selectively manipulating the gradient information during the neural network's training phase. By adjusting the gradients according to specific conditions and threats, the method aims to obfuscate potential attack vectors, making it more challenging for attackers to exploit the network's vulnerabilities. Further in step (106), utilizing a feedback mechanism (206) to preserve the training dynamics and convergence properties of the neural network involves monitoring and adjusting the learning rate and weight updates. The feedback mechanism (206) ensures that despite the application of gradient masking, the neural network continues to learn effectively without significant disruptions to its training process.
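By way of non-limiting illustration, steps (102), (104), and (106) may be sketched as a single defended training step. The NumPy representation, the vulnerability-score array, and the quantile threshold below are hypothetical choices for illustration only and do not limit the disclosed method:

```python
import numpy as np

def defend_step(grads, vulnerability_scores, lr, threshold=0.8):
    """One defended training step: flag vulnerable components (102),
    mask their gradients (104), and compensate the learning rate so
    training dynamics are preserved (106)."""
    # Step (102): susceptibility analysis flags the highest-scoring components.
    vulnerable = vulnerability_scores > np.quantile(vulnerability_scores, threshold)
    # Step (104): targeted masking suppresses the flagged gradient values.
    masked = np.where(vulnerable, 0.0, grads)
    # Step (106): the feedback mechanism scales the learning rate to offset
    # the fraction of the gradient signal that was suppressed.
    frac = vulnerable.mean()
    adjusted_lr = lr / (1.0 - frac) if frac < 1.0 else lr
    return masked, adjusted_lr
```

For instance, with ten parameters of which the two most vulnerable are masked, the learning rate would be scaled by a factor of 1/0.8 to compensate for the lost gradient signal.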
The term "feedback mechanism" denotes a system or process that adjusts the neural network's learning parameters in response to the effects of gradient masking. This adjustment is crucial for maintaining the network's ability to converge to an optimal solution despite potential interference caused by the defense mechanisms.
In an embodiment, the targeted gradient masking technique of the method (100) comprises selectively zeroing out gradient components for parameters located within certain quantiles of a vulnerability distribution, the quantiles being calculated based on historical attack data. This approach enhances the method's precision in defending against white-box attacks by focusing on the most vulnerable aspects of the neural network as determined through a comprehensive analysis of past attack patterns. The selective suppression of gradient values within specific quantiles allows for a nuanced application of gradient masking, minimizing potential disruptions to the neural network's learning process while maximizing the obfuscation of information that could be exploited by attackers. By grounding the selection of quantiles in historical data, the method adapts to the evolving nature of threats, ensuring that the defense mechanism remains robust against new and sophisticated attacks. This strategic application of gradient masking based on vulnerability quantiles significantly bolsters the neural network's defenses, making it more challenging for attackers to identify and exploit weaknesses.
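A non-limiting sketch of this quantile-based masking follows. The function name, the synthetic score array, and the default quantile band are illustrative assumptions; in practice, per this embodiment, the band would be calculated from historical attack data:

```python
import numpy as np

def quantile_mask_gradients(grads, vulnerability_scores, lower_q=0.75, upper_q=1.0):
    """Zero out gradient components whose vulnerability score falls inside
    the [lower_q, upper_q] quantile band of the vulnerability distribution."""
    lo = np.quantile(vulnerability_scores, lower_q)
    hi = np.quantile(vulnerability_scores, upper_q)
    # Parameters inside the vulnerable band have their gradients suppressed;
    # all other gradients pass through unchanged.
    in_band = (vulnerability_scores >= lo) & (vulnerability_scores <= hi)
    return np.where(in_band, 0.0, grads), in_band
```

With scores 0 through 7 and a 0.75-1.0 band, only the two highest-scoring parameters would have their gradients zeroed, leaving the rest of the update intact.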
In another embodiment, the targeted gradient masking technique of the method (100) includes applying a non-linear sigmoid function to the gradients to distort the gradient trajectory in a controlled manner. The parameters of the sigmoid function are adjustable based on the classification error rate of adversarial samples. This incorporation of a non-linear function serves to obfuscate the gradients in a way that is dynamically responsive to the threat environment. By adjusting the parameters of the sigmoid function in relation to the observed effectiveness of adversarial samples, the method ensures that the gradient masking is both adaptive and optimally disruptive to potential attackers. This controlled distortion of the gradient trajectory complicates the attacker's ability to derive useful information from the gradients, thereby enhancing the security of the neural network. The adaptability of the sigmoid function parameters according to real-time feedback on attack patterns represents a significant advance in the art of neural network defense, ensuring that the gradient masking remains effective under varied and evolving attack scenarios.
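The sigmoid-based distortion of this embodiment may be sketched as follows. The zero-centred form, the update rule, and all default parameter values are illustrative assumptions rather than mandated choices:

```python
import numpy as np

def sigmoid_distort(grads, steepness=1.0, scale=1.0):
    """Pass gradients through a zero-centred, scaled sigmoid so the
    gradient trajectory is distorted in a controlled, bounded manner."""
    return scale * (2.0 / (1.0 + np.exp(-steepness * grads)) - 1.0)

def adjust_steepness(steepness, adv_error_rate, target=0.5, gain=0.1):
    """Nudge the steepness parameter toward stronger distortion when the
    classification error rate on adversarial samples exceeds a target."""
    return max(0.1, steepness + gain * (adv_error_rate - target))
```

The distorted gradients remain bounded by the scale parameter, so the obfuscation never produces unbounded updates, while the steepness adapts to observed adversarial success.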
In a further embodiment, the method (100) includes dynamically modulating the degree of gradient obfuscation. This modulation is based on a continuous assessment of adversarial attack patterns, utilizing an adversarial pattern recognition module (208) trained on a dataset comprising known adversarial attack vectors. Additionally, an escalation protocol (210) increases the intensity of gradient masking in direct correlation to the sophistication level of detected adversarial attacks. This dynamic modulation and escalation protocol provide a responsive and proactive defense mechanism that adapts to the threat landscape. By leveraging a trained adversarial pattern recognition module, the method continuously evaluates the threat environment, allowing for the gradient masking intensity to be adjusted in real-time. The escalation protocol ensures that as attacks increase in sophistication, the defensive measures are correspondingly intensified, maintaining a robust defense against a wide spectrum of white-box attacks. This approach not only enhances the resilience of neural networks to advanced attacks but also preserves the efficacy of the defense over time, despite the continuous evolution of attack methodologies.
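The escalation protocol (210) may be expressed as a simple mapping from detected sophistication to added masking intensity; the discrete levels and surcharge values below are hypothetical and would, in practice, be driven by the adversarial pattern recognition module (208):

```python
def escalate_masking(base_intensity, sophistication_level, surcharge=(0.0, 0.25, 0.5, 1.0)):
    """Escalation protocol: map a detected attack-sophistication level
    (0 = benign probing, 3 = highly sophisticated) to an additional
    masking intensity, clamping the total to the [0, 1] range."""
    idx = max(0, min(int(sophistication_level), len(surcharge) - 1))
    return min(1.0, base_intensity + surcharge[idx])
```

Clamping the total intensity ensures that even the most aggressive escalation never suppresses more than the full gradient signal.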
In yet another embodiment, dynamically modulating the degree of gradient obfuscation includes employing a stochastic process for injecting noise into the gradients. This process is calibrated using a probabilistic model that predicts the impact of noise on model resilience and performance. This technique introduces an additional layer of complexity to the defense mechanism, leveraging randomness to further obfuscate the information attackers seek to exploit. The calibration of this stochastic process through a probabilistic model ensures that the injection of noise does not unduly impair the neural network's performance. By carefully balancing the need for security with the imperative of maintaining high performance, this approach optimizes the resilience of the neural network against attacks. The strategic injection of noise represents an innovative means of complicating the task of potential attackers, making it significantly more difficult to decipher the true gradient information or to exploit the neural network's vulnerabilities.
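The stochastic noise injection and its calibration may be sketched as below. The Gaussian noise model and the accuracy-budget rule are illustrative assumptions; this embodiment contemplates a probabilistic model predicting the noise impact, for which the simple accuracy-drop check stands in:

```python
import numpy as np

def inject_gradient_noise(grads, noise_scale, rng=None):
    """Stochastic obfuscation: add zero-mean Gaussian noise to the gradients."""
    rng = np.random.default_rng(0) if rng is None else rng
    return grads + rng.normal(0.0, noise_scale, size=grads.shape)

def calibrate_noise_scale(clean_accuracy, noisy_accuracy, scale,
                          max_drop=0.02, shrink=0.9):
    """Shrink the noise scale whenever the observed accuracy drop exceeds
    the tolerated budget; otherwise keep the current scale."""
    return scale * shrink if (clean_accuracy - noisy_accuracy) > max_drop else scale
```

This keeps the randomness strong enough to obscure the true gradients while bounding its cost to model performance.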
In a further embodiment, the method (100) comprises deploying an adaptive gradient masking algorithm that applies variable masking intensities based on a real-time adversarial perturbation sensitivity analysis. This analysis assesses gradient sensitivity through differential geometric measures. The adaptive nature of the gradient masking algorithm ensures that the defense mechanism is finely tuned to the current threat environment, adjusting the intensity of masking in response to the detected sensitivity of the neural network to adversarial perturbations. By employing differential geometric measures to evaluate gradient sensitivity, the method gains a nuanced understanding of how different components of the neural network are affected by potential attacks. This understanding allows for a highly targeted application of gradient masking, optimizing the defense mechanism's effectiveness while minimizing any negative impact on the neural network's learning process. The deployment of an adaptive gradient masking algorithm signifies a sophisticated approach to neural network defense, offering a dynamic and highly effective means of safeguarding against the ever-evolving landscape of white-box attacks.
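One way to approximate such a sensitivity analysis is sketched below. The finite-difference probe is a crude, illustrative stand-in for the differential geometric measures contemplated by this embodiment, and the score-to-intensity mapping is a hypothetical choice:

```python
import numpy as np

def numerical_grad(f, x, eps=1e-5):
    """Central-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        d = np.zeros_like(x)
        d[i] = eps
        g[i] = (f(x + d) - f(x - d)) / (2.0 * eps)
    return g

def sensitivity_score(f, x, probe=1e-2):
    """Curvature-like sensitivity: norm of the change in the gradient under
    a small fixed input perturbation, normalised by the probe size."""
    g0 = numerical_grad(f, x)
    g1 = numerical_grad(f, x + probe)
    return float(np.linalg.norm(g1 - g0) / probe)

def masking_intensity(score, lo=0.1, hi=10.0):
    """Map the sensitivity score into a [0, 1] masking intensity."""
    return float(np.clip((score - lo) / (hi - lo), 0.0, 1.0))
```

Components whose gradients change sharply under small perturbations would thus receive proportionally stronger masking.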
The term “system” as used throughout the present disclosure relates to an arrangement configured to defend neural networks against white-box attacks. The system comprises several components each serving a distinct function in enhancing the security of neural networks.
The term “processor” as used throughout the present disclosure relates to a hardware component tasked with executing instructions. In the context of the present system, the processor executes instructions for performing susceptibility analysis on components of a neural network. The analysis aims to identify vulnerabilities by examining at least one of layer sensitivity, parameter importance, and adversarial perturbation patterns. Such analysis is crucial for determining which parts of the neural network are most susceptible to attacks, thereby guiding the deployment of protective measures.
The term “memory” as used throughout the present disclosure denotes a storage medium coupled to the processor. The memory stores instructions for applying a targeted gradient masking technique to the identified vulnerable components of the neural network. This technique involves selectively suppressing or altering gradient values based on a predetermined masking policy that incorporates conditional rules derived from real-time threat analysis. The ability to dynamically adjust the gradient masking approach in response to evolving threats plays a pivotal role in maintaining the integrity and security of the neural network.
The term “feedback mechanism” as used throughout the present disclosure refers to a system operatively coupled to the processor. This mechanism is responsible for preserving the training dynamics and convergence properties of the neural network by monitoring and adjusting the learning rate and weight updates in response to the applied gradient masking. Through this feedback loop, the neural network can continue to learn and adapt, even as measures are taken to obscure gradient information from potential attackers.
The term “adversarial pattern recognition module” as used throughout the present disclosure pertains to a module configured to dynamically modulate the degree of gradient obfuscation. This modulation is based on a continuous assessment of adversarial attack patterns, with the module being trained on a dataset comprising known adversarial attack vectors. By continuously analyzing and adapting to new and evolving attack strategies, the module enhances the system’s ability to protect the neural network from sophisticated threats.
The term “escalation protocol” as used throughout the present disclosure describes a protocol within the processor instructions. This protocol is designed to increase the intensity of gradient masking in direct correlation to the sophistication level of detected adversarial attacks. The escalation protocol ensures that as threats become more complex and harder to detect, the system responds by enhancing its defensive mechanisms, thereby providing a robust and scalable defense strategy against a range of white-box attacks.
FIG. 2 illustrates a block diagram of a system (200) for defending neural networks against white-box attacks, in accordance with the embodiments of the present disclosure. Said system comprises a processor (202), a memory (204), a feedback mechanism (206), an adversarial pattern recognition module (208), and an escalation protocol (210). Said processor (202) is configured to execute instructions for performing susceptibility analysis on neural network components to identify vulnerabilities. The execution of such instructions involves the assessment of layer sensitivity, parameter importance, and adversarial perturbation patterns. Said memory (204) is coupled to the processor (202), wherein instructions for applying a targeted gradient masking technique to identified vulnerable components are stored. The gradient masking technique selectively suppresses or alters gradient values in accordance with a predetermined masking policy that includes conditional rules derived from real-time threat analysis. Furthermore, said feedback mechanism (206) is operatively coupled to the processor for preserving the neural network's training dynamics and convergence properties. Adjustments in learning rate and weight updates are monitored and executed in response to the gradient masking applied. In addition, said adversarial pattern recognition module (208) is configured to dynamically modulate the degree of gradient obfuscation based on a continuous assessment of adversarial attack patterns, where such module is trained on a dataset comprising known adversarial attack vectors. Additionally, said escalation protocol (210) is incorporated within the processor instructions to increase the intensity of gradient masking proportional to the detected sophistication level of adversarial attacks, thereby enhancing the neural network's defense capabilities.
In an embodiment, the system (200) enhances its defense mechanism against white-box attacks by incorporating an advanced configuration within the processor (202). This configuration enables the processor to execute instructions for selectively zeroing out gradient components. Such action is targeted at parameters located within certain quantiles of a vulnerability distribution. The determination of these quantiles is based on a thorough analysis of historical attack data, allowing for a refined approach to gradient masking. This method recognizes that not all components of a neural network are equally susceptible to attacks. By focusing on the most vulnerable areas as indicated by past security breaches, the system optimizes its protective measures. This strategic suppression of gradient components significantly reduces the likelihood of successful adversarial exploitation, thereby enhancing the overall security posture of the neural network. The precision with which these measures are applied minimizes any potential impact on the network's learning capabilities, ensuring that its performance remains robust while its vulnerabilities are shielded from attackers.
In another embodiment, the processor (202) within the system (200) employs a sophisticated approach to gradient masking by applying a non-linear sigmoid function to the gradients. This application aims to control the distortion of the gradient trajectory, making it more challenging for attackers to leverage gradient information in their attacks. Crucially, the parameters of the sigmoid function are designed to be adaptable. Adjustments to these parameters are made in response to the classification error rate of adversarial samples, enabling a dynamic and responsive defense mechanism. By fine-tuning the function based on real-time feedback regarding the effectiveness of adversarial attacks, the system ensures that its protective measures are both targeted and flexible. This adaptability is key to maintaining the resilience of the neural network against a constantly evolving array of threats, as it allows the system to optimize the balance between security and learning efficiency.
In a further embodiment, the adversarial pattern recognition module (208) of the system (200) is equipped with instructions for implementing a stochastic process. This process introduces noise into the gradients, a technique calibrated using a probabilistic model. The model evaluates the impact of noise injection on the neural network's resilience and performance, ensuring that the added noise serves as an effective deterrent against adversarial attacks without compromising the network's functionality. The introduction of noise into the gradient calculations adds an additional layer of complexity for attackers, obfuscating the data they would need to craft effective attacks. This stochastic approach to enhancing network security highlights the system's innovative use of randomness and probability theory in defending against sophisticated cyber threats. By carefully calibrating the noise injection to achieve an optimal balance between security and performance, the system maintains its efficacy in neural network training and inference tasks while significantly bolstering its defenses against white-box attacks.
FIG. 3 illustrates an architecture of gradient obfuscation techniques for defending neural networks against white-box attacks, in accordance with the embodiments of the present disclosure. Attack simulation is performed, where a 3D model symbolizes the testing of neural networks against white-box attacks. In a white-box attack scenario, an attacker has complete access to the neural network model, including architecture, inputs, outputs, and weights. This comprehensive access allows the attacker to exploit the model's vulnerabilities. An example of this would be an adversary who has access to the neural network of a facial recognition system, using the detailed knowledge to craft images that the model incorrectly classifies or fails to recognize as faces. The obfuscation techniques aim to obscure the model's gradients, which are often exploited during an attack. Gradient masking involves altering the gradient calculations in a way that confuses attackers, model distillation simplifies the model while preserving its performance, and noise addition introduces random data into the calculations to further obscure the gradients. For instance, in gradient masking, a neural network tasked with image classification might introduce small perturbations to its gradients to prevent an attacker from understanding how different input changes affect the output classifications. In an embodiment, the system may utilize defense strategies, which are proactive measures to safeguard the neural network. Adversarial training and input frequency adjustment are the two primary strategies depicted. Adversarial training involves training the network with both regular and adversarial inputs, thereby improving its resilience to attacks. Input frequency adjustment may involve modifying the input data to enhance model robustness, such as altering the frequency components of input images to a neural network to mitigate the effects of adversarial noise.
For example, a voice recognition system could be exposed to various voice manipulations during training, thus preparing it to resist attempts to deceive it using synthetic or altered voice samples.
FIG. 4 illustrates a flow diagram that presents a systematic approach for implementing gradient obfuscation techniques, in accordance with the embodiments of the present disclosure. The process starts with the identification of attack vectors, which are the possible ways through which an attacker could exploit the neural network. For example, an attack vector might be the inputs to a neural network used for autonomous driving, where slight alterations to road sign images could mislead the network into misinterpreting them. Subsequently, the network's architecture is analyzed to determine its vulnerabilities, for example, analyzing a financial fraud detection neural network to identify the layers and nodes that are most sensitive to manipulation. Once vulnerabilities are identified, gradient masking is implemented to protect those areas. In practice, this could involve modifying a speech recognition system's gradients such that the system becomes less sensitive to small, deliberate variations in speech patterns designed to evade detection. The modified network is then tested against white-box attacks. If the obfuscation is successful and the network withstands the simulated attacks, the network can be deployed. However, if the network requires further refinement, the process cycles back to the implementation phase. An example of this iterative process might involve a biometric authentication system that, after initial obfuscation, is still vulnerable to synthetic fingerprints, necessitating additional rounds of obfuscation and testing.
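The iterative implement-test-refine cycle of FIG. 4 may be sketched as a simple loop. The scalar masking level, the attack-success-rate callback, and the deployment threshold below are hypothetical simplifications for illustration:

```python
def harden(initial_level, attack_success_rate, target=0.05, step=0.1, max_rounds=10):
    """Iterative hardening loop: test the obfuscated network against
    simulated white-box attacks and escalate the masking level until the
    attack success rate falls below the deployment target (or rounds run out)."""
    level = initial_level
    for _ in range(max_rounds):
        if attack_success_rate(level) <= target:
            return level, True   # defense sufficient: ready to deploy
        level = min(1.0, level + step)  # refine the obfuscation and re-test
    return level, False          # still vulnerable: needs further refinement
```

The boolean flag distinguishes the "deploy" exit of the flow diagram from the "further refinement" branch.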
In an embodiment, the gradient obfuscation techniques are deployed to enhance the robustness of neural network models against adversarial attacks by adding noise or distortion to the gradients during training or inference. The gradient obfuscation improves the model's defenses, making it challenging for attackers to generate effective adversarial examples that could lead to incorrect predictions or misclassifications. White-box attacks, where attackers have complete knowledge of the neural network's parameters and architecture, use gradient-based optimization techniques to craft attacks. Obfuscating gradients reduces the efficacy of such attacks, preserving the security and privacy of systems that process sensitive data. The gradient obfuscation improves the trustworthiness of AI systems in critical sectors by mitigating the risks associated with adversarial attacks. Further, gradient obfuscation includes the introduction of unpredictability in gradient computation, acting as a form of regularization that enhances generalization and prevents overfitting. Additionally, gradient masking masks the model's vulnerabilities by altering or suppressing gradient components, and the degree of noise injection can be varied in response to the threat environment to focus defenses on the most vulnerable parts of the input space, further fortifying the model against adversarial intrusions.
Example embodiments herein have been described above with reference to block diagrams and flowchart illustrations of methods and apparatuses. It will be understood that each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, respectively, can be implemented by various means including hardware, software, firmware, and a combination thereof. For example, in one embodiment, each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations can be implemented by computer program instructions. These computer program instructions may be loaded onto a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions which execute on the computer or other programmable data processing apparatus create means for implementing the functions specified in the flowchart block or blocks.
Throughout the present disclosure, the term ‘processing means’ or ‘microprocessor’ or ‘processor’ or ‘processors’ includes, but is not limited to, a general purpose processor (such as, for example, a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a microprocessor implementing other types of instruction sets, or a microprocessor implementing a combination of types of instruction sets) or a specialized processor (such as, for example, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), or a network processor).
The term “non-transitory storage device” or “storage” or “memory,” as used herein, relates to a random access memory, a read only memory, and variants thereof, in which a computer can store data or software for any duration.
Operations in accordance with a variety of aspects of the disclosure described above need not be performed in the precise order described. Rather, various steps can be handled in reverse order, simultaneously, or not at all.
While several implementations have been described and illustrated herein, a variety of other means and/or structures for performing the function and/or obtaining the results and/or one or more of the advantages described herein may be utilized, and each of such variations and/or modifications is deemed to be within the scope of the implementations described herein. More generally, all parameters, dimensions, materials, and configurations described herein are meant to be exemplary, and the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific implementations described herein. It is, therefore, to be understood that the foregoing implementations are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, implementations may be practiced otherwise than as specifically described and claimed. Implementations of the present disclosure are directed to each individual feature, system, article, material, kit, and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials, kits, and/or methods, if such features, systems, articles, materials, kits, and/or methods are not mutually inconsistent, is included within the scope of the present disclosure.
Claims
I/We Claim:
1. A method (100) for defending neural networks against white-box attacks by obfuscating gradients, the method (100) comprising:
a) identifying vulnerable components of a neural network based on a susceptibility analysis involving at least one of: layer sensitivity, parameter importance, and adversarial perturbation patterns;
b) applying a targeted gradient masking technique specifically to the identified vulnerable components by selectively suppressing or altering gradient values according to a predetermined masking policy, where the masking policy involves conditional rules based on real-time threat analysis; and
c) utilizing a feedback mechanism (206) to preserve the training dynamics and convergence properties of the neural network by monitoring and adjusting the learning rate and weight updates in response to the applied gradient masking.
2. The method (100) of claim 1, wherein the targeted gradient masking technique comprises selectively zeroing out gradient components for parameters located within certain quantiles of a vulnerability distribution, the quantiles being calculated based on historical attack data.
3. The method (100) of claim 1, wherein the targeted gradient masking technique includes applying a non-linear sigmoid function to the gradients to distort the gradient trajectory in a controlled manner, with parameters of the sigmoid function being adjustable based on the classification error rate of adversarial samples.
4. The method (100) of claim 1, further comprising:
a) dynamically modulating the degree of gradient obfuscation based on a continuous assessment of adversarial attack patterns, where the assessment utilizes an adversarial pattern recognition module (208) trained on a dataset comprising known adversarial attack vectors; and
b) implementing an escalation protocol (210) that increases the intensity of gradient masking in direct correlation to the sophistication level of detected adversarial attacks.
5. The method (100) of claim 4, wherein dynamically modulating includes a stochastic process for injecting noise into the gradients, the process calibrated using a probabilistic model that predicts the impact of noise on model resilience and performance.
6. The method (100) of claim 1, further comprising deploying an adaptive gradient masking algorithm that applies variable masking intensities based on a real-time adversarial perturbation sensitivity analysis, which assesses gradient sensitivity through differential geometric measures.
7. A system (200) for defending neural networks against white-box attacks, comprising:
a) a processor (202) configured to execute instructions for performing susceptibility analysis on neural network components to identify vulnerabilities based on at least one of layer sensitivity, parameter importance, and adversarial perturbation patterns;
b) a memory (204) coupled to the processor (202), storing instructions for applying a targeted gradient masking technique to the identified vulnerable components by selectively suppressing or altering gradient values according to a predetermined masking policy that incorporates conditional rules based on real-time threat analysis;
c) a feedback mechanism (206) operatively coupled to the processor for preserving the neural network's training dynamics and convergence properties by monitoring and adjusting learning rate and weight updates in response to the gradient masking applied;
d) an adversarial pattern recognition module (208) configured to dynamically modulate the degree of gradient obfuscation based on a continuous assessment of adversarial attack patterns, trained on a dataset comprising known adversarial attack vectors; and
e) an escalation protocol (210) within the processor instructions that increases the intensity of gradient masking proportionally to the sophistication level of detected adversarial attacks.
8. The system (200) of claim 7, wherein the processor (202) is further configured to execute instructions for selectively zeroing out gradient components for parameters located within certain quantiles of a vulnerability distribution, which are computed based on historical attack data.
9. The system (200) of claim 7, wherein the processor applies a non-linear sigmoid function to gradients to control the distortion of the gradient trajectory, with the sigmoid function's parameters being adaptable based on the classification error rate of adversarial samples.
10. The system (200) of claim 7, wherein the adversarial pattern recognition module (208) includes instructions for implementing a stochastic process that injects noise into the gradients, calibrated using a probabilistic model that evaluates the effect of noise on the neural network's resilience and performance.
NEURAL NETWORK DEFENSE MECHANISMS EMPLOYING GRADIENT OBFUSCATION AGAINST WHITE-BOX INTRUSIONS
| # | Name | Date |
|---|---|---|
| 1 | 202421033180-OTHERS [26-04-2024(online)].pdf | 2024-04-26 |
| 2 | 202421033180-FORM FOR SMALL ENTITY(FORM-28) [26-04-2024(online)].pdf | 2024-04-26 |
| 3 | 202421033180-FORM 1 [26-04-2024(online)].pdf | 2024-04-26 |
| 4 | 202421033180-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [26-04-2024(online)].pdf | 2024-04-26 |
| 5 | 202421033180-EDUCATIONAL INSTITUTION(S) [26-04-2024(online)].pdf | 2024-04-26 |
| 6 | 202421033180-DRAWINGS [26-04-2024(online)].pdf | 2024-04-26 |
| 7 | 202421033180-DECLARATION OF INVENTORSHIP (FORM 5) [26-04-2024(online)].pdf | 2024-04-26 |
| 8 | 202421033180-COMPLETE SPECIFICATION [26-04-2024(online)].pdf | 2024-04-26 |
| 9 | 202421033180-FORM-9 [07-05-2024(online)].pdf | 2024-05-07 |
| 10 | 202421033180-FORM 18 [08-05-2024(online)].pdf | 2024-05-08 |
| 11 | 202421033180-FORM-26 [12-05-2024(online)].pdf | 2024-05-12 |
| 12 | 202421033180-FORM 3 [13-06-2024(online)].pdf | 2024-06-13 |
| 13 | 202421033180-RELEVANT DOCUMENTS [01-10-2024(online)].pdf | 2024-10-01 |
| 14 | 202421033180-POA [01-10-2024(online)].pdf | 2024-10-01 |
| 15 | 202421033180-FORM 13 [01-10-2024(online)].pdf | 2024-10-01 |
| 16 | 202421033180-FER.pdf | 2025-07-21 |
| 17 | 202421033180-FORM-8 [02-09-2025(online)].pdf | 2025-09-02 |
| 18 | 202421033180-FER_SER_REPLY [02-09-2025(online)].pdf | 2025-09-02 |
| 19 | 202421033180-DRAWING [02-09-2025(online)].pdf | 2025-09-02 |
| 20 | 202421033180-CORRESPONDENCE [02-09-2025(online)].pdf | 2025-09-02 |
| 21 | 202421033180-CLAIMS [02-09-2025(online)].pdf | 2025-09-02 |
| 1 | 202421033180_SearchStrategyNew_E_202421033180E_12-03-2025.pdf | 2025-03-12 |