A System And Method For Identifying Security Leaks In Software Systems

Abstract: A system and method for identifying security leaks in software systems using CVE-driven attack scenario synthesis, synthetic data generation, and LLM-powered simulation; wherein the system (100) works through the interaction of a user, input/output devices and a processing unit comprising a CVE (common vulnerabilities and exposures) intelligence core (110), an interface mapper and attack surface analyzer (120), an LLM-based scenario engine (130), a synthetic data and persona generator (140), a distributed simulation engine (150), a leak detection engine (160), a root cause identifier and patch assistant (170), and a feedback loop (180); all the modules work in collaboration to identify security leaks and provide actionable remediations to mitigate them, by importing, collecting, filtering, parsing and categorizing CVE metadata; scanning software systems, performing dynamic tracing and constructing a comprehensive endpoint map; generating realistic, adversarial multi-step attack flows; creating edge cases; chaining CVEs into composite attack graphs; fabricating realistic, malicious user personas and request sequences; launching and orchestrating synthetic requests; detecting semantic anomalies; identifying the root cause; and generating actionable remediations.

Patent Information

Application #

Filing Date

11 April 2025

Publication Number

41/2025

Publication Type

INA

Invention Field

COMPUTER SCIENCE

Status

Parent Application

Applicants

Persistent Systems

Bhageerath, 402, Senapati Bapat Rd, Shivaji Cooperative Housing Society, Gokhale Nagar, Pune - 411016, Maharashtra, India.

Inventors

1. Mr. Nitish Shrivastava

10764 Farallone Dr, Cupertino, CA 95014-4453, United States.

2. Mr. Bharath Mohanraj

6883, Chantel Ct, San Jose, CA 95129, United States.

Specification

Description:
FIELD OF INVENTION:
The present invention relates to cybersecurity and automated security validation. More specifically, it pertains to a system and method for identifying security leaks in software systems using CVE-driven attack scenario synthesis, synthetic data generation, and LLM-powered simulation of edge cases and user behavior to detect vulnerabilities, role-based access violations, and semantic response inconsistencies.

BACKGROUND OF THE INVENTION:
Current software systems are increasingly vulnerable to security threats due to the growing complexity of applications and ever-evolving attack techniques. Vulnerabilities arise from issues such as improper access controls, misconfigurations, weak authentication systems, and unprotected data. To address these risks, businesses often rely on static scanners, rule-based security tools, and manual penetration testing. However, these methods struggle to keep up with the rapidly changing nature of threats. Traditional security testing also fails to simulate real-world attacks effectively, which can lead to undetected security breaches and hidden flaws. Furthermore, existing tools have difficulty generating realistic, large-scale test cases that mirror complex attack patterns, which may result in flawed security evaluations.
To improve security validation, various methods have been introduced. Some businesses rely on predefined security rule sets and vulnerability scanners, which readily identify known weaknesses but often fail to detect emerging threats or logic-based security flaws. Other approaches involve manual red team assessments, which offer deeper insights but are time-consuming, costly, and impractical for continuous security validation. Additionally, some organizations use automated input testing and penetration testing frameworks, but these methods adapt poorly to dynamic attack scenarios and produce high false-positive rates.
Most existing security testing solutions fail to generate realistic attack simulations at scale and don’t adapt well to new threats. While some advanced artificial intelligence-driven tools exist, they are often limited to predefined attack patterns and static vulnerability databases, which makes them ineffective against multi-step attack chains and hidden security flaws. Moreover, the traditional security tools struggle to understand the impact of vulnerabilities across different components in a system, which leads to inefficient threat detection and delayed remediation efforts.
Prior Art:
For instance, US10057065B2 discloses a system for securely storing and utilizing password verification data in multi-user computer systems, utilizing cryptographic hashing to store password hashes. This approach prevents the leakage or exfiltration of password data even in the event of an attacker gaining access to the system. While this method addresses security data protection, it does not focus on the proactive detection of vulnerabilities and security leaks across enterprise systems as our invention does.
US10484365B2 discloses a network security system that employs space-time separated relationships and pseudorandom credentials for data access control in the event of a network breach. The system offers enhanced protection for at-rest data and real-time forensic capabilities. However, this system does not provide a dynamic simulation of security vulnerabilities or the proactive detection of security flaws across different systems, which is a key feature of our LLM-powered simulation framework.
US20240080332A1 outlines a cybersecurity system for gathering and analyzing cybersecurity threats, utilizing anomaly detection, NLP, and network analysis. While it focuses on identifying threats through data analysis, it lacks the proactive, self-adaptive simulation of attack scenarios driven by CVE data and synthetic scenarios, which is central to our invention for detecting security leaks in software systems.
To overcome the aforementioned technical problems and drawbacks of the available methods, the present invention provides an LLM powered system and method for detecting security leaks in software systems.

DEFINITIONS:
The expression “system” used hereinafter in this specification refers to an ecosystem comprising, but not limited to, system for identifying security leaks with input and output devices, processing unit, plurality of mobile devices, a mobile device-based application. It is extended to computing systems like mobile phones, laptops, computers, PCs, and other digital computing devices.
The term “input unit” used hereinafter in this specification refers to, but is not limited to, mobile, laptops, computers, PCs, keyboards, mouse, pen drives or drives.
The term “processing unit” refers to the computational hardware or software that performs the database analysis, generation of graphs, detection of dead code, processing, removal of dead code, and like. It includes servers, CPUs, GPUs, or cloud-based systems that handle intensive computations.
The term “output unit” used hereinafter in this specification refers to hardware or digital tools that present processed information to users including, but not limited to computer monitors, mobile screens, printers, or online dashboards.
The term “CVE intelligence core” used hereinafter in this specification refers to a core component that utilizes structured vulnerability data, such as Common Vulnerabilities and Exposures (CVE), to drive the creation of attack scenarios for detecting security flaws in software systems.
The term “test payload” used hereinafter in this specification refers to the data or code used to simulate a real-world attack or vulnerability, designed to trigger a specific response or reveal a weakness in a system.
The term “LLM-powered simulation framework” used hereinafter in this specification refers to a system that combines CVE-driven attack scenario synthesis, LLM-powered simulation of edge cases and user behavior, synthetic data generation, and distributed testing to detect vulnerabilities, role-based access violations, and semantic response inconsistencies.
The term “synthetic data generation” used hereinafter in this specification refers to the process of creating simulated data to replicate real-world data patterns, which is used in testing and detecting security leaks and vulnerabilities across different software systems.
The term “request sequence” used hereinafter in this specification refers to a sequence of test payloads sent across interfaces, such as an HTTP server; that is, sending a payload and then follow-up payloads until a workflow is complete.
The term “authorization token” or “auth token” used hereinafter in this specification refers to a unique, machine-generated code that verifies a user's identity and grants access to a website, application, service, or API, without requiring repeated login credentials.
The term “adversarial simulation or attack flow” used hereinafter in this specification refers to a method of simulating real-world attack patterns and behaviors to test software systems for potential security breaches and vulnerabilities that could be exploited by malicious actors.
The term “security leaks” used hereinafter in this specification refers to security flaws or vulnerabilities in software systems that could lead to unauthorized access, data exposure, or other breaches of security protocols.
The term “distributed testing” used hereinafter in this specification refers to a method of testing software systems using multiple systems or nodes that work in parallel to simulate a large-scale environment, allowing for comprehensive security validation across distributed architectures.
The term “IAM” or “Identity and Access Management” policy used hereinafter in this specification refers to a document that defines permissions, specifying which actions are allowed or denied on which resources, and is attached to IAM identities (users, groups, or roles) to control access to any resources.
The term “The Common Vulnerability Scoring System (CVSS)” used hereinafter in this specification refers to a technical standard for assessing the severity of vulnerabilities in computing systems, where the scores are calculated based on a formula with several metrics that approximate ease and impact of an exploit.
The term “red team tester” used hereinafter in this specification refers to a cybersecurity expert who thinks like an attacker and who tests the defences of a system by attempting to exploit its vulnerabilities ethically and systematically.
The term “edge-case testing” used hereinafter in this specification refers to testing system behavior under extreme, uncommon, or unexpected conditions.
The term “role-based access violations” used hereinafter in this specification refers to instances where users or systems are able to access resources or data that they are not authorized to access, violating access control mechanisms based on user roles within an organization.
The term “semantic response inconsistencies” used hereinafter in this specification refers to the discrepancies or errors in system responses that arise due to incorrect or incomplete data handling, leading to potential security vulnerabilities or breaches in user intent interpretation.
The term “self-adaptive framework” used hereinafter in this specification refers to a system that continuously refines its testing and detection mechanisms by learning from new attack scenarios and evolving vulnerabilities, enabling real-time updates to security validation processes.

OBJECTS OF THE INVENTION:
The primary object of the invention is to provide a system and method for identifying security leaks in software systems.
Another object of the invention is to provide an LLM-powered simulation framework for detecting security leaks in software systems.
Yet another object of the invention is to enable self-adaptive security validation using CVE-driven attack scenario synthesis, LLM-powered simulation of edge cases, and synthetic data generation.
Yet another object of the invention is to provide a system with self-adaptive framework allowing continuous refinement of test cases.
Yet another object of the invention is to provide a system that enables businesses to detect and mitigate security flaws more efficiently.
Yet another object of the invention is to reduce dependency on manual security assessments and enhance the overall security posture of software systems.

SUMMARY:
Before the present invention is described, it is to be understood that the present invention is not limited to specific methodologies and materials described, as these may vary as per the person skilled in the art. It is also to be understood that the terminology used in the description is for the purpose of describing the particular embodiments only and is not intended to limit the scope of the present invention.
The present invention relates to a system and method for identifying security leaks in software systems using CVE-driven attack scenario synthesis, synthetic data generation, and LLM-powered simulation of edge cases and user behavior to detect vulnerabilities, role-based access violations, and semantic response inconsistencies; comprising of a CVE (common vulnerabilities and exposures) intelligence core, an interface mapper and attack surface analyzer, LLM-based scenario engine, synthetic data and persona generator, distributed simulation engine, leak detection engine, root cause identifier and patch assistant, and a feedback loop; such that all the modules work in collaboration to identify the security leaks and provide actionable remediations to mitigate the leak.
In a preferred aspect of the invention, all the modules work in collaboration to identify the security leaks and provide actionable remediations to mitigate the leak, where the method comprises the steps of: importing and collecting CVE data from various databases; filtering and parsing the CVE metadata; categorizing the metadata; scanning software systems through static analysis and dynamic tracing and constructing a comprehensive endpoint map; generating realistic and adversarial multi-step attack flows, creating edge cases, chaining CVEs into composite attack graphs and translating exploit potential into executable test cases; fabricating realistic and malicious user personas and request sequences; launching and orchestrating synthetic requests; detecting semantic anomalies by LLMs; identifying the root cause; generating actionable remediations; and adding newly discovered exploits back into the test case database.

BRIEF DESCRIPTION OF THE DRAWINGS:
A complete understanding of the present invention may be made by reference to the following detailed description, which is to be taken in conjunction with the accompanying drawing. The accompanying drawing, which is incorporated into and constitutes a part of the specification, illustrates one or more embodiments of the present invention and, together with the detailed description, serves to explain the principles and implementations of the invention.
Fig. 1. illustrates the system and the arrangement of various components of the system for identifying security leaks.
Fig. 2. illustrates the stepwise method employed by the system for identifying security leaks.

DETAILED DESCRIPTION OF THE INVENTION:
Before the present invention is described, it is to be understood that this invention is not limited to methodologies described, as these may vary as per the person skilled in the art. It is also to be understood that the terminology used in the description is for the purpose of describing the particular embodiments only and is not intended to limit the scope of the present invention. Throughout this specification, the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps. The use of the expression “at least” or “at least one” suggests the use of one or more elements or ingredients or quantities, as the use may be in the embodiment of the invention to achieve one or more of the desired objects or results. Various embodiments of the present invention are described below. It is, however, noted that the present invention is not limited to these embodiments, but rather the intention is that modifications that are apparent are also included.
The present invention provides a system and method for identifying security leaks in software systems using CVE-driven attack scenario synthesis, synthetic data generation, and LLM-powered simulation of edge cases and user behavior to detect vulnerabilities, role-based access violations, and semantic response inconsistencies. The system (100) works through the interaction of a user, one or more input and output devices and a processing unit comprising a CVE (common vulnerabilities and exposures) intelligence core (110), an interface mapper and attack surface analyzer (120), an LLM-based scenario engine (130), a synthetic data and persona generator (140), a distributed simulation engine (150), a leak detection engine (160), a root cause identifier and patch assistant (170), and a feedback loop (180); such that all the modules work in collaboration to identify the security leaks and provide actionable remediations to mitigate the leak.
According to an embodiment of the invention, the CVE intelligence core (110) maintains an internal repository of vulnerabilities using data, including metadata, sourced from trusted databases including, but not limited to, NVD, MITRE, ExploitDB, and GitHub security advisories. The said CVE intelligence core (110) enables contextual relevance filtering of CVEs for the system under test by parsing the CVE metadata and categorizing it with reference to a plurality of parameters including, but not limited to:
• CWE classification,
• Exploit vector such as API misuse, injection, SSRF, IDOR,
• Impact such as confidentiality, integrity, availability, and/or
• Affected components such as authentication, file handling, or database queries; whereby the core (110) finally builds test payload libraries.
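The filtering and categorization described above can be pictured with a minimal sketch. It assumes a hypothetical in-memory record type (`CveRecord`) and placeholder CVE identifiers; the specification does not define a record schema, and a real core would read from feeds such as NVD rather than a hard-coded list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CveRecord:
    # Hypothetical schema; the identifiers used below are placeholders, not real CVE entries
    cve_id: str
    cwe: str          # CWE classification
    vector: str       # exploit vector, e.g. "injection", "SSRF", "IDOR"
    impact: str       # "confidentiality", "integrity" or "availability"
    component: str    # affected component, e.g. "authentication"


def filter_relevant(records, components_under_test):
    """Keep only CVEs whose affected component exists in the system under test."""
    return [r for r in records if r.component in components_under_test]


def build_payload_library(records):
    """Group CVE ids by (exploit vector, affected component)."""
    library = defaultdict(list)
    for r in records:
        library[(r.vector, r.component)].append(r.cve_id)
    return dict(library)


records = [
    CveRecord("CVE-0000-0001", "CWE-89", "injection", "confidentiality", "database queries"),
    CveRecord("CVE-0000-0002", "CWE-918", "SSRF", "confidentiality", "file handling"),
    CveRecord("CVE-0000-0003", "CWE-639", "IDOR", "integrity", "authentication"),
]
relevant = filter_relevant(records, {"database queries", "authentication"})
library = build_payload_library(relevant)
```

Here the library is keyed by exploit vector and component only; keying by CWE or impact instead would be an equally plausible choice.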
According to yet another embodiment of the invention, the interface mapper and attack surface analyzer (120) scans software systems through static analysis, which includes source code or API definitions, and dynamic tracing of parameters such as runtime behavior or logs; based on which the interface mapper and attack surface analyzer (120) constructs a comprehensive endpoint map including:
• HTTP routes, parameters, authentication methods
• File upload/download handlers
• User role access paths
• Background jobs or services;
such that each component is tagged with relevant CVEs from the intelligence core (110) thereby identifying potential vulnerability points.
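As an illustration of the endpoint map and its CVE tagging, the following sketch assumes a hypothetical `Endpoint` structure and a precomputed mapping from component name to CVE ids; neither is specified in the description, and the routes are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    # Hypothetical endpoint-map entry
    route: str
    method: str
    auth: str                  # authentication method, e.g. "bearer" or "none"
    component: str             # e.g. "authentication", "file handling"
    roles: tuple = ()          # user roles allowed on this path
    cve_tags: list = field(default_factory=list)


def tag_endpoints(endpoints, cves_by_component):
    """Attach relevant CVE ids to each endpoint and return the tagged ones."""
    for ep in endpoints:
        ep.cve_tags = list(cves_by_component.get(ep.component, []))
    return [ep for ep in endpoints if ep.cve_tags]


endpoint_map = [
    Endpoint("/login", "POST", "none", "authentication", ("guest",)),
    Endpoint("/files/upload", "POST", "bearer", "file handling", ("user", "admin")),
    Endpoint("/health", "GET", "none", "monitoring"),
]
# Placeholder CVE ids, assumed to come from the intelligence core (110)
cves_by_component = {
    "authentication": ["CVE-0000-0003"],
    "file handling": ["CVE-0000-0002"],
}
vulnerable_points = tag_endpoints(endpoint_map, cves_by_component)
```

Endpoints with no matching CVE stay in the map with an empty tag list, so later stages can still exercise them with generic edge cases.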
According to yet another embodiment, the LLM-based scenario engine (130) uses pre-trained or fine-tuned LLMs to generate realistic and adversarial multi-step attack flows, create edge cases such as malformed tokens, invalid headers, partial sessions or out-of-sequence calls, chain CVEs into composite attack graphs, and translate exploit potential into executable test cases. The prompts or inputs used to generate adversarial, edge-case, and chained attack paths are dynamically engineered to simulate the behavior of a skilled red team tester, who tests the defences of the system by attempting to exploit system vulnerabilities.
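One way to read "chaining CVEs into composite attack graphs" is as a search over exploit steps whose outcome enables the next step. The sketch below is an assumption-laden illustration: the `ExploitStep` type, the capability names, and the depth-first search are hypothetical, and the steps are hard-coded where the described engine would have an LLM propose them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExploitStep:
    cve_id: str     # placeholder identifier
    requires: str   # capability the attacker must already hold
    grants: str     # capability gained if the step succeeds


def chain_attacks(steps, start, goal):
    """Depth-first enumeration of step chains leading from start to goal."""
    chains = []

    def walk(capability, path, seen):
        if capability == goal:
            chains.append([s.cve_id for s in path])
            return
        for step in steps:
            # Follow a step only if it is usable now and does not revisit a capability
            if step.requires == capability and step.grants not in seen:
                walk(step.grants, path + [step], seen | {step.grants})

    walk(start, [], {start})
    return chains


steps = [
    ExploitStep("CVE-0000-0001", "anonymous", "user_session"),
    ExploitStep("CVE-0000-0003", "user_session", "other_user_data"),
    ExploitStep("CVE-0000-0002", "user_session", "internal_network"),
]
chains = chain_attacks(steps, "anonymous", "other_user_data")
```

Each returned chain is an ordered list of CVE ids that could be translated into a multi-step test case.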
According to yet another embodiment of the invention, the synthetic data and persona generator (140) generates role-based profiles including, but not limited to admin, guest, user or revoked users, with varied objectives and behaviors; wherein the synthetic data and persona generator (140) fabricates realistic and malicious user personas and request sequences, using metadata comprising of IP address ranges, time zones and device fingerprints, session cookies, auth tokens, file uploads with hidden payloads.
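A small sketch of persona fabrication follows. The `Persona` fields mirror the metadata listed above, but the field names, the token format, and the use of a documentation-only IP range (198.51.100.0/24) are assumptions made for illustration.

```python
import ipaddress
import random
from dataclasses import dataclass

ROLES = ("admin", "guest", "user", "revoked")


@dataclass(frozen=True)
class Persona:
    # Hypothetical persona record
    role: str
    ip: str
    time_zone: str
    device_fingerprint: str
    auth_token: str


def make_personas(count, seed=0, network="198.51.100.0/24"):
    """Generate reproducible synthetic personas, cycling through the roles."""
    rng = random.Random(seed)
    hosts = list(ipaddress.ip_network(network).hosts())
    zones = ("UTC", "Asia/Kolkata", "America/Los_Angeles")
    personas = []
    for i in range(count):
        role = ROLES[i % len(ROLES)]
        # Revoked users carry a stale token so tests can check that it is rejected
        prefix = "stale-" if role == "revoked" else "tok-"
        personas.append(Persona(
            role=role,
            ip=str(rng.choice(hosts)),
            time_zone=rng.choice(zones),
            device_fingerprint=f"{rng.getrandbits(64):016x}",
            auth_token=prefix + f"{rng.getrandbits(32):08x}",
        ))
    return personas


personas = make_personas(8)
```

Seeding the generator keeps a simulation run repeatable, which helps when a detected leak has to be reproduced.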
According to yet another embodiment of the invention, the distributed simulation engine (150) is built using scalable infrastructure such as virtual machines or containers, so as to create and run the simulations using Docker, Kubernetes, or FaaS depending upon the workload; it deploys a massive simulation workload across all mapped endpoints, thereby enabling the engine (150) to launch and orchestrate millions of synthetic requests that simulate stateful interactions across components, such that the server keeps track of the state of each session or interaction, maintains that information based on the user's past requests, and enables the user to return to past sessions. The stateful interactions include the simulation parameters generally sent as payload, including:
• Login sequences
• Privilege escalations
• Resource access races
• Replay attacks; such that every request is traced and responses are logged for analysing what responses are sent by the server and confirming whether the responses expose any personal data or internal information.
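The stateful request sequences and response logging can be sketched at toy scale. The example below assumes an in-memory stand-in for the system under test (`ToyServer`) and a local thread pool in place of the containers or FaaS workers the description mentions; all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyServer:
    """In-memory stand-in for the system under test; it tracks session state."""

    def __init__(self):
        self.sessions = {}

    def handle(self, request):
        action, user = request["action"], request["user"]
        if action == "login":
            self.sessions[user] = f"token-{user}"
            return {"status": 200, "token": self.sessions[user]}
        if action == "logout":
            self.sessions.pop(user, None)
            return {"status": 200}
        if action == "read_profile":
            session = self.sessions.get(user)
            # A replayed token must fail once the session has been closed
            if session is not None and session == request.get("token"):
                return {"status": 200, "body": {"user": user}}
            return {"status": 401}
        return {"status": 400}


def run_sequence(server, user):
    """Login, read, logout, then replay the old token; every exchange is logged."""
    log, token = [], None
    for action in ("login", "read_profile", "logout", "read_profile"):
        response = server.handle({"action": action, "user": user, "token": token})
        token = response.get("token", token)
        log.append((user, action, response["status"]))
    return log


server = ToyServer()
users = [f"user{i}" for i in range(20)]
with ThreadPoolExecutor(max_workers=4) as pool:
    traces = list(pool.map(lambda u: run_sequence(server, u), users))
```

The last step of each sequence is a replay attack: the token issued at login is reused after logout, and a correct server should answer 401.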
According to yet another embodiment of the invention, the leak detection engine (160) compares expected vs. actual responses, wherein the LLMs are enabled to detect semantic anomalies by understanding natural language or structural inconsistencies in responses; such that the leak detection engine (160) applies generative AI-driven analysis, combining rule-based and semantic LLM-powered approaches, to analyze:
• API response content for confidential or internal data
• Cross-user data exposure
• Role-based access violations
• Inconsistent response behaviors.
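The rule-based half of this analysis might look like the sketch below. The regular expressions, the finding labels, and the response shape (a status code plus a dictionary body with an optional `owner` field) are assumptions; the semantic LLM-powered half is not modelled here.

```python
import re

# Hypothetical patterns for internal or confidential data in a response body
INTERNAL_PATTERNS = {
    "private_ip": re.compile(r"\b10\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"),
    "stack_trace": re.compile(r"Traceback \(most recent call last\)"),
    "secret_key": re.compile(r"(?i)\b(api[_-]?key|password|secret)\b"),
}


def detect_leaks(persona, expected_status, response):
    """Compare expected vs. actual response and return a list of finding labels."""
    findings = []
    body = response.get("body") or {}
    if response["status"] != expected_status:
        findings.append("inconsistent_response")
    # A request that should have been denied but succeeded
    if expected_status in (401, 403) and response["status"] == 200:
        findings.append("role_based_access_violation")
    owner = body.get("owner")
    if owner is not None and owner != persona["user"]:
        findings.append("cross_user_data_exposure")
    text = str(body)
    for name, pattern in INTERNAL_PATTERNS.items():
        if pattern.search(text):
            findings.append(f"internal_data:{name}")
    return findings


guest = {"user": "guest1", "role": "guest"}
leaky = {"status": 200, "body": {"owner": "alice", "note": "db at 10.0.3.7, password=hunter2"}}
findings = detect_leaks(guest, 403, leaky)
```

Rules like these catch structural leaks cheaply; responses that pass them could then be handed to an LLM for the semantic comparison the description calls for.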
According to yet another embodiment of the invention, when a leak is detected, the root cause identifier and patch assistant (170) correlates the simulation request to the underlying service, code segment, IAM policy or misconfiguration; uses LLMs to explain the issue in plain language; suggests patches, IAM rule updates, or coding changes to mitigate the leak; scores each issue with an estimated CVSS (Common Vulnerability Scoring System)-style severity, where the CVSS scores approximate the ease and impact of an exploit, ranging from 1 to 10, with 10 being most severe; maps the results to code/config/IAM; and generates actionable remediations such as changing network security system settings, enabling or disabling a port, or blocking a service. The feedback loop (180) adds newly discovered exploits back into the test case database.
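A rough sketch of triage and the feedback loop is given below. The severity function is a toy weighting that lands on the 1 to 10 range described above and is explicitly not the official CVSS formula; the `OWNERSHIP` map, the weights, and the set-based test case database are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    request_id: str
    endpoint: str
    kind: str        # e.g. "cross_user_data_exposure"
    ease: float      # 0.0 (hard to exploit) .. 1.0 (trivial); assumed input
    impact: float    # 0.0 (negligible) .. 1.0 (critical); assumed input


# Hypothetical map from endpoint to the code or policy behind it
OWNERSHIP = {
    "/files/download": "services/files/handlers.py::download (IAM policy: files-read)",
}


def severity(finding):
    """Toy 1-10 score; NOT the official CVSS formula."""
    return round(1 + 9 * (0.4 * finding.ease + 0.6 * finding.impact), 1)


def triage(finding, test_case_db):
    """Correlate a finding to its likely root cause and score it."""
    report = {
        "root_cause": OWNERSHIP.get(finding.endpoint, "unmapped"),
        "severity": severity(finding),
        "kind": finding.kind,
    }
    # Feedback loop: remember the exploit so later runs replay it
    test_case_db.add((finding.endpoint, finding.kind))
    return report


test_case_db = set()
report = triage(
    Finding("req-1", "/files/download", "cross_user_data_exposure", 1.0, 1.0),
    test_case_db,
)
```

The plain-language explanation and patch suggestion steps would sit on top of this report; they are omitted because they depend on the LLM in use.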
According to a preferred embodiment of the invention, the system employs a method for identifying security leaks in software systems, comprising the steps as follows:
- importing and collecting CVE data from various databases such as NVD, MITRE, ExploitDB, and GitHub security advisories;
- filtering and parsing of CVE metadata; followed by categorizing the metadata according to pre-defined parameters using the CVE intelligence core (110);
- scanning software systems through static analysis, dynamic tracing and constructing a comprehensive endpoint map to identify vulnerability points using the interface mapper and attack surface analyzer (120);
- generating realistic and adversarial multi-step attack flows, creating edge cases, chaining CVEs into composite attack graphs and translating exploit potential into executable test cases using the pre-trained LLM-based scenario engine (130);
- fabricating realistic and malicious user personas and request sequences using the synthetic data and persona generator (140);
- launching and orchestrating millions of synthetic requests that simulate stateful interactions across all mapped endpoints using the distributed simulation engine (150);
- detecting semantic anomalies by LLMs thereby understanding natural language or structural inconsistencies in responses using the leak detection engine (160);
- identifying the root cause by correlating simulation requests to underlying service, code segment, IAM policy or misconfiguration, and generating actionable remediations such as changing network security system settings, enabling or disabling a port, or blocking a service, using the root cause identifier and patch assistant (170);
- adding newly discovered exploits back into the test case database through the feedback loop (180).
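The ordering of the steps above can be pictured as a pipeline in which each stage enriches a shared state. The sketch below uses stub stages with invented outputs purely to show the data flow; the stage names and the state dictionary are assumptions, not part of the specification.

```python
from functools import reduce


def run_pipeline(stages, state):
    """Apply each stage in order; every stage takes and returns the state dict."""
    return reduce(lambda acc, stage: stage(acc), stages, state)


# Stub stages named after the method steps; each only records what it would produce
def collect_cves(s):
    return {**s, "cves": ["CVE-0000-0001"]}


def map_endpoints(s):
    return {**s, "endpoints": ["/login"]}


def generate_flows(s):
    return {**s, "flows": [(e, c) for e in s["endpoints"] for c in s["cves"]]}


def simulate(s):
    return {**s, "responses": [{"flow": f, "status": 200} for f in s["flows"]]}


def detect(s):
    # In this stub, any successful response to an attack flow counts as a leak
    return {**s, "leaks": [r["flow"] for r in s["responses"] if r["status"] == 200]}


def feed_back(s):
    return {**s, "test_cases": s["test_cases"] | set(s["leaks"])}


stages = [collect_cves, map_endpoints, generate_flows, simulate, detect, feed_back]
result = run_pipeline(stages, {"test_cases": set()})
```

Passing `result["test_cases"]` into the next run is what would make the loop self-adaptive in this simplified picture.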

According to an embodiment of the invention, the various advantages of the present system and method include proactive security validation without relying on static rule sets or manual testing; the self-adaptive framework allows continuous refinement of test cases, which further ensures that businesses can detect and mitigate security flaws more efficiently, reduce dependency on manual security assessments, and therefore enhance their overall security posture.
While considerable emphasis has been placed herein on the specific elements of the preferred embodiment, it will be appreciated that many alterations can be made and that many modifications can be made in preferred embodiment without departing from the principles of the invention. These and other changes in the preferred embodiments of the invention will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter is to be interpreted merely as illustrative of the invention and not as a limitation.
CLAIMS:
We claim,
1. A system and method for identifying security leaks in software systems; using CVE-driven attack scenario synthesis, synthetic data generation, and LLM-powered simulation;
wherein the system (100) works due to interaction of user, one or more input and output devices and processing unit comprising of a CVE (common vulnerabilities and exposures) intelligence core (110), an interface mapper and attack surface analyzer (120), LLM-based scenario engine (130), synthetic data and persona generator (140), distributed simulation engine (150), leak detection engine (160), root cause identifier and patch assistant (170), and a feedback loop (180);

characterized in that:
the system (100) employs a method for identifying security leaks in software systems, such that all the modules work in collaboration to identify the security leaks and provide actionable remediations to mitigate the leak, where the method comprises the steps of;
- importing and collecting CVE data from various databases such as NVD, MITRE, ExploitDB, and GitHub security advisories;
- filtering and parsing of CVE metadata; followed by categorizing the metadata according to pre-defined parameters using the CVE intelligence core (110);
- scanning software systems through static analysis, dynamic tracing and constructing a comprehensive endpoint map to identify vulnerability points using the interface mapper and attack surface analyzer (120);
- generating realistic and adversarial multi-step attack flows, creating edge cases, chaining CVEs into composite attack graphs and translating exploit potential into executable test cases using the pre-trained LLM-based scenario engine (130);
- fabricating realistic and malicious user personas and request sequences using the synthetic data and persona generator (140);
- launching and orchestrating millions of synthetic requests that simulate stateful interactions across all mapped endpoints using the distributed simulation engine (150);
- detecting semantic anomalies by LLMs thereby understanding natural language or structural inconsistencies in responses using the leak detection engine (160);
- identifying the root cause by correlating simulation requests to underlying service, code segment, IAM policy or misconfiguration, and generating actionable remediations using the root cause identifier and patch assistant (170);
- adding newly discovered exploits back into the test case database through the feedback loop (180).

2. The system and method as claimed in claim 1, wherein the CVE intelligence core (110) maintains an internal repository of vulnerabilities using data, including metadata, sourced from trusted databases including, but not limited to, NVD, MITRE, ExploitDB, and GitHub security advisories; and categorizes them with reference to a plurality of parameters including, but not limited to, CWE classification; exploit vector such as API misuse, injection, SSRF, IDOR; impact such as confidentiality, integrity, availability; and/or affected components such as authentication, file handling, or database queries.

3. The system and method as claimed in claim 1, wherein the interface mapper and attack surface analyzer (120) constructs a comprehensive endpoint map including HTTP routes, parameters, authentication methods; file upload/download handlers; user role access paths; background jobs or services; such that each component is tagged with relevant CVEs from the intelligence core (110) thereby identifying potential vulnerability points.

4. The system and method as claimed in claim 1, wherein the simulations by distributed simulation engine (150) include parameters such as login sequences, privilege escalations, resource access races, and replay attacks; such that every request is traced and responses are logged for analysing what responses are sent by the server and confirm if the responses are exposing any personal data or internal information.

5. The system and method as claimed in claim 1, wherein the leak detection engine (160) uses combined rule-based and semantic LLM-powered approaches to analyze API response content for confidential or internal data, cross-user data exposure, role-based access violations, and inconsistent response behaviors.

6. The system and method as claimed in claim 1, wherein the system provides proactive security validation without relying on static rule sets or manual testing; the self-adaptive framework allows continuous refinement of test cases, which further ensures that businesses can detect and mitigate security flaws more efficiently, reduce dependency on manual security assessments, and therefore enhance their overall security posture.

Dated this 11th day of April, 2025.

Documents

Application Documents

#	Name	Date
1	202521036188-STATEMENT OF UNDERTAKING (FORM 3) [11-04-2025(online)].pdf	2025-04-11
2	202521036188-POWER OF AUTHORITY [11-04-2025(online)].pdf	2025-04-11
3	202521036188-FORM 1 [11-04-2025(online)].pdf	2025-04-11
4	202521036188-FIGURE OF ABSTRACT [11-04-2025(online)].pdf	2025-04-11
5	202521036188-DRAWINGS [11-04-2025(online)].pdf	2025-04-11
6	202521036188-DECLARATION OF INVENTORSHIP (FORM 5) [11-04-2025(online)].pdf	2025-04-11
7	202521036188-COMPLETE SPECIFICATION [11-04-2025(online)].pdf	2025-04-11
8	202521036188-FORM-9 [26-09-2025(online)].pdf	2025-09-26
9	202521036188-FORM 18 [01-10-2025(online)].pdf	2025-10-01
10	Abstract.jpg	2025-10-07