System And Method To Compare Requirements From Documents To Identify Gaps In Software Implementation

Abstract: The present invention describes a system and method to compare requirements from documents to identify gaps in software implementation. The system connects to various software development and project management systems, such as those for requirements and epics and those for pull requests and code, and to documentation repositories. The system comprises connectors for accessing project management tools, code repositories, and documentation; an extraction engine using language models to derive requirements and constraints; an analyser to interpret pull requests; a comparison module to identify implementation gaps; and a user interface to report and validate these gaps. The method for detecting implementation gaps comprises the steps of: connecting to relevant systems, extracting structured requirements, analysing pull request content, comparing requirements with code intent, validating gaps against the codebase, and reporting them for human review. The system leverages machine learning techniques, semantic matching, code search algorithms, and user feedback to iteratively improve precision.

Patent Information

Application #

Filing Date

11 April 2025

Publication Number

41/2025

Publication Type

INA

Invention Field

COMPUTER SCIENCE

Status

Parent Application

Applicants

Persistent Systems

Bhageerath, 402, Senapati Bapat Rd, Shivaji Cooperative Housing Society, Gokhale Nagar, Pune - 411016, Maharashtra, India

Inventors

1. Mr. Nitish Shrivastava

10764 Farallone Dr, Cupertino, CA 95014-4453, United States.

2. Mr. Pradeep Sharma

20200 Lucille Ave Apt 62 Cupertino CA 95014, United States.

3. Mr. Neil Fox

3053 Granville Dr. , Raleigh, NC, United States, 27609

4. Mr. Sanjeev Saxena

20050 Rodrigues Ave., Cupertino, CA, United States, 95014

Specification

Description: FIELD OF INVENTION
The present invention relates to Software Development Lifecycle (SDLC) management. More specifically, it relates to a system and method for comparing requirements from different documentation sources and project management tools with pull requests from code repositories to identify gaps in software implementation.

BACKGROUND
In conventional software development, developers rely heavily on manually interpreting requirements from documentation, which involves carefully reading, understanding, and translating the written requirements into actionable code and test cases. This process is prone to errors due to ambiguity, misinterpretation, difficulty in knowledge transfer, oversight, or changes in requirements that are not clearly tracked or implemented. These errors lead to project delays, and development teams often deliver features that only partially or incorrectly satisfy the original intent. Although code reviews and testing help, they may not catch small mistakes or deviations from the required standards. These small discrepancies, when overlooked, accumulate over time, resulting in rework, technical debt, and potential dissatisfaction among end users. Consequently, development teams struggle to balance efficiency, accuracy, and adaptability while ensuring that the delivered software aligns with both business goals and user expectations.
US9122422B2 describes a set of SDLC resources that can be established, where each is separately addressable through a unique URL and can be managed through a simple set of operations. For example, a set of RESTful operations (GET, POST, PUT, and DELETE) can be used for the operations. Database management technologies can be leveraged for storing and indexing resources, but the underlying database schema for the solution can operate on a resource level, which results in the resources being stored as-is. Thus, storage (even when database based) of resources for the solution can be considered an Internet server exposing a space of URL addressable objects.

US11068651B2 describes a platform for analysing assessment data, including correlating assessment data with learning outcome data, and presenting analysis results through one or more interactive dashboards. A course syllabus and/or other course described information is analyzed to identify expected target outcomes to be achieved by students of the course, and to identify a relative weight of each of the various target outcomes based on their coverage in the syllabus. Assessment data describing a digitally administered assessment is analyzed to identify the categories that are assessed in the various questions of the test. The target outcomes are compared to the assessed categories to determine the extent to which the assessment is assessing each of the target outcomes, the degree of correspondence between the distribution of assessed categories and the distribution of target outcomes, and whether any gaps are present indicating that certain target outcomes are going unassessed or are insufficiently assessed.

Therefore, there is a need for a holistic system that addresses the above drawbacks and provides a solution which improves development compliance, reduces errors, and enables traceability in the software development cycle.
DEFINITIONS:
The expression “system” used hereinafter in this specification refers to an ecosystem comprising, but is not limited to a system with a user, input and output devices, processing unit, plurality of mobile devices, a mobile device-based application to identify dependencies and relationships between diverse businesses, a visualization platform, and output; and is extended to computing systems like mobile, laptops, computers, PCs, etc.
The expression “input unit” used hereinafter in this specification refers to, but is not limited to, mobile, laptops, computers, PCs, keyboards, mouse, pen drives or drives.
The expression “output unit” used hereinafter in this specification refers to, but is not limited to, an onboard output device, a user interface (UI), a display kit, a local display, a screen, a dashboard, or a visualization platform enabling the user to visualize, observe or analyse any data or scores provided by the system.
The expression “processing unit” refers to, but is not limited to, a processor of at least one computing device that optimizes the system.
The expression “large language model (LLM)” used hereinafter in this specification refers to a type of machine learning model designed for natural language processing tasks such as language generation. LLMs are language models with many parameters, and are trained with self-supervised learning on a vast amount of text.
The expression “pull request” used hereinafter in this specification refers to a step to initiate the process of integrating new code changes into the main project repository. Pull requests are sent through git systems, to notify the rest of the team that a branch or fork is ready to be reviewed. By using pull requests, developers can add more features or fix bugs without altering the project’s source code or affecting the user experience. By doing so, they can test and develop code changes at a local machine without fear of disrupting the entire program.
The expression “API” used hereinafter in this specification stands for Application Programming Interface. It is a set of protocols, routines, and tools for building software and applications. An API specifies how software components should interact and allows different software systems to communicate with each other.
The expression “OAuth” used hereinafter in this specification stands for Open Authorization. It is a technological standard that allows a user to authorize one app or service to access another on the user's behalf without divulging private information, such as passwords. OAuth is designed to work with Hypertext Transfer Protocol (HTTP). It uses access tokens to prove that access has been authorized and to allow the app to interact with another service on the user's behalf.
The expression “Document parsing” used hereinafter in this specification refers to the process of analysing a document and extracting information from it, either in a structured or unstructured format. It is essential for businesses that handle large volumes of documents such as invoices, contracts, and forms.

OBJECTS OF THE INVENTION:
The primary object of the present invention is to provide a system and method to compare requirements from documents to identify gaps in software implementation.

Another object of the present invention is to provide a method and system that connects to various software development and project management systems and documentation repositories.

Yet another object of the present invention is to provide a system and method that uses large language models (LLMs) and deterministic parsing to extract structured requirements and peripheral needs from input data.

Yet another object of the present invention is to provide a system and method that leverages machine learning techniques, semantic matching, code search algorithms, and user feedback to iteratively improve precision.

Yet another object of the present invention is to provide a system and method that improves development compliance, reduces errors, and enables traceability.

SUMMARY
Before the present invention is described, it is to be understood that the present invention is not limited to specific methodologies and materials described, as these may vary as per the person skilled in the art. It is also to be understood that the terminology used in the description is for the purpose of describing the particular embodiments only and is not intended to limit the scope of the present invention.

The present invention describes a system and method to compare requirements from documents to identify gaps in software implementation. The proposed invention discloses the system that connects to various software development and project management systems such as Jira, Aha (for requirements and epics), Git, Bitbucket (for pull requests and code), and documentation repositories (like Confluence, SharePoint).

According to an aspect of the present invention, the system first ingests data from requirement sources and product documentation. It uses large language models (LLMs) and deterministic parsing to extract structured requirements and peripheral needs. The system then parses pull request data including code diffs, summaries, and developer comments. The system then identifies gaps by comparing pull request content against requirement expectations. Then the system validates identified gaps by scanning the broader codebase and presents a validated list of gaps to stakeholders for resolution. The system leverages machine learning techniques, semantic matching, code search algorithms, and user feedback to iteratively improve precision.
According to an aspect of the present invention, the computer-implemented method for detecting implementation gaps comprises the steps of: connecting to relevant systems, extracting structured requirements, analysing pull request content, comparing requirements with code intent, validating gaps against the codebase, and reporting them for human review.
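The comparison and reporting steps summarized above can be illustrated with a minimal Python sketch. It is not the implementation of the invention: the `Requirement` and `Gap` types, the `detect_gaps` function, the ticket-style identifiers, and the threshold of 0.2 are all hypothetical, and a simple keyword-overlap (Jaccard) score stands in for the language-model-based comparison.

```python
import json
import re
from dataclasses import asdict, dataclass


@dataclass
class Requirement:
    req_id: str  # hypothetical identifier, e.g. a ticket key
    text: str    # requirement text as extracted from a document


@dataclass
class Gap:
    req_id: str
    score: float  # keyword-overlap score against the pull request


def keywords(text: str) -> set[str]:
    """Lowercased alphanumeric tokens with at least three characters."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if len(t) >= 3}


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection over size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def detect_gaps(requirements: list[Requirement], pr_text: str,
                threshold: float = 0.2) -> list[Gap]:
    """Flag every requirement whose overlap with the pull request is below threshold."""
    pr_keywords = keywords(pr_text)
    gaps = []
    for req in requirements:
        score = jaccard(keywords(req.text), pr_keywords)
        if score < threshold:
            gaps.append(Gap(req.req_id, round(score, 3)))
    return gaps


# Illustrative data only; these requirements and this pull request are made up.
requirements = [
    Requirement("REQ-1", "Add retry logic to payment client"),
    Requirement("REQ-2", "Write audit log entry for every refund"),
]
pr_text = "Add retry logic with backoff to the payment client"

report = json.dumps([asdict(g) for g in detect_gaps(requirements, pr_text)])
print(report)  # [{"req_id": "REQ-2", "score": 0.0}]
```

The JSON output stands in for the structured gap list that would be handed to stakeholders for review. A real deployment would presumably replace the keyword overlap with the semantic matching described in the detailed description.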

BRIEF DESCRIPTION OF DRAWINGS
A complete understanding of the present invention may be had by reference to the following detailed description, which is to be taken in conjunction with the accompanying drawings. The accompanying drawings, which are incorporated into and constitute a part of the specification, illustrate one or more embodiments of the present invention and, together with the detailed description, serve to explain the principles and implementations of the invention.

FIG. 1 illustrates a system architecture diagram that depicts how the system connects various input sources, processes them through modules, and outputs validated implementation gaps in the present invention.
FIG. 2 illustrates a flowchart that depicts the end-to-end data flow from input documents and code to the gap reporting interface in the present invention.

DETAILED DESCRIPTION OF INVENTION:
Before the present invention is described, it is to be understood that this invention is not limited to methodologies described, as these may vary as per the person skilled in the art. It is also to be understood that the terminology used in the description is for the purpose of describing the particular embodiments only and is not intended to limit the scope of the present invention. Throughout this specification, the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps. The use of the expression “at least” or “at least one” suggests the use of one or more elements or ingredients or quantities, as the use may be in the embodiment of the invention to achieve one or more of the desired objects or results. Various embodiments of the present invention are described below. It is, however, noted that the present invention is not limited to these embodiments, but rather the intention is that modifications that are apparent are also included.
The present invention describes a system and method for comparing requirements from different documentation sources and project management tools with pull requests from code repositories to identify gaps in software implementation. The system of the present invention connects to various software development and project management systems such as Jira, Aha (for requirements and epics), Git, Bitbucket (for pull requests and code), and documentation repositories (like Confluence, SharePoint).
The system comprises an input unit, a processing unit and an output unit, wherein the processing unit further comprises a connector module, a requirement extractor module, a pull request analyser module, a gap detection engine module, a validation module and a gap reporting interface module. FIG. 1 illustrates a system architecture diagram that depicts how the system connects various input sources, processes them through modules, and outputs validated implementation gaps. The connector module accesses project management tools, code repositories, and documentation; the requirement extractor module uses language models to derive requirements and constraints; the pull request analyser module interprets pull requests; the gap detection engine module identifies implementation gaps; the validation module validates these gaps against the codebase; and the gap reporting interface module serves as the user interface to report and review these gaps. The modules function in detail as follows:
1. Connector Module: This module interfaces with APIs (Application Programming Interfaces) of various software development and project management systems such as Jira, Aha, Git, Bitbucket, Confluence, and SharePoint. It uses OAuth and secure tokens to access user-specific or organizational data. This module also retrieves tickets, epics, user stories, and linked documentation. The module further extracts pull request metadata, code diffs (code differences), and discussions.
2. Requirement Extractor Module: This module applies LLMs (e.g., GPT-4, LLaMA, Claude) to parse documents, that is, to analyse a document and extract information from it in either a structured or unstructured format. The module then identifies and segments functional and non-functional requirements. It further detects references to architecture, compliance, security, testing, and environment configurations and converts natural language into a structured, queryable format (e.g., JSON, vector embeddings).
3. Pull Request Analyzer Module: This module extracts the summary, changed files, functions, lines of code, and inline comments. It identifies the intent and scope of code changes using LLMs and token classifiers. The module tags the pull requests with topic names, e.g., feature, bug fix, refactoring, or configuration update.
4. Gap Detection Engine: This module compares the embeddings of requirements with the pull request summaries using cosine similarity. It further applies semantic search and deterministic techniques (e.g., regular expressions, keyword matching). The module then uses various methods of text classification such as TF-IDF (Term Frequency-Inverse Document Frequency), BERT (Bidirectional Encoder Representations from Transformers) Score, and Rule-Based Engines to evaluate completeness. The module also flags missing implementation areas (e.g., a required logging module or error handler) as gaps.
5. Validation Module: This module searches the codebase using AST (abstract syntax tree) traversal, function signature detection, and LLM-based code summarizers. It confirms whether the flagged requirements are already met elsewhere in the codebase. The module then generates confidence scores based on evidence and historical training data. A confidence score is a numerical representation of how sure a model is about its prediction, typically a value between 0 and 1 (or 0% and 100%), where a value close to 1 (or 100%) indicates high confidence.
6. Gap Reporting Interface module: This module displays a detailed report with requirement items, pull request mapping, and detected gaps. It also provides visual indicators (e.g., confidence score, severity level). This module allows developers to review, acknowledge, or dismiss findings. It further captures feedback to improve gap detection models.
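The cosine-similarity comparison performed by the Gap Detection Engine can be sketched as follows. This is an illustration under stated assumptions, not the invention's implementation: plain term-frequency vectors stand in for learned embeddings, and the function names, requirement identifiers, and the threshold of 0.3 are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text: str) -> Counter:
    """Term-frequency vector; a simple stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(u: Counter, v: Counter) -> float:
    """Dot product of u and v divided by the product of their Euclidean norms."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty text matches nothing
    return dot / (norm_u * norm_v)


def flag_gaps(requirements: dict[str, str], pr_summary: str,
              threshold: float = 0.3) -> dict[str, float]:
    """Return the requirements whose similarity to the PR summary is below threshold."""
    pr_vec = term_vector(pr_summary)
    scores = {
        req_id: cosine_similarity(term_vector(text), pr_vec)
        for req_id, text in requirements.items()
    }
    return {req_id: score for req_id, score in scores.items() if score < threshold}


# Illustrative data only.
requirements = {
    "REQ-1": "log every failed login attempt",
    "REQ-2": "export report as csv",
}
flagged = flag_gaps(requirements, "add csv export for report")
print(flagged)  # {'REQ-1': 0.0}
```

Here REQ-2 shares three terms with the pull request summary and scores about 0.67, so only REQ-1 is flagged. With real embeddings the same thresholding would apply to dense vectors, and the threshold would presumably be tuned from the reviewer feedback the reporting interface captures.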
According to an embodiment of the present invention, the method for comparing requirements from different documentation sources and project management tools with pull requests from code repositories to identify gaps in software implementation comprises the steps of connecting to relevant systems, extracting structured requirements, analysing pull request content, comparing requirements with code intent, validating gaps against the codebase, and reporting them for human review. As illustrated in FIG. 2, the steps of the method are described in detail below:
Step 1: Document and Code Ingestion: This step is carried out by the connector module, which establishes API connections to all integrated systems. It fetches requirements, design documents, release notes, and pull requests. It also preprocesses the documents (tokenization, OCR, metadata extraction).
Step 2: Requirement Extraction: The Requirement Extractor Module uses large language models (LLMs) to perform text summarization, named entity recognition (NER), and coreference resolution. It then extracts actionable items and maps them to metadata such as owner, version, and date. The output is stored as a structured knowledge graph or semantic vector store.
Step 3: Pull Request Understanding: The Pull Request Analyzer Module parses pull request differences and comments using AST tools and LLMs. It then creates a feature map based on the methods/functions altered or introduced and generates the pull request intent embedding and metadata.
Step 4: Gap Detection: The Gap Detection Engine module matches the pull request intent vector with the requirement embeddings. It then applies similarity scoring algorithms such as Cosine Similarity (for sentence-level embedding match), BERT Score (for context-aware semantic matching) and Jaccard Index (for keyword overlap). It then flags mismatches where the requirement similarity falls below a threshold and labels each as a gap.
Step 5: Codebase Validation: The Validation Module carries out codebase validation by traversing related modules/functions using AST parsing. It then performs a symbolic search to confirm the presence or absence of features flagged as missing. It uses regex (regular expressions, that is, sequences of characters that define a search pattern used in string matching) and static analysis tools to catch compliance gaps (e.g., logging, null checks).
Step 6: Review and Feedback Loop: The Gap Reporting Interface module presents all the gaps through a user interface with options to approve or reject each one. The feedback loop updates the training corpus for the LLMs and the classifier thresholds.
Therefore, the system and method function to automatically compare requirements from documentation and project management tools with pull requests from code repositories and to identify and validate gaps in implementation using a combination of large language models, deterministic methods, and user feedback.
While considerable emphasis has been placed herein on the specific elements of the preferred embodiment, it will be appreciated that many alterations and modifications can be made in the preferred embodiment without departing from the principles of the invention. These and other changes in the preferred embodiments of the invention will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter is to be interpreted merely as illustrative of the invention and not as a limitation.
Claims: We claim,
1. A system and method to compare requirements from documents to identify gaps in software implementation
characterized in that
the system connects to software development and project management systems and documentation repositories;
the system comprises an input unit, a processing unit and an output unit, wherein the processing unit comprises a connector module for accessing project management tools, code repositories, and documentation; a requirement extractor module that uses language models to derive requirements and constraints; a pull request analyser module that interprets pull requests; a gap detection engine module that identifies implementation gaps; a validation module that validates said gaps; and a gap reporting interface module that is the user interface to report the gaps;
and the method comprises the steps of document and code ingestion; requirement extraction; pull request understanding; gap detection; codebase validation; and review and feedback loop.
2. The system and method as claimed in claim 1, wherein the connector module interfaces with Application Programming Interfaces from various software development and project management systems, it uses secure tokens to access user-specific or organizational data and retrieves tickets, epics, user stories, and linked documentation to extract pull request metadata, code differences, and discussions.

3. The system and method as claimed in claim 1, wherein the Requirement Extractor Module applies large language models to parse documents and to identify and segment functional and non-functional requirements, and also detects references to architecture, compliance, security, testing, and environment configurations and converts natural language into a structured, queryable format.

4. The system and method as claimed in claim 1, wherein the Pull Request Analyzer Module extracts summary, changed files, functions, lines of code, and inline comments and identifies intent and scope of code changes using LLMs and token classifiers and tags the pull requests with topic names.

5. The system and method as claimed in claim 1, wherein the Gap Detection Engine module compares the embeddings of requirements with the pull request summaries using cosine similarity, applies semantic search and deterministic techniques and uses methods of text classification such as TF-IDF (Term Frequency-Inverse Document Frequency), BERT (Bidirectional Encoder Representations from Transformers) Score, and Rule-Based Engines to evaluate completeness and flags missing implementation areas as gaps.

6. The system and method as claimed in claim 1, wherein the Validation Module searches codebase using abstract syntax tree traversal, function signature detection, and LLM-based code summarizers to confirm if the flagged requirements are already met at some other location in the codebase and then generates confidence scores based on evidence and historical training data.

7. The system and method as claimed in claim 1, wherein the Gap Reporting Interface module displays a detailed report with requirement items, pull request mapping, and detected gaps, provides visual indicators such as confidence score and severity level, and captures feedback to improve gap detection models.

8. The system and method as claimed in claim 1, wherein the Document and Code Ingestion step is carried out by the connector module, which establishes API connections to all integrated systems, fetches requirements, design documents, release notes, and pull requests, and also preprocesses the documents; and the Requirement Extraction step uses language models to perform text summarization, named entity recognition, and coreference resolution, extracts actionable items and maps them to metadata such as owner, version, and date, and the output is stored as a structured knowledge graph or semantic vector store.

9. The system and method as claimed in claim 1, wherein in the Pull Request Understanding step, the Pull Request Analyzer Module parses pull request differences and comments using AST tools and LLMs, creates a feature map based on the methods/functions altered or introduced, and generates the pull request intent embedding and metadata; and in the Gap Detection step, the Gap Detection Engine module matches the pull request intent vector with the requirement embeddings, then applies similarity scoring algorithms for sentence-level embedding match, for context-aware semantic matching and for keyword overlap, and then flags mismatches where the requirement similarity falls below a threshold and labels each as a gap.

10. The system and method as claimed in claim 1, wherein the Codebase Validation step performs a symbolic search to confirm the presence or absence of features flagged as missing and uses analysis tools to catch compliance gaps, and the Review and Feedback Loop step presents all the gaps through a user interface with options to approve or reject each one and also updates the training corpus for the LLMs and the classifier thresholds.

Documents

Application Documents

#	Name	Date
1	202521036191-STATEMENT OF UNDERTAKING (FORM 3) [11-04-2025(online)].pdf	2025-04-11
2	202521036191-POWER OF AUTHORITY [11-04-2025(online)].pdf	2025-04-11
3	202521036191-FORM 1 [11-04-2025(online)].pdf	2025-04-11
4	202521036191-FIGURE OF ABSTRACT [11-04-2025(online)].pdf	2025-04-11
5	202521036191-DRAWINGS [11-04-2025(online)].pdf	2025-04-11
6	202521036191-DECLARATION OF INVENTORSHIP (FORM 5) [11-04-2025(online)].pdf	2025-04-11
7	202521036191-COMPLETE SPECIFICATION [11-04-2025(online)].pdf	2025-04-11
8	202521036191-FORM-9 [26-09-2025(online)].pdf	2025-09-26
9	202521036191-FORM 18 [01-10-2025(online)].pdf	2025-10-01
10	Abstract.jpg	2025-10-07