
System And Method For Evaluating And Routing Language Models For Product And Application Development

Abstract: A system and method for evaluating and routing language models for product and application development. The system comprises a Request Analyzer module, a Model Repository module, a Dynamic Scoring Engine module, a Decision Router module, and an Evaluation Feedback module. The method begins with the user submitting an inferencing request; the Request Analyzer module processes the input to extract task attributes; the Scoring Engine module queries the Model Repository module for predefined parameters; Model Effectiveness Scores are computed in the Dynamic Scoring Engine module and passed to the Decision Router module; the task is then routed to the most suitable model, and the selected model returns the output to the user. The system analyzes incoming inferencing requests to extract intent, domain, complexity, and expected output; evaluates models using the Model Effectiveness Score (MES); selects and routes tasks to the most effective model based on that score; and incorporates engineering-specific metrics.


Patent Information

Application #
Filing Date
31 December 2024
Publication Number
40/2025
Publication Type
INA
Invention Field
ELECTRONICS
Status
Parent Application

Applicants

Persistent Systems
Bhageerath, 402, Senapati Bapat Rd, Shivaji Cooperative Housing Society, Gokhale Nagar, Pune - 411016, Maharashtra, India.

Inventors

1. Mr. Nitish Shrivastava
10764 Farallone Dr, Cupertino, CA 95014-4453, United States
2. Mr. Shantanu Godbole
403, Manik Signia, S.B.Road, Pune 411009, Maharashtra, India
3. Mr. Thanu S
123 Dunforest Terrace, Nepean, ON, K2J3V1

Specification

Description:FIELD OF THE INVENTION
The present invention relates to the field of machine learning and natural language processing. More particularly, the invention pertains to a system and method for the evaluation and dynamic routing of large language models (LLMs) for tasks within the lifecycle of product and application development.
BACKGROUND OF THE INVENTION
With the rise of advanced machine learning models, Large Language Models (LLMs) have become a cornerstone of natural language processing, enabling automation and optimization across a variety of applications and domains, including content generation, conversational systems, and workflow optimization.
Organizations increasingly demand systems that enable continuous adaptation of LLMs using domain-specific data, while ensuring security and privacy. However, selecting the most effective model for a given task in the context of software development and application lifecycle management remains a challenge. Existing systems do not dynamically assess the effectiveness of models based on task-specific attributes and real-world metrics, leading to suboptimal performance.
For instance, US8136068B2 describes a method, a system, and a computer program product for implementing a compact manufacturing model during various stages of electronic circuit design. In some embodiments, the method or system receives or identifies physics-based data for the corresponding manufacturing process by using the golden manufacturing process model, and uses that data to fine-tune, modify, or adjust the golden manufacturing process model. In some embodiments, the method or system invokes the just-right module, implements the compact manufacturing model and the correct-by-design module, and provides guidelines for the various stages of electronic circuit design.
US11954112B2 discloses systems, methods, and devices for a cyber-physical (IoT) software application development platform based upon a model-driven architecture, along with derivative IoT SaaS applications. The system may include concentrators to receive and forward time-series data from sensors or smart devices; message decoders to receive messages comprising the time-series data and store them on message queues; a persistence component to store the time-series data in a key-value store and relational data in a relational database; a data services component to implement a type layer over the data stores; and a processing component, comprising a batch processing component and an iterative processing component, to access and process data in the data stores via the type layer.
Hence there is a need for a novel system and method with unique algorithm and framework that dynamically evaluates, scores, and routes tasks to the most suitable LLM, ensuring alignment with software development requirements and project goals.
OBJECTS OF THE INVENTION
The primary objective of the invention is to provide a system and method for evaluation and dynamic routing of large language models (LLMs) for tasks within the lifecycle of product and application development.
Another objective of the invention is to provide a system and method that dynamically evaluates, scores, and routes tasks to the most suitable LLM, ensuring alignment with software development requirements and project goals.
A further objective of the invention is to provide a system and method that integrates traditional natural language processing metrics with specific criteria to compute a unique Model Effectiveness Score (MES), ensuring optimal alignment with product and application development requirements.
SUMMARY OF THE INVENTION
Before the present invention is described, it is to be understood that the present invention is not limited to specific methodologies and materials described, as these may vary as per the person skilled in the art. It is also to be understood that the terminology used in the description is for the purpose of describing the particular embodiments only and is not intended to limit the scope of the present invention.
The present invention pertains to a system and method for the evaluation and dynamic routing of large language models (LLMs) for tasks within the lifecycle of product and application development. According to an aspect of the present invention, the system and method analyze incoming inferencing requests to extract intent, domain, complexity, and expected output; evaluate models using a dynamic scoring algorithm (the “Model Effectiveness Score” or MES) tailored for engineering and product development; select and route tasks to the most effective model based on the Model Effectiveness Score; and incorporate metrics specific to engineering, such as maintainability, testing performance, scalability, and human effort reduction.

BRIEF DESCRIPTION OF DRAWINGS
A complete understanding of the present invention may be gained by reference to the following detailed description, which is to be taken in conjunction with the accompanying drawing. The accompanying drawing, which is incorporated into and constitutes a part of the specification, illustrates one or more embodiments of the present invention and, together with the detailed description, serves to explain the principles and implementations of the invention.
Fig. 1 illustrates the sequence diagram of the components of the present system and method.
Fig. 2 illustrates the flow chart of the working of the present invention.

DETAILED DESCRIPTION OF THE INVENTION
Before the present invention is described, it is to be understood that this invention is not limited to methodologies described, as these may vary as per the person skilled in the art. It is also to be understood that the terminology used in the description is for the purpose of describing the particular embodiments only and is not intended to limit the scope of the present invention. Throughout this specification, the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps. The use of the expression “at least” or “at least one” suggests the use of one or more elements or ingredients or quantities, as the use may be in the embodiment of the invention to achieve one or more of the desired objects or results. Various embodiments of the present invention are described below. It is, however, noted that the present invention is not limited to these embodiments, but rather the intention is that modifications that are apparent are also included.
The present invention pertains to a system and method for the evaluation and dynamic routing of large language models (LLMs) for tasks within the lifecycle of product and application development. According to an aspect of the present invention, the system and method analyze incoming inferencing requests to extract intent, domain, complexity, and expected output; evaluate models using a dynamic scoring algorithm (the “Model Effectiveness Score” or MES) tailored for engineering and product development; select and route tasks to the most effective model based on the Model Effectiveness Score; and incorporate metrics specific to engineering, such as maintainability, testing performance, scalability, and human effort reduction.

The system is composed of several key components that work together to dynamically evaluate, score, and route tasks to the most suitable LLM, ensuring alignment with software development requirements and project goals. The system comprises a Request Analyzer module, a Model Repository module, a Dynamic Scoring Engine module, a Decision Router module and an Evaluation Feedback Loop module.

According to the embodiment of the invention, the different components of the present invention function in the following manner:
• Request Analyzer module: Analyzes incoming requests to extract attributes like intent, domain, and complexity.
• Model Repository module: Stores pre-parameterized information about available models.
• Dynamic Scoring Engine module: Computes the Model Effectiveness Score (MES) for each model based on task attributes and predefined parameters.
• Decision Router module: Routes the task to the model with the highest Model Effectiveness Score (MES).
• Evaluation Feedback Loop module: Continuously refines model parameters based on real-world performance.
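In outline, the five modules above could be captured as interfaces along the following lines. This is a hypothetical sketch for illustration; the class names, method names, and signatures are assumptions, not the system's actual implementation:

```python
# Hypothetical interfaces for the five modules; all names and signatures
# are illustrative assumptions, not the patented implementation.
from typing import Protocol


class RequestAnalyzer(Protocol):
    def analyze(self, request: str) -> dict: ...
    # returns task attributes such as intent, domain, and complexity


class ModelRepository(Protocol):
    def get_parameters(self, model_name: str) -> dict: ...
    # returns the pre-parameterized information stored for a model


class DynamicScoringEngine(Protocol):
    def score(self, attributes: dict, parameters: dict) -> float: ...
    # computes the Model Effectiveness Score (MES) for one model


class DecisionRouter(Protocol):
    def route(self, attributes: dict, scores: dict) -> str: ...
    # returns the name of the model with the highest MES


class EvaluationFeedbackLoop(Protocol):
    def update(self, model_name: str, feedback: dict) -> None: ...
    # refines stored model parameters from real-world performance
```

Concrete implementations would then be composed in the order shown in the sequence diagram: analyzer output feeds the scoring engine, which consults the repository before the router makes its selection.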

According to the embodiment of the invention, Figure 1 illustrates the sequence diagram of the components of the present system and method. This figure illustrates the interaction between the different components of the system. The process begins with the user submitting an inferencing request. The Request Analyzer module processes the input to extract task attributes. The Scoring Engine module queries the Model Repository module for predefined parameters. Model scores are computed in the dynamic scoring engine module and passed to the Decision Router module. The task is then routed to the most suitable model. The selected model then returns the output to the user.

According to the embodiment of the invention, Figure 2 illustrates the process flow of the working of the present invention. The steps are:
• Start: Receive the inferencing request.
• Analyze Request: Extract attributes like intent, domain, and complexity.
• Compute Scores: Use the Model Effectiveness Score (MES) formula to calculate scores for all available models.
• Select Model: Identify the model with the highest score.
• Route Task: Forward the request to the selected model.
• Generate Output: The model processes the request and generates the result.
• Evaluate Feedback: Assess the output quality and refine future parameters.
• Update Parameters: Use feedback to enhance model scoring.
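The final two steps of this flow, Evaluate Feedback and Update Parameters, can be sketched as follows. This is a minimal illustration; the token-overlap quality metric and the moving-average update rule are assumptions, not the invention's prescribed mechanism:

```python
# Sketch of the Evaluate Feedback / Update Parameters steps; the quality
# metric and the moving-average update rule are illustrative assumptions.

def evaluate_feedback(expected: str, produced: str) -> float:
    """Toy output-quality score: fraction of expected tokens present."""
    exp, got = set(expected.split()), set(produced.split())
    return len(exp & got) / max(len(exp), 1)

def update_parameters(parameters: dict, quality: float, alpha: float = 0.1) -> None:
    """Blend the observed quality into the model's stored effectiveness score."""
    parameters["base"] = (1 - alpha) * parameters["base"] + alpha * quality

params = {"base": 0.5}
quality = evaluate_feedback("def add(a, b): return a + b",
                            "def add(a, b): return a + b")
update_parameters(params, quality)   # identical output lifts the stored score
```

Over repeated requests, this kind of update gradually aligns the stored parameters with the model's observed real-world performance, which is the role the Evaluation Feedback Loop module plays in the system.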

According to an embodiment of the present invention, the Model Effectiveness Score (MES) is defined and calculated using the Contextual Intelligence Score, Development Impact Score, Code Sustainability Score, Productivity and Human Effort Multiplier, and Scalability and Usability Score, with weights assigned to each component based on its importance in the project.

The Contextual Intelligence Score (CI) measures the model’s ability to understand the request’s intent and context, ensuring accurate outputs. This is defined using the parameters:
• Intent Support – The model’s capability to handle the specific task (e.g., code generation, bug fixing, QA, or documentation).
• Domain Relevance – How well the model understands the domain (e.g., frontend, backend, AI, mobile development).
• Context Adaptability – The model’s ability to adapt to the complexity and scope of the request (e.g., single-line fixes vs. multi-module application generation).
• Weight Suggestions: Emphasize Intent Support (Is) for specific task alignment.

The Development Impact Score (DI) quantifies the value the model brings to the software development lifecycle (SDLC), including design, coding, and testing. This is defined using the parameters:
• Code Quality – How clean, readable, and maintainable the generated code is (measured using tools like linters or cyclomatic complexity).
• Testing Performance – The pass rate of the model-generated code against unit tests, integration tests, and edge cases.
• Requirement Deviation – The difference between expected functionality and the output generated by the model (lower deviation is better).
• Weight Suggestions: Prioritize code quality for long-term project success.

The Code Sustainability Score (CS) evaluates how future-proof the generated outputs are for long-term maintenance and scaling. This is defined using the parameters:
• Maintainability Index – How easy it is to modify and update the code (measured via metrics like Halstead complexity).
• Documentation Relevance – The quality and completeness of documentation accompanying the output (e.g., inline comments, API documentation).
• Architecture Optimization – Alignment with best practices for the application architecture (e.g., microservices, modular design).
• Weight Suggestions: Emphasize Maintainability Index and Architecture Optimization for enterprise-grade applications.

The Productivity and Human Effort Multiplier (PH) accounts for the model's ability to reduce developer effort and increase productivity.
• Time to Resolution – The time saved compared to manual implementation or debugging.
• Effort Saved – The reduction in developer effort (e.g., fewer iterations, minimal debugging required).
• Weight Suggestions: Balance Time to Resolution and Effort Saved for agile development environments.

The Scalability and Usability Score (SU) evaluates the generated output's scalability and usability in real-world applications.
• Performance Efficiency – The runtime performance of the generated code or application (e.g., latency, memory usage).
• Usability Index – How well the outputs integrate into existing workflows or tools (e.g., CI/CD pipelines, IDEs).
• Scalability Effectiveness – The ability to scale outputs for large datasets, users, or traffic.
• Weight Suggestions: Prioritize Performance Efficiency and Scalability Effectiveness for high-traffic applications.

According to the embodiment of the present invention, designing the system around the formula involves the following:

1. Design-Time Model Parameterization: Each model should be assigned fixed effectiveness scores for:
• Intent Support.
• Domain Relevance.
• Complexity Handling.
• Documentation Relevance.

2. Real-Time Request Analysis: Analyze incoming requests to extract:
• Intent: Identify the task type (e.g., “Build a REST API in Python” → Code Generation).
• Domain: Map the request to a domain (e.g., AI, backend, mobile).
• Complexity: Estimate complexity based on input length, task depth, or dependencies.
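The attribute extraction described in step 2 could be approximated as below. This is a toy sketch; the keyword tables and the naive substring matching are assumptions standing in for whatever classifier the Request Analyzer module would actually use:

```python
# Illustrative request analysis; keyword tables and naive substring
# matching are assumptions, not the Request Analyzer's actual method.

INTENT_KEYWORDS = {
    "code_generation": ("build", "implement", "create"),
    "bug_fixing": ("fix", "debug", "error"),
    "documentation": ("document", "explain"),
}

DOMAIN_KEYWORDS = {
    "backend": ("api", "rest", "database"),
    "frontend": ("ui", "react", "css"),
    "ai": ("training", "llm", "neural"),
}

def analyze_request(request: str) -> dict:
    text = request.lower()
    # First keyword hit wins; default buckets cover unmatched requests.
    intent = next((i for i, kws in INTENT_KEYWORDS.items()
                   if any(k in text for k in kws)), "qa")
    domain = next((d for d, kws in DOMAIN_KEYWORDS.items()
                   if any(k in text for k in kws)), "general")
    # Complexity estimated from input length, per the description above.
    complexity = "high" if len(text.split()) > 25 else "low"
    return {"intent": intent, "domain": domain, "complexity": complexity}
```

For the example in the text, "Build a REST API in Python" maps to the code-generation intent and the backend domain, matching the mapping described in step 2.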

3. Dynamic Model Scoring: Calculate the Model Effectiveness Score (MES) for each model based on its parameters and the request's extracted attributes. Then normalize scores to account for variations across task types.

4. Decision Routing: Select the model with the highest Model Effectiveness Score (MES) for the specific task. Then route the request to the selected model.
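Steps 3 and 4 can be sketched together as below. This is a toy illustration; the bonus-based scoring rule, the min-max normalization choice, and the model records are all assumptions:

```python
# Toy sketch of Dynamic Model Scoring and Decision Routing; the scoring
# rule, normalization choice, and model records are assumptions.

def compute_mes(parameters: dict, attributes: dict) -> float:
    """Raw score: base effectiveness plus bonuses for attribute matches."""
    score = parameters["base"]
    score += 0.2 if attributes["intent"] in parameters["intents"] else 0.0
    score += 0.1 if attributes["domain"] in parameters["domains"] else 0.0
    return score

def normalize(scores: dict) -> dict:
    """Min-max normalization so scores are comparable across task types."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:  # all models scored identically
        return {name: 1.0 for name in scores}
    return {name: (s - lo) / (hi - lo) for name, s in scores.items()}

def route(attributes: dict, repository: dict) -> str:
    raw = {name: compute_mes(p, attributes) for name, p in repository.items()}
    norm = normalize(raw)
    return max(norm, key=norm.get)   # model with the highest MES

repository = {
    "general-model": {"base": 0.6, "intents": {"qa"}, "domains": {"general"}},
    "code-model": {"base": 0.5, "intents": {"code_generation"}, "domains": {"backend"}},
}
attrs = {"intent": "code_generation", "domain": "backend", "complexity": "low"}
selected = route(attrs, repository)
```

Here the code-specialized model wins despite a lower base score, because its intent and domain bonuses outweigh the generalist's head start, which is the behavior the Decision Router module is intended to produce.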

Example:
Model effectiveness for product and application development is quantified by the Model Effectiveness Score (MES), which is defined as:
MES = w1 ⋅ CI + w2 ⋅ DI + w3 ⋅ CS + w4 ⋅ PH + w5 ⋅ SU
Where:
CI = Contextual Intelligence Score
DI = Development Impact Score
CS = Code Sustainability Score
PH = Productivity and Human Effort Multiplier
SU = Scalability and Usability Score
w1, w2, w3, w4, w5 = Weights assigned to each component based on its importance in the project.

1. Contextual Intelligence Score (CI)
CI = w1a ⋅ Is + w1b ⋅ Ds + w1c ⋅ Ca
Is: Intent Support, Ds: Domain Relevance, Ca: Context Adaptability

2. Development Impact Score (DI)
DI = w2a ⋅ Qc + w2b ⋅ Tp + w2c ⋅ Rd
Qc: Code Quality, Tp: Testing Performance, Rd: Requirement Deviation

3. Code Sustainability Score (CS)
CS = w3a ⋅ Mi + w3b ⋅ Dr + w3c ⋅ Ao
Mi: Maintainability Index, Dr: Documentation Relevance, Ao: Architecture Optimization

4. Productivity and Human Effort Multiplier (PH)
PH = w4a ⋅ Tr + w4b ⋅ Es
Tr: Time to Resolution, Es: Effort Saved

5. Scalability and Usability Score (SU)
SU = w5a ⋅ Pf + w5b ⋅ Ui + w5c ⋅ Se
Pf: Performance Efficiency, Ui: Usability Index, Se: Scalability Effectiveness
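The formulas above can be evaluated directly as weighted sums. The sketch below works one example through; every weight and every raw sub-score is an illustrative placeholder, not a value prescribed by the invention:

```python
# Worked evaluation of the MES formulas; all weights and raw sub-scores
# below are illustrative placeholders, not prescribed values.

def weighted_sum(weights, values):
    return sum(w * v for w, v in zip(weights, values))

# Component scores (each parameter rated on a 0-1 scale here).
CI = weighted_sum((0.5, 0.3, 0.2), (0.9, 0.8, 0.7))   # Is, Ds, Ca
# Rd is treated here as an inverted score (higher = lower deviation),
# since the description notes lower deviation is better.
DI = weighted_sum((0.5, 0.3, 0.2), (0.8, 0.9, 0.6))   # Qc, Tp, Rd
CS = weighted_sum((0.4, 0.3, 0.3), (0.7, 0.8, 0.9))   # Mi, Dr, Ao
PH = weighted_sum((0.5, 0.5), (0.8, 0.7))             # Tr, Es
SU = weighted_sum((0.4, 0.3, 0.3), (0.9, 0.7, 0.8))   # Pf, Ui, Se

# Top-level MES = w1*CI + w2*DI + w3*CS + w4*PH + w5*SU
MES = weighted_sum((0.25, 0.25, 0.2, 0.15, 0.15), (CI, DI, CS, PH, SU))
```

With both the component weights and the top-level weights summing to 1 and every parameter on a 0-1 scale, the resulting MES is itself bounded between 0 and 1, which makes scores directly comparable across candidate models.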

While considerable emphasis has been placed herein on the specific elements of the preferred embodiment, it will be appreciated that many alterations and modifications can be made to the preferred embodiment without departing from the principles of the invention. These and other changes in the preferred embodiments of the invention will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter is to be interpreted merely as illustrative of the invention and not as a limitation.
Claims: We claim,
1. A system and method for evaluating and routing language models for product and application development
characterized in that
the system comprises a Request Analyzer module, a Model Repository module, a Dynamic Scoring Engine module, a Decision Router module and an Evaluation Feedback module,
and the method begins with the user submitting an inferencing request, the Request Analyzer module processes the input to extract task attributes, the Scoring Engine module queries the Model Repository module for predefined parameters, Model Effectiveness Scores are computed in the dynamic scoring engine module and passed to the Decision Router module, the task is then routed to the most suitable model and the selected model then returns the output to the user.
2. The system and method as claimed in claim 1, wherein the Request Analyzer module analyzes incoming requests to extract attributes like intent, domain, and complexity; the Model Repository module stores pre-parameterized information about available models; the Dynamic Scoring Engine module computes the Model Effectiveness Score for each model based on task attributes and predefined parameters; the Decision Router module routes the task to the model with the highest Model Effectiveness Score; and the Evaluation Feedback Loop module continuously refines model parameters based on real-world performance.
3. The system and method as claimed in claim 1, wherein the process flow of the working of the present invention comprises:
• Start- Receive the inferencing request.
• Analyze Request- Extract attributes like intent, domain, and complexity.
• Compute Scores- Use the Model Effectiveness Score (MES) formula to calculate scores for all available models.
• Select Model- Identify the model with the highest score.
• Route Task- Forward the request to the selected model.
• Generate Output- The model processes the request and generates the result.
• Evaluate Feedback- Assess the output quality and refine future parameters.
• Update Parameters- Use feedback to enhance model scoring.
4. The system and method as claimed in claim 1, wherein the Model Effectiveness Score is defined and calculated using the Contextual Intelligence Score, Development Impact Score, Code Sustainability Score, Productivity and Human Effort Multiplier, and Scalability and Usability Score, with weights assigned to each component based on its importance in the project.
5. The system and method as claimed in claim 1, wherein the Contextual Intelligence Score measures the model’s ability to understand the request’s intent and context, ensuring accurate outputs; the Development Impact Score quantifies the value the model brings to the software development lifecycle, including design, coding, and testing; the Code Sustainability Score evaluates how future-proof the generated outputs are for long-term maintenance and scaling; the Productivity and Human Effort Multiplier accounts for the model's ability to reduce developer effort and increase productivity; the Scalability and Usability Score evaluates the generated output's scalability and usability in real-world applications.
6. The system and method as claimed in claim 1, wherein in the Design-Time Model Parameterization, each model is assigned fixed effectiveness scores for Intent Support, Domain Relevance, Complexity Handling, and Documentation Relevance.

Documents

Application Documents

# Name Date
1 202421105156-STATEMENT OF UNDERTAKING (FORM 3) [31-12-2024(online)].pdf 2024-12-31
2 202421105156-POWER OF AUTHORITY [31-12-2024(online)].pdf 2024-12-31
3 202421105156-FORM 1 [31-12-2024(online)].pdf 2024-12-31
4 202421105156-FIGURE OF ABSTRACT [31-12-2024(online)].pdf 2024-12-31
5 202421105156-DRAWINGS [31-12-2024(online)].pdf 2024-12-31
6 202421105156-DECLARATION OF INVENTORSHIP (FORM 5) [31-12-2024(online)].pdf 2024-12-31
7 202421105156-COMPLETE SPECIFICATION [31-12-2024(online)].pdf 2024-12-31
8 Abstract1.jpg 2025-02-19
9 202421105156-POA [22-02-2025(online)].pdf 2025-02-22
10 202421105156-MARKED COPIES OF AMENDEMENTS [22-02-2025(online)].pdf 2025-02-22
11 202421105156-FORM 13 [22-02-2025(online)].pdf 2025-02-22
12 202421105156-AMMENDED DOCUMENTS [22-02-2025(online)].pdf 2025-02-22
13 202421105156-FORM-9 [25-09-2025(online)].pdf 2025-09-25
14 202421105156-FORM 18 [01-10-2025(online)].pdf 2025-10-01