Abstract: The present invention provides a system and method for deduplication and ranking of Modular Computational Platform (MCP) services using behavioral profiling and semantic mapping. The system ingests MCP services along with their specifications, documentation, and usage metadata, and applies simulated behavioral testing using synthetic prompts and edge-case inputs to generate dynamic behavioral profiles. These profiles are combined with statistical indicators such as latency, failure rates, usage frequency, and cost to create fused semantic and statistical embeddings. A similarity graph and a dependency graph are constructed to identify functionally overlapping services based on behavioral and orchestration-level relationships. Graph-based clustering is then applied to deduplicate services and retain the most robust and contextually relevant representative from each cluster. A ranking module assigns scores based on LLM compatibility, composability, uniqueness, and real-world performance. An output interface delivers a filtered and ranked set of services for use by large language models and agentic systems. A feedback loop continuously updates profiles and rankings based on runtime performance, enabling adaptive, intelligent, and scalable service orchestration across dynamic environments.
Description:
FIELD OF THE INVENTION
The present invention relates to the field of artificial intelligence systems and Modular Computational Platforms (MCPs). More particularly, it pertains to a system and method for deduplication and ranking of MCP services using behavioral profiling and semantic mapping for effective use by large language models (LLMs) or agentic systems.
BACKGROUND OF THE INVENTION
Large Language Models (LLMs) and autonomous agentic systems increasingly rely on diverse toolsets and Application Programming Interfaces (APIs) hosted on Modular Computational Platforms (MCPs) to accomplish tasks ranging from query execution and data processing to decision-making and real-time orchestration. MCPs serve as repositories of modular, reusable services that can be programmatically invoked in dynamic workflows. However, the rapid proliferation of services with overlapping functionalities and varied operational behaviors poses significant challenges for intelligent systems attempting to discover, evaluate, and select the most relevant tools.
Conventional approaches to service discovery and ranking are predominantly metadata-driven, relying on name-matching, static tags, or descriptive filters. These approaches fail to capture critical aspects such as behavioral nuances, edge-case robustness, composability, prompt/response fidelity, and statistical performance indicators such as latency, failure rates, and usage patterns. Furthermore, these methods are unable to deduplicate functionally similar services that differ only superficially in their interfaces but behave identically or similarly in practice. In the context of LLM orchestration, where intelligent agents must autonomously select and chain services, such redundancy and lack of behavioral insight result in suboptimal performance, inefficiency, and poor reasoning outcomes.
Prior Art:
US10545956B2 discloses a system for automated software service selection and execution based on contextual user input and a weight-based scoring mechanism. While it provides a rudimentary framework for service matching, the invention is centered around user-driven input contexts and does not evaluate composability, real-world behavior, or LLM compatibility, nor does it include mechanisms for deduplication based on dynamic characteristics.
US10374910B2 presents a technique for intent-based API service discovery using semantic classification of user queries. While it incorporates a level of semantic understanding, the method lacks behavioral profiling, statistical analysis, or graph-based clustering of services and does not consider LLM-agent-driven usage patterns, prompt engineering factors, or feedback-driven optimization.
WO2022235821A1 focuses on AI-based orchestration systems in the domain of sales and marketing, wherein services are selected and invoked based on entity behavior and campaign engagement data. Although it involves predictive modeling and multi-source signal analysis, it is domain-specific, does not generalize to tool orchestration across MCPs, and lacks support for LLM-driven service ranking, graph-based deduplication, or behavioral simulation.
While the aforementioned prior arts address aspects of service selection and recommendation, none comprehensively solve the problem of deduplication, ranking, and contextual optimization of MCP services through a unified behavioral and semantic framework suited to the needs of modern agentic workflows and LLM-based orchestration.
To overcome these limitations, there is a need for a novel system that evaluates services not only based on static descriptions but also on dynamic behavioral signals, edge-case robustness, statistical utility, and prompt-response alignment, using fused embeddings and graph-based clustering to intelligently deduplicate and rank modular tools. Such a system must further incorporate a self-learning feedback loop to adjust rankings based on real-world usage and ensure continuous optimization.
DEFINITIONS
The expression “system” used hereinafter in this specification refers to an ecosystem comprising, but not limited to, a system with a user, input and output devices, a processing unit, a plurality of mobile devices, a mobile device-based application to identify dependencies and relationships between diverse businesses, a visualization platform, and output; and extends to computing systems such as mobiles, laptops, computers, PCs, etc.
The expression “input unit” used hereinafter in this specification refers to, but is not limited to, mobiles, laptops, computers, PCs, keyboards, mice, pen drives, or other drives.
The expression “output unit” used hereinafter in this specification refers to, but is not limited to, an onboard output device, a user interface (UI), a display kit, a local display, a screen, a dashboard, or a visualization platform enabling the user to visualize, observe or analyse any data or scores provided by the system.
The expression “processing unit” refers to, but is not limited to, a processor of at least one computing device that optimizes the system.
The expression “large language model (LLM)” used hereinafter in this specification refers to a type of machine learning model designed for natural language processing tasks such as language generation. LLMs are language models with many parameters, and are trained with self-supervised learning on a vast amount of text.
The expression “Modular Computational Platform (MCP)”, as used hereinafter in this specification, refers to an environment or repository comprising modular, reusable services, tools, or APIs that can be invoked programmatically in dynamic, intelligent workflows by artificial intelligence systems, particularly large language models (LLMs) and agentic systems.
The expression “MCP service”, as used hereinafter in this specification, refers to an individual API, tool, or endpoint hosted on a Modular Computational Platform (MCP) that provides a discrete functionality and can be programmatically invoked as part of a larger computational workflow.
The expression “behavioral profiling”, as used hereinafter in this specification, refers to the simulated evaluation of an MCP service using synthetic queries, edge-case scenarios, malformed input, and multi-turn interactions to assess behavioral robustness, composability, error handling, predictability, and alignment with LLM prompt-response expectations.
The expression “semantic and statistical embedding”, as used hereinafter in this specification, refers to the fused vector representation of an MCP service that combines language model-based semantic understanding of API specifications and behavioral summaries with quantitative indicators such as usage frequency, latency, failure rates, and composability metrics.
The expression “similarity graph”, as used hereinafter in this specification, refers to a graph-based structure in which each node represents an MCP service, and edges represent degrees of semantic or functional similarity as calculated from the fused embeddings.
The expression “dependency graph”, as used hereinafter in this specification, refers to a graph-based structure capturing real-world usage relationships between MCP services, where edges denote co-occurrence or conditional invocation patterns across observed or simulated workflows.
The expression “graph-based deduplication”, as used hereinafter in this specification, refers to the process of identifying and clustering functionally similar MCP services using both the similarity graph and the dependency graph, and selecting a representative high-utility service within each cluster.
The expression “LLM compatibility score”, as used hereinafter in this specification, refers to a metric assigned to an MCP service based on its suitability for interaction with large language models, including prompt clarity, output consistency, response alignment, and token efficiency.
The expression “composability score”, as used hereinafter in this specification, refers to a metric reflecting the ability of an MCP service to integrate effectively within multi-tool workflows, as determined by observed or simulated orchestration chains and behavioral tests.
The expression “feedback loop”, as used hereinafter in this specification, refers to the optional self-learning mechanism wherein real-world usage data and performance outcomes of selected MCP services are used to update behavioral profiles, adjust ranking weights, detect embedding drift, and continuously improve service recommendations.
OBJECTS OF THE INVENTION
The primary object of the present invention is to provide a system and method for deduplication and ranking of Modular Computational Platform (MCP) services using behavioral profiling and semantic mapping for improved service selection by large language models (LLMs) and agentic systems.
Another object is to evaluate MCP services through simulated behavior testing and generate fused semantic and statistical embeddings representing both functional behavior and performance metrics.
A further object is to perform graph-based clustering using similarity and dependency graphs to identify and eliminate redundant MCP services.
Another object is to rank MCP services based on LLM compatibility, behavioral robustness, composability, and statistical utility.
A final object is to integrate a feedback mechanism to update service profiles and rankings based on real-world usage and performance.
SUMMARY
Before the present invention is described, it is to be understood that the present invention is not limited to specific methodologies and materials described, as these may vary as per the person skilled in the art. It is also to be understood that the terminology used in the description is for the purpose of describing the particular embodiments only and is not intended to limit the scope of the present invention.
The present invention provides a system and method for deduplication and ranking of Modular Computational Platform (MCP) services using behavioral profiling and semantic mapping. At its core is a system architecture that ingests diverse MCP services along with their specifications, usage history, and documentation, and applies synthetic queries and edge-case scenarios to generate behavioral profiles capturing robustness, composability, and prompt-response fidelity. These profiles are fused with statistical indicators such as usage frequency, latency, failure rates, and dependency footprint to generate semantically and statistically rich embeddings. The system constructs a similarity graph and a dependency graph to identify functional overlap and service co-occurrence in orchestration workflows. Graph-based clustering is applied to deduplicate MCP services, selecting the most behaviorally robust and statistically performant representatives within each cluster. A scoring function ranks services based on LLM compatibility, composability, uniqueness entropy, and real-world feedback. An output interface provides ranked service lists, embeddings, and composability metadata for use by large language models and agentic systems. The system optionally includes a self-learning feedback loop to dynamically adjust rankings based on runtime performance, embedding drift, and orchestration outcomes.
According to an aspect of the invention, the process flow begins when a set of MCP services is submitted to the input unit of the system. Each service is processed by the ingestion layer, which extracts relevant data including API specifications, documentation, tags, usage metrics, and behavioral indicators. The behavioral profiling engine then executes a series of synthetic tests such as test prompt flows, malformed inputs, and multi-turn interaction scenarios to evaluate the operational characteristics of each service, including robustness, composability, and prompt-response alignment. The results are combined with statistical data such as usage frequency, failure rates, latency, and cost to generate a fused embedding for each MCP service via the embedding layer. These embeddings are input into the graph construction module, which builds a similarity graph and a dependency graph representing both semantic proximity and co-occurrence in workflows. The clustering and deduplication unit identifies groups of functionally similar services and selects the most behaviorally rich and statistically reliable representative from each group. The ranking module then scores the remaining services based on LLM compatibility, composability, uniqueness, and feedback metrics. A final ranked list is delivered through the output unit along with embeddings and metadata to guide LLM-based orchestration. Real-world usage signals and execution outcomes are logged by the optional feedback loop to refine future behavioral profiles, adjust rankings, and continuously optimize the selection process.
BRIEF DESCRIPTION OF DRAWINGS
A complete understanding of the present invention may be gained by reference to the following detailed description, which is to be taken in conjunction with the accompanying drawings. The accompanying drawings, which are incorporated into and constitute a part of the specification, illustrate one or more embodiments of the present invention and, together with the detailed description, serve to explain the principles and implementations of the invention.
FIG. 1 illustrates the system architecture for deduplication and ranking of MCP services.
FIG. 2 illustrates the process flow from service ingestion to ranking.
FIG. 3 illustrates the feedback loop for updating service profiles and rankings.
DETAILED DESCRIPTION OF THE INVENTION:
Before the present invention is described, it is to be understood that this invention is not limited to methodologies described, as these may vary as per the person skilled in the art. It is also to be understood that the terminology used in the description is for the purpose of describing the particular embodiments only and is not intended to limit the scope of the present invention. Throughout this specification, the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps. The use of the expression “at least” or “at least one” suggests the use of one or more elements or ingredients or quantities, as the use may be in the embodiment of the invention to achieve one or more of the desired objects or results. Various embodiments of the present invention are described below. It is, however, noted that the present invention is not limited to these embodiments, but rather the intention is that modifications that are apparent are also included.
The present invention provides a system and method for deduplication and ranking of Modular Computational Platform (MCP) services using behavioral profiling and semantic mapping. At its core is a system architecture that ingests diverse MCP services along with their specifications, usage history, and documentation, and applies synthetic queries and edge-case scenarios to generate behavioral profiles capturing robustness, composability, and prompt-response fidelity. These profiles are fused with statistical indicators such as usage frequency, latency, failure rates, and dependency footprint to generate semantically and statistically rich embeddings. The system constructs a similarity graph and a dependency graph to identify functional overlap and service co-occurrence in orchestration workflows. Graph-based clustering is applied to deduplicate MCP services, selecting the most behaviorally robust and statistically performant representatives within each cluster. A scoring function ranks services based on LLM compatibility, composability, uniqueness entropy, and real-world feedback. An output interface provides ranked service lists, embeddings, and composability metadata for use by large language models and agentic systems. The system optionally includes a self-learning feedback loop to dynamically adjust rankings based on runtime performance, embedding drift, and orchestration outcomes.
To overcome the limitations of metadata-driven and static service selection in artificial intelligence environments, the present invention offers a more intelligent and behavior-aware approach. It introduces a system that enables large language models (LLMs) and agentic systems to reason over and select Modular Computational Platform (MCP) services based on real operational behavior rather than superficial descriptions. The system ingests MCP services along with their documentation, specifications, and usage metadata, and simulates real-world behavior using synthetic prompts, malformed input, and composability tests. It generates fused semantic and statistical embeddings representing each service’s behavioral profile and performance indicators such as latency, failure rate, and usage frequency. These embeddings are used to construct a similarity graph and a dependency graph, enabling graph-based clustering to identify and eliminate redundant services. A scoring function ranks the most distinct, LLM-compatible, and composable services for intelligent orchestration. The system optionally incorporates a feedback loop that adjusts rankings and profiles based on real-world usage, making it self-improving over time.
According to the embodiment of the present invention, the system includes an input unit, a processing unit, and an output unit. The core functionality lies within the processing unit, which is composed of several interconnected components: a behavioral profiling engine, a semantic and statistical embedding layer, a graph construction and deduplication layer, a ranking module, an output interface for LLMs, and an optional feedback mechanism. The process begins when the input unit ingests MCP services, including their API specifications, documentation, tags, and historical usage data. The behavioral profiling engine simulates service behavior through synthetic queries and edge-case prompts to assess characteristics such as robustness, composability, predictability, and prompt-response alignment. These results, along with statistical data such as usage frequency, failure rates, latency, and cost efficiency, are used to generate fused embeddings for each service. The graph construction module builds a similarity graph and a dependency graph based on these embeddings and observed orchestration patterns. The clustering and deduplication unit applies graph-based algorithms to identify functionally similar MCP services and retain the most statistically and behaviorally valuable representatives. The ranking module scores each service based on LLM compatibility, composability, uniqueness entropy, and other factors. The output unit delivers a filtered, deduplicated, and ranked list of MCP services along with embeddings and metadata for consumption by LLMs or agentic orchestration engines. Optionally, a feedback loop captures real-world performance data to update behavioral profiles and refine future rankings over time.
According to the embodiment of the present invention, the input unit ingests each MCP service or API endpoint with components including, but not limited to, name, description, and tags; API specification (OpenAPI, GraphQL, etc.); tool behavior documentation; and historical usage data (if available). The system parses and normalizes the above into a standard internal schema. In addition to basic normalization, the system applies programmatic probing to detect undocumented behavior, undocumented HTTP responses, default parameters, or rate-limiting logic. The behavioral profiling engine is a test engine that dynamically generates a behavioral profile for each MCP service by simulating real-world usage scenarios through synthetic queries and edge-case inputs. These simulations test various aspects of service behavior, including robustness to malformed input, composability with other services, prompt-response alignment, and adaptability in multi-turn interactions. The resulting behavioral profile captures key operational traits that are relevant for LLM-driven orchestration, such as predictability, error clarity, and compositional flexibility. This engine operates in conjunction with the ingestion layer, which provides service metadata, API specifications, and historical usage data.
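By way of illustration only, a minimal sketch of such a normalized internal schema is given below, assuming Python; the record type, field names, and statistics keys (e.g., `MCPServiceRecord`, `avg_latency_ms`) are hypothetical stand-ins rather than part of the specification:

```python
from dataclasses import dataclass, field

@dataclass
class MCPServiceRecord:
    """Normalized internal schema for an ingested MCP service (field names are illustrative)."""
    name: str
    description: str = ""
    tags: list = field(default_factory=list)
    api_spec: dict = field(default_factory=dict)        # parsed OpenAPI/GraphQL specification
    behavior_docs: str = ""
    usage_stats: dict = field(default_factory=dict)     # latency, failure rate, call volume, cost
    probed_traits: dict = field(default_factory=dict)   # undocumented responses, defaults, rate limits

def normalize_service(raw: dict) -> MCPServiceRecord:
    """Map a raw registry entry onto the internal schema, tolerating missing fields."""
    return MCPServiceRecord(
        name=raw.get("name", "unknown"),
        description=raw.get("description", ""),
        tags=[t.strip().lower() for t in raw.get("tags", [])],
        api_spec=raw.get("api_spec", {}),
        behavior_docs=raw.get("behavior_docs", ""),
        usage_stats={k: float(v) for k, v in raw.get("usage_stats", {}).items()},
    )

# Example: one raw entry normalized before behavioral profiling.
record = normalize_service({
    "name": "pdf-summarizer",
    "tags": ["Summarization", "Documents"],
    "usage_stats": {"avg_latency_ms": 820, "failure_rate": 0.03, "calls_per_day": 1400},
})
```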
According to the embodiment of the present invention, the semantic and statistical embedding layer then fuses the behavioral characteristics with quantitative metrics such as usage frequency, failure rates, average latency, cost efficiency, and dependency footprint. The fused embedding uniquely represents each MCP service in a high-dimensional vector space. The vectors are fused using a hybrid projection (e.g., multi-view contrastive learning) to form a multi-modal service representation and are dynamically updated with usage and feedback data. These embeddings are subsequently used by the graph construction module to build a similarity graph and a dependency graph, which are input into the clustering and deduplication process. The system thereby ensures that only contextually robust, semantically distinct, and statistically reliable services are retained and ranked for downstream use by LLMs or agentic systems.
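A minimal sketch of this fusion step is shown below, assuming Python with NumPy and a simple normalized concatenation in place of a learned hybrid projection; the function name, statistic keys, and weighting are illustrative assumptions:

```python
import numpy as np

STAT_KEYS = ("usage_freq", "failure_rate", "avg_latency_ms", "cost", "dependency_footprint")

def fuse_embedding(semantic_vec, stats, stat_weight=0.25):
    """Concatenate a unit-normalized semantic embedding with scaled statistical features.

    A learned hybrid projection (e.g., multi-view contrastive learning) could replace this
    concatenation; the simple version keeps the fusion inspectable.
    """
    sem = np.asarray(semantic_vec, dtype=float)
    sem = sem / (np.linalg.norm(sem) + 1e-9)
    stat_vec = np.array([stats.get(k, 0.0) for k in STAT_KEYS], dtype=float)
    stat_vec = stat_vec / (np.linalg.norm(stat_vec) + 1e-9)  # keep statistics from dominating
    return np.concatenate([sem, stat_weight * stat_vec])

# Placeholder semantic vector; a real system would encode the API spec and behavioral summary.
fused = fuse_embedding(np.random.rand(384),
                       {"usage_freq": 1400, "failure_rate": 0.03, "avg_latency_ms": 820})
```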
According to the embodiment of the present invention, two graphs are built in the graph construction and deduplication layer. One is a similarity graph, in which the nodes are MCP services and edges denote semantic or API proximity computed as cosine similarity over the fused embeddings. The other is a dependency graph, built from real or simulated workflows or orchestrations, in which edges denote co-occurrence or conditional invocation patterns. The nodes in the dependency graph represent individual MCP services; each node is one modular, programmatically callable service, tool, or API hosted on the Modular Computational Platform (MCP). These MCP services may be APIs (e.g., OpenAPI or GraphQL endpoints), modular tools (e.g., a data transformer, summarizer, or translator), or agents or microservices invoked in an orchestration pipeline. While the nodes are the services, the edges in the dependency graph represent observed or simulated orchestration relationships, such as how frequently and under what conditions two MCP services are invoked together in a workflow.
Examples:
- If Service A’s output feeds into Service B in actual or simulated usage
- If Services C and D frequently co-occur in execution chains
Both graphs are used to detect clusters of tightly connected services, to understand real-world usage patterns, and to perform graph-based deduplication and ranking; a minimal construction sketch follows.
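The following sketch shows one way such graphs could be assembled, assuming Python with NetworkX, a cosine-similarity threshold for the similarity graph, and co-occurrence counts over workflow chains for the dependency graph; the function names and threshold value are illustrative:

```python
import itertools
import networkx as nx
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def build_similarity_graph(embeddings, threshold=0.8):
    """Nodes are MCP services; weighted edges connect services whose fused embeddings are close."""
    g = nx.Graph()
    g.add_nodes_from(embeddings)
    for a, b in itertools.combinations(embeddings, 2):
        sim = cosine(embeddings[a], embeddings[b])
        if sim >= threshold:
            g.add_edge(a, b, weight=sim)
    return g

def build_dependency_graph(workflows):
    """Edges count how often two services co-occur in observed or simulated orchestration chains."""
    g = nx.Graph()
    for chain in workflows:
        for a, b in itertools.combinations(sorted(set(chain)), 2):
            weight = g.get_edge_data(a, b, default={}).get("weight", 0) + 1
            g.add_edge(a, b, weight=weight)
    return g

# Example: Service A feeds Service B; Services C and D frequently co-occur.
dep = build_dependency_graph([["A", "B"], ["A", "B", "C"], ["C", "D"], ["C", "D"]])
```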
According to the embodiment of the present invention, graph-based clustering (e.g., the Leiden or Louvain algorithm) is applied to group redundant services. Nodes with structural holes or bridge roles (high betweenness centrality) are flagged as core components. Betweenness centrality measures how often a node appears on the shortest paths between pairs of nodes in the graph. A high betweenness centrality means the node acts as a connector or gateway: it is likely critical for workflow composition and may be hard to replace without disrupting many orchestration paths. This measure is used to flag core components, that is, MCP services that are crucial for interoperability, and to prevent accidental deletion or de-ranking of services that appear redundant in function but are central to workflows. The goal of graph-based clustering in the present invention is to group redundant MCP services, i.e., services that may have different names or interfaces but perform the same function, have similar behavioral profiles, and appear together in orchestration workflows. To achieve this, the system constructs a similarity graph based on semantic and statistical embedding proximity and a dependency graph based on co-invocation frequency from orchestration logs or simulations, and then applies community detection to find groups (clusters) of similar or related services.
According to the embodiment of the present invention, the Leiden and Louvain algorithms are unsupervised community detection algorithms that operate on weighted graphs. They partition the graph into clusters of densely connected nodes by optimizing a metric called modularity, which captures how tightly knit a group is compared to the rest of the network. Louvain is a fast, greedy optimization of modularity, and Leiden is an improved version of Louvain that guarantees better-connected and more stable clusters. These algorithms are well suited for large graphs, overlapping and noisy relationships, and non-parametric clustering (no fixed number of clusters is needed). In this case, they group MCP services into functional clusters, where each cluster ideally contains variants of the same underlying capability. Within each cluster, the most statistically performant and behaviorally rich MCP service is selected, and the others are either deprecated, sub-ranked, or retained as scenario-specific variants.
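A minimal sketch of this clustering and representative-selection step is given below, assuming Python with NetworkX (which provides Louvain community detection and betweenness centrality); the utility measure, centrality threshold, and function names are illustrative assumptions:

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

def deduplicate(sim_graph: nx.Graph, dep_graph: nx.Graph, utility: dict):
    """Cluster the similarity graph, protect high-betweenness bridge services, and keep
    the highest-utility representative per cluster.

    `utility` maps service name -> a scalar combining behavioral richness and statistical
    performance; how it is computed is left to the ranking module.
    """
    # Louvain community detection over the weighted similarity graph.
    clusters = louvain_communities(sim_graph, weight="weight", seed=42)

    # Services that bridge many orchestration paths are flagged as core components
    # and are never pruned, even if they look redundant inside a cluster.
    centrality = nx.betweenness_centrality(dep_graph) if dep_graph.number_of_nodes() else {}
    core = {n for n, c in centrality.items() if c >= 0.2}  # threshold is illustrative

    representatives, deprioritized = [], []
    for cluster in clusters:
        ordered = sorted(cluster, key=lambda n: utility.get(n, 0.0), reverse=True)
        representatives.append(ordered[0])
        deprioritized.extend(n for n in ordered[1:] if n not in core)
    return representatives, deprioritized
```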
According to the embodiment of the present invention, in the ranking module each MCP service is ranked via a scoring function that includes an LLM Compatibility Score (prompt ease, output clarity, token efficiency), a Composability Score (based on observed invocation chains), a Behavioral Robustness Score (based on synthetic test coverage), a Statistical Utility Score (reliability, speed, efficiency), Uniqueness Entropy (calculated across embedding and graph dimensions), and a Feedback Incorporation Score (from past LLM orchestration usage). The ranking formula is tuneable based on use case (e.g., latency-sensitive workflows versus high-throughput analytical chains). In the output interface for LLMs module, a final API or interface provides a JSON or vector list of services with rankings, embeddings for search and retrieval, composability metadata for workflow planning, and categorization by capability type and cluster ID. The system can optionally output call scaffolds, prompt suggestions, and template wrappers for each ranked MCP service for direct LLM use.
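A minimal sketch of a tuneable weighted-sum ranking and a JSON-like output record is given below, assuming Python; the weight values, sub-score names, and output fields are illustrative and would be configured per use case:

```python
DEFAULT_WEIGHTS = {  # tuneable per use case, e.g. latency-sensitive vs high-throughput workflows
    "llm_compat": 0.25, "composability": 0.20, "robustness": 0.20,
    "stat_utility": 0.15, "uniqueness": 0.10, "feedback": 0.10,
}

def rank_services(sub_scores, weights=DEFAULT_WEIGHTS):
    """Weighted-sum ranking over per-service sub-scores assumed to lie in [0, 1]."""
    total = sum(weights.values())
    scored = sorted(
        ((name, sum(weights[k] * s.get(k, 0.0) for k in weights) / total)
         for name, s in sub_scores.items()),
        key=lambda item: item[1], reverse=True,
    )
    # Shape each entry as a JSON-like record for the output interface.
    return [{"service": name, "rank": i + 1, "score": round(score, 4)}
            for i, (name, score) in enumerate(scored)]

print(rank_services({
    "pdf-summarizer": {"llm_compat": 0.9, "composability": 0.7, "robustness": 0.8,
                       "stat_utility": 0.6, "uniqueness": 0.5, "feedback": 0.7},
    "doc-digest-v2":  {"llm_compat": 0.7, "composability": 0.8, "robustness": 0.6,
                       "stat_utility": 0.9, "uniqueness": 0.3, "feedback": 0.5},
}))
```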
According to the embodiment of the present invention, there is an optional self-learning feedback loop. If a selected MCP service fails or underperforms during real use, feedback is routed back to the behavioral engine, score weights are dynamically adjusted, rankings are re-evaluated, embedding drift is tracked, and outliers and regressions are automatically flagged for pruning or retraining.
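One possible shape of this feedback update is sketched below, assuming Python with NumPy, a simple additive weight adjustment, and cosine-distance drift detection; the learning rate and drift threshold are illustrative assumptions:

```python
import numpy as np

def update_on_feedback(weights, outcome_ok, old_embedding, new_embedding,
                       learning_rate=0.05, drift_threshold=0.15):
    """Adjust the feedback weight after a real invocation and flag embedding drift.

    `weights` is the ranking-weight dict; `old_embedding` and `new_embedding` are the fused
    vectors before and after re-profiling. Learning rate and threshold are illustrative.
    """
    # Reward or penalize the feedback component of the ranking formula, clamped to [0, 1].
    delta = learning_rate if outcome_ok else -learning_rate
    updated = dict(weights)
    updated["feedback"] = min(1.0, max(0.0, updated.get("feedback", 0.1) + delta))

    # Drift is measured as cosine distance between the old and new fused embeddings.
    cos = float(np.dot(old_embedding, new_embedding) /
                (np.linalg.norm(old_embedding) * np.linalg.norm(new_embedding) + 1e-9))
    drift_flagged = (1.0 - cos) > drift_threshold  # flagged for pruning or retraining

    return updated, drift_flagged
```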
In the present invention, validation and refinement of MCP service selection are achieved through the coordinated operation of the behavioral profiling engine, the embedding layer, the graph-based clustering module, and the feedback loop. Once a service is selected for use by a large language model or agentic system, its behavior during runtime is monitored against the expected profile generated during the synthetic testing phase. The fused semantic and statistical embeddings provide a reference for expected performance, including output consistency, latency, composability, and error handling. If the invoked MCP service fails to meet these behavioral or statistical expectations, for example by producing unexpected responses, demonstrating instability in a workflow, or deviating from previously observed usage patterns, the system may trigger corrective actions. These include re-ranking the service within its cluster, adjusting the embedding to reflect observed drift, or deprioritizing it in future selection cycles. The feedback loop captures real-world interaction data and continuously updates the behavioral profile and ranking metrics. This ensures that the system remains adaptive to dynamic service conditions, maintains orchestration reliability, and consistently exposes only high-utility, LLM-compatible services for downstream use.
According to an embodiment of the invention, the process flow begins when a set of MCP services is submitted to the input unit of the system. Each service is processed by the ingestion layer, which extracts relevant data including API specifications, documentation, tags, usage metrics, and behavioral indicators. The behavioral profiling engine then executes a series of synthetic tests such as test prompt flows, malformed inputs, and multi-turn interaction scenarios to evaluate the operational characteristics of each service, including robustness, composability, and prompt-response alignment. The results are combined with statistical data such as usage frequency, failure rates, latency, and cost to generate a fused embedding for each MCP service via the embedding layer. These embeddings are input into the graph construction module, which builds a similarity graph and a dependency graph representing both semantic proximity and co-occurrence in workflows. The clustering and deduplication unit identifies groups of functionally similar services and selects the most behaviorally rich and statistically reliable representative from each group. The ranking module then scores the remaining services based on LLM compatibility, composability, uniqueness, and feedback metrics. A final ranked list is delivered through the output unit along with embeddings and metadata to guide LLM-based orchestration. Real-world usage signals and execution outcomes are logged by the optional feedback loop to refine future behavioral profiles, adjust rankings, and continuously optimize the selection process.
According to the embodiment of the present invention, the method for deduplication and ranking of Modular Computational Platform (MCP) services using behavioral profiling and semantic mapping, as illustrated in FIGs. 1–3, comprises the following steps:
● Receiving MCP services at the input unit (FIG. 1): The process begins when a set of MCP services, tools, or APIs along with their associated specifications, documentation, tags, and historical usage data is received by the input unit and passed to the processing unit.
● Generating behavioral profiles via the Behavioral Profiling Engine (FIG. 1): Each MCP service is programmatically tested using synthetic queries, edge-case scenarios, malformed inputs, and multi-turn prompts. The system captures behavioral traits such as robustness, error clarity, composability, and alignment with LLM prompt expectations, forming a dynamic behavioral profile.
● Generating fused embeddings via the Embedding Layer (FIG. 1): The system combines semantic data from behavioral profiles with statistical metrics such as usage frequency, latency, failure rate, cost efficiency, and dependency footprint to create a multi-dimensional fused embedding representing each MCP service.
● Constructing graphs via the Graph Construction Module (FIG. 1): Two types of graphs are created: a similarity graph, where edges reflect semantic or functional closeness based on embeddings, and a dependency graph, where edges represent service co-occurrence in real or simulated orchestration workflows.
● Performing graph-based clustering and deduplication (FIG. 2): The system applies clustering algorithms to group MCP services with overlapping functionalities. Within each cluster, it selects a representative service based on behavioral richness and statistical performance, while deprioritizing or flagging redundant variants.
● Ranking MCP services via the Ranking Module (FIG. 2): A scoring function ranks each remaining MCP service using weighted criteria including LLM compatibility, composability, behavioral robustness, statistical utility, uniqueness entropy, and prior usage feedback. The result is a prioritized list of services optimized for orchestration.
● Delivering results via the Output Interface (FIG. 1): The ranked and deduplicated list of MCP services along with fused embeddings and metadata is passed through the output unit. This output supports downstream use by LLMs or agentic systems, including prompt planning and workflow generation.
● Self-learning feedback loop (FIG. 3): During orchestration, if a selected service behaves unexpectedly, such as deviating from its profile, failing in execution, or causing breakdowns in tool chains, the system may trigger fallback logic, re-evaluate the ranking, or suppress the service in future cycles. A minimal end-to-end sketch of these steps follows.
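The sketch below ties these steps together, assuming Python and treating each module as an injected callable; the signatures are illustrative stand-ins for the components described above, not a prescribed interface:

```python
from typing import Callable, Iterable

def dedup_and_rank_pipeline(
    raw_services: Iterable[dict],
    normalize: Callable,      # ingestion layer: raw entry -> normalized record
    profile: Callable,        # behavioral profiling engine: record -> behavioral profile
    embed: Callable,          # embedding layer: profile -> fused embedding
    build_graphs: Callable,   # graph construction: embeddings -> (similarity graph, dependency graph)
    deduplicate: Callable,    # clustering and deduplication: graphs -> representatives to keep
    rank: Callable,           # ranking module: kept profiles -> ordered list with scores
):
    """Wire the method steps together; each stage is an injected callable.

    Assumes normalized records expose a `name` attribute, as in the earlier schema sketch.
    """
    records = [normalize(r) for r in raw_services]
    profiles = {rec.name: profile(rec) for rec in records}
    embeddings = {name: embed(p) for name, p in profiles.items()}
    sim_graph, dep_graph = build_graphs(embeddings)
    kept = deduplicate(sim_graph, dep_graph, embeddings)
    ranked = rank({name: profiles[name] for name in kept})
    return ranked  # delivered through the output interface; outcomes feed the optional feedback loop
```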
The outputs produced during MCP service orchestration are continuously checked against each service’s behavioral profile and the ranking benchmarks set by the system. These checks are guided by embedded constraints derived from the service’s semantic-statistical embedding, clustering insights, previous feedback data, and the context captured during profiling. This process ensures that the service behaves as expected in terms of consistent responses, successful integration with other tools, acceptable latency, and proper error handling. If a service performs within these expected limits, the workflow continues smoothly. However, if something unusual is detected, such as frequent failures, unexpected outputs, disruptions in the orchestration chain, or deviations from simulated behavior, the system steps in to correct it. It may lower the service’s rank, temporarily exclude it, trigger re-profiling, or swap it out with a better alternative from the same functional group. This ongoing evaluation ensures that only well-tested, reliable, and LLM-friendly services are allowed to participate in active orchestration, maintaining the overall quality and trustworthiness of the system.
According to an embodiment of the present invention, the MCP service deduplication and ranking system performs a structured sequence of operations to ensure that only statistically reliable, semantically distinct, and behaviorally compatible services are made available for large language model (LLM) orchestration. The orchestration process includes:
1. Ingestion Layer: The system begins by collecting structured inputs, including API specifications, documentation, tags, usage metadata, and performance history for each MCP service. This data is processed by the ingestion layer to initiate behavioral profiling.
2. Behavioral Profiling Engine: The system executes synthetic prompts, malformed inputs, and multi-turn interaction scenarios to simulate real-world service behavior. Traits such as robustness, prompt-response alignment, composability, and predictability are captured to form dynamic behavioral profiles.
3. Semantic and Statistical Embedding Layer: Behavioral characteristics are fused with statistical indicators including usage frequency, failure rates, average latency, cost efficiency, and dependency footprint to generate high-dimensional embeddings for each MCP service.
4. Graph Construction Module: The system constructs both a similarity graph (based on functional or semantic proximity) and a dependency graph (based on observed or simulated orchestration relationships).
5. Clustering and Deduplication Unit: Using graph-based clustering techniques, the system groups functionally similar MCP services. Within each cluster, it selects the most statistically and behaviorally optimal service, while deprioritizing redundant variants.
6. Ranking Module: A scoring function ranks the remaining services based on LLM compatibility, behavioral robustness, composability, uniqueness entropy, and prior feedback metrics, producing a prioritized service list.
7. Runtime Validation: If any selected service deviates during orchestration from its expected behavior, such as producing inconsistent outputs or exhibiting performance degradation, the system enforces corrective measures, which may include demotion in rank, re-clustering, or substitution with a more reliable service from the same cluster (a minimal sketch of this corrective logic follows this overview).
8. Feedback Loop: All orchestration outcomes, profiling adjustments, ranking changes, and behavioral deviations are logged. This feedback data is used to refine embeddings, update profiles, and improve ranking accuracy over time.
This modular, learning-enabled, and context-aware framework ensures that large language models and agentic systems consistently operate with high-quality, deduplicated, and semantically diverse MCP services, enabling intelligent, efficient, and scalable tool orchestration across diverse application domains.
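A minimal sketch of such runtime validation and corrective-action selection is given below, assuming Python; the metric names, tolerances, and action vocabulary are illustrative stand-ins rather than a fixed policy:

```python
def validate_runtime(service, observed, expected, cluster_members, tolerances=None):
    """Compare observed runtime metrics against profile expectations and choose a corrective action.

    `observed` and `expected` are dicts of metrics; `cluster_members` lists services in the same
    functional cluster that could substitute for this one.
    """
    # Illustrative tolerances: absolute slack on failure rate, multiplicative slack on latency.
    tolerances = tolerances or {"failure_rate": 0.05, "latency_ratio": 1.5}

    violations = []
    if observed.get("failure_rate", 0.0) - expected.get("failure_rate", 0.0) > tolerances["failure_rate"]:
        violations.append("failure_rate")
    if observed.get("avg_latency_ms", 0.0) > tolerances["latency_ratio"] * expected.get("avg_latency_ms", 1.0):
        violations.append("latency")

    if not violations:
        return {"service": service, "action": "keep"}

    # Escalate: demote and re-profile for a single violation; substitute from the same
    # functional cluster when multiple expectations are missed and an alternative exists.
    alternatives = [s for s in cluster_members if s != service]
    if len(violations) > 1 and alternatives:
        return {"service": service, "action": "substitute",
                "violations": violations, "alternative": alternatives[0]}
    return {"service": service, "action": "demote_and_reprofile", "violations": violations}
```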
Advantages:
The present invention offers several advantages that make MCP-based orchestration in artificial intelligence environments more intelligent, reliable, and contextually optimized. It enables the generation of behaviorally rich and statistically grounded profiles for each Modular Computational Platform (MCP) service by simulating real-world usage conditions and capturing key operational characteristics such as robustness, composability, and LLM prompt-response alignment. This allows the system to deduplicate overlapping services and prioritize those that best align with orchestration needs. The system dynamically ranks services using a multi-dimensional scoring model based on semantic relevance, behavioral fidelity, uniqueness entropy, and compatibility with large language models (LLMs). It continuously monitors performance during orchestration and adapts rankings through a feedback loop, ensuring services that degrade or drift over time are automatically deprioritized or re-profiled. The architecture supports modular integration, allowing seamless embedding of the system into enterprise-scale LLM workflows and agentic frameworks. Its ability to produce high-quality, deduplicated, and ranked service sets enhances orchestration precision, reduces toolchain redundancy, and improves reliability across dynamic and complex AI-driven environments.
Claims:
We claim,
1. A system and method for deduplication and ranking of Modular Computational Platform services
characterized in that
the system comprises an input unit, a processing unit, and an output unit, and the processing unit comprises a behavioral profiling engine, a semantic and statistical embedding layer, a graph construction and deduplication layer, a ranking module, an output interface for LLMs, and an optional feedback mechanism;
the method for deduplication and ranking of Modular Computational Platform services using behavioral profiling and semantic mapping comprises the steps of:
• receiving, by the input unit, Modular Computational Platform services, tools, or APIs along with their associated specifications, documentation, tags, and historical usage data, and passing them to the processing unit;
• generating, by the Behavioral Profiling Engine, behavioral profiles using synthetic queries, edge-case scenarios, malformed inputs, and multi-turn prompts, the profiles capturing behavioral traits such as robustness, error clarity, composability, and alignment with LLM prompt expectations;
• generating, via the Embedding Layer, multi-dimensional fused embeddings representing each Modular Computational Platform service, each embedding combining semantic data from behavioral profiles with statistical metrics such as usage frequency, latency, failure rate, cost efficiency, and dependency footprint;
• constructing a similarity graph and a dependency graph via the Graph Construction Module;
• performing graph-based clustering and deduplication to group Modular Computational Platform services with overlapping functionalities, such that within each cluster, a representative service is selected based on behavioral richness and statistical performance, while deprioritizing or flagging redundant variants;
• ranking Modular Computational Platform services using weighted criteria including LLM compatibility, composability, behavioral robustness, statistical utility, uniqueness entropy, and prior usage feedback to generate a prioritized list of services optimized for orchestration via the Ranking Module;
• delivering results of ranked and deduplicated list of Modular Computational Platform services along with fused embeddings and metadata via the Output Interface that supports downstream use by LLMs or agentic systems, including prompt planning and workflow generation;
• triggering corrective actions, such as re-ranking the service within its cluster, adjusting the embedding to reflect observed drift, or deprioritizing it in future selection cycles, if an invoked Modular Computational Platform service fails to meet its behavioral or statistical expectations.
2. The system and method as claimed in claim 1, wherein the input unit ingests each MCP service or API endpoint with components including, but not limited to, name, description, and tags; API specification; tool behavior documentation; and historical usage data, and wherein the input unit parses and normalizes the above into a standard internal schema and applies programmatic probing to detect undocumented behavior, undocumented HTTP responses, default parameters, or rate-limiting logic.
3. The system and method as claimed in claim 1, wherein the behavioral profiling engine dynamically generates a behavioral profile for each MCP service by simulating real-world usage scenarios through synthetic queries and edge-case inputs such that these simulations test various aspects of service behavior, including robustness to malformed input, composability with other services, prompt-response alignment, and adaptability in multi-turn interactions and the resulting behavioral profile captures key operational traits that are relevant for LLM-driven orchestration, such as predictability, error clarity, and compositional flexibility and this engine operates in conjunction with the ingestion layer, which provides service metadata, API specifications, and historical usage data.
4. The system and method as claimed in claim 1, wherein the semantic and statistical embedding layer fuses the behavioral characteristics with quantitative metrics such as usage frequency, failure rates, average latency, cost efficiency, and dependency footprint, such that the fused embedding uniquely represents each Modular Computational Platform service in a high-dimensional vector space, and these vectors are fused using hybrid projection to form a multi-modal service representation.
5. The system and method as claimed in claim 1, wherein in the similarity graph the nodes are Modular Computational Platform services and edges denote semantic or API proximity using cosine similarity over fused embeddings.
6. The system and method as claimed in claim 1, wherein in the dependency graph, the edges denote co-occurrence or conditional invocation patterns and the nodes represent individual MCP services where each node is one modular, programmatically callable service/tool/API hosted on the Modular Computational Platform.
7. The system and method as claimed in claim 1, wherein in the ranking module each Modular Computational Platform service is ranked via a scoring function that includes LLM Compatibility Score, Composability Score, Behavioral Robustness Score, Statistical Utility Score, Uniqueness Entropy and Feedback Incorporation Score and the ranking formula is tuneable based on use-case.
8. The system and method as claimed in claim 1, wherein in the output interface for LLMs module, a final API or interface provides a JSON or vector list of services with rankings, embeddings for search and retrieval, composability metadata for workflow planning, and categorization by capability type and cluster ID, and the system can optionally output call scaffolds, prompt suggestions, and template wrappers for each ranked Modular Computational Platform service for direct LLM use.
9. The system and method as claimed in claim 1, wherein the graph-based clustering such as Leiden or Louvain algorithms is applied to group redundant services and the nodes with structural holes or bridge roles having high betweenness centrality are flagged as core components.