Abstract: The present invention provides a system and method for conflict resolution and truth extraction from multi-source document ingestion using large language models, factual web crawling, and statistical modeling. The system ingests structured and unstructured data from heterogeneous sources. Using a dual-pass approach combining named entity recognition and LLM-based zero-shot extraction, the Entity Extractor identifies discrete factual units. Conflicting values are clustered by the Conflict Resolver Engine and resolved using a hybrid strategy integrating source trust scores, LLM-generated reasoning, real-time web verification, and plausibility estimation. The Source Scoring Model, LLM Validator, Web Crawler Verifier, and Plausibility Estimator collaboratively evaluate each candidate fact. The Truth Generator Module consolidates these signals to output the most probable factual value, a confidence score, a natural language explanation, and a traceable source list. A Feedback and Learning Loop tunes model parameters over time for improved resolution accuracy.
Description: FIELD OF THE INVENTION
The present invention relates to artificial intelligence and data reconciliation systems. More particularly, it pertains to a system and method for truth extraction and conflict resolution from multi-source documents using LLMs, factual crawling, and statistical modeling.
BACKGROUND OF THE INVENTION
In contemporary information ecosystems, data consumers across domains such as finance, governance, law, healthcare, and journalism are increasingly reliant on unstructured, semi-structured, and structured information sources, including PDF documents, HTML web pages, research articles, news feeds, government reports, and live APIs. However, the inherent diversity and asynchrony of such sources result in frequent conflicts, discrepancies, and contradictory factual claims that challenge the integrity and reliability of automated knowledge systems.
Conventional systems that aim to resolve such discrepancies typically rely on static source prioritization, deterministic extraction rules, or simplistic trust hierarchies. These methods fail to capture contextual nuance, lack real-time adaptability, and are unable to semantically reconcile multiple versions of a fact. Moreover, such systems are generally opaque, providing no rational explanation or traceability of the “resolved truth” they arrive at, thereby undermining both user confidence and regulatory defensibility.
Large Language Models (LLMs), while powerful in zero-shot contextual reasoning and summarization, are inherently generative and non-deterministic. Their outputs are prone to hallucination, and they generally lack mechanisms for grounded attribution or cross-verification. In parallel, traditional web crawlers or plausibility scoring models offer limited semantic reasoning, focusing instead on binary verification or statistical anomaly detection. Neither class of systems is independently capable of resolving complex inter-source conflicts with justification and confidence scoring. Thus, existing architectures are inadequate for real-world knowledge-intensive environments that demand high-integrity factual synthesis across conflicting document inputs, grounded validation against trusted external data, and transparent justification of outcomes.
Prior Art:
US9602505B1 discloses a system for secure and modular workflow automation based on agentic orchestration principles. Although it emphasizes distributed execution, secure access, and task delegation, the system does not provide any framework for resolving contradictory factual assertions across documents. It lacks support for semantic disambiguation, conflict detection, or the integration of real-time factual verification mechanisms.
US20240143722A1 introduces a policy-regulated AI execution model wherein LLM outputs are filtered and controlled within sandboxed environments. While it addresses compliance and output regulation, it does not incorporate document-level fact extraction, inter-source conflict resolution, or external web validation pipelines. Its emphasis remains on output safety rather than factual synthesis.
US20230237126A1 presents a secure agentic execution framework that governs error handling and resilience, execution traceability, and permission-controlled workflows. However, the system operates at the level of task execution governance and not on semantic or content-level reasoning. It does not support document-level synthesis, statistical plausibility modeling, or source-level conflict clustering.
While the aforementioned prior arts address important facets such as agentic orchestration, secure execution, and AI compliance, none of them resolve the core challenge of reconciling semantically inconsistent information from multiple sources in an explainable, traceable, and statistically robust manner. There is no existing integrated framework capable of ingesting multi-format documents, semantically evaluating conflicting facts, verifying assertions through targeted factual crawling, and synthesizing resolved outputs with confidence scores and justification metadata.
DEFINITIONS
The expression “system” used hereinafter in this specification refers to an ecosystem comprising, but not limited to, a system with a user, input and output devices, a processing unit, a plurality of mobile devices, a mobile device-based application to identify dependencies and relationships between diverse businesses, a visualization platform, and output; and is extended to computing systems such as mobiles, laptops, computers, PCs, etc.
The expression “input unit” used hereinafter in this specification refers to, but is not limited to, mobiles, laptops, computers, PCs, keyboards, mice, pen drives, or other drives.
The expression “output unit” used hereinafter in this specification refers to, but is not limited to, an onboard output device, a user interface (UI), a display kit, a local display, a screen, a dashboard, or a visualization platform enabling the user to visualize, observe or analyse any data or scores provided by the system.
The expression “processing unit” refers to, but is not limited to, a processor of at least one computing device that optimizes the system.
The expression “large language model (LLM)” used hereinafter in this specification refers to a type of machine learning model designed for natural language processing tasks such as language generation. LLMs are language models with many parameters, and are trained with self-supervised learning on a vast amount of text.
The expression “multi-source document ingestion”, as used hereinafter in this specification, refers to the process of reading and collecting data from diverse document types including but not limited to PDF files, Word documents, HTML pages, APIs, and RSS feeds, for downstream processing.
The expression “entity extractor”, as used hereinafter in this specification, refers to the component that extracts candidate factual units such as numerical values, dates, measurements, and relationships using named entity recognition (NER) and large language model (LLM)-based zero-shot extraction.
The expression “conflict resolver engine”, as used hereinafter in this specification, refers to the module that performs semantic clustering of conflicting data points and applies weighted resolution strategies to determine the most accurate factual value for each entity.
The expression “source trust score”, as used hereinafter in this specification, refers to a composite metric derived from factors such as domain type (e.g., government, news, blog), content recency, citation count, and historical accuracy, used to evaluate the reliability of a source.
The expression “LLM validator”, as used hereinafter in this specification, refers to the subcomponent that prompts large language models to compare conflicting factual versions, justify the most probable one, and output token-level confidence and explanations.
The expression “web crawler verifier”, as used hereinafter in this specification, refers to the module that performs real-time web searches using targeted keyword and entity queries to retrieve and rank matching information snippets for external factual verification.
The expression “plausibility estimator”, as used hereinafter in this specification, refers to a machine learning model trained on real-world signals such as logical consistency, co-occurring values, and valid numerical ranges, to assess the plausibility of a given fact.
The expression “truth generator module”, as used hereinafter in this specification, refers to the component that consolidates evidence from source trust scores, LLM validation, web verification, and plausibility modeling to output the final factual answer with explanation, confidence score, and traceable sources.
The expression “human-in-the-loop”, as used hereinafter in this specification, refers to the fallback workflow wherein unresolved or ambiguous high-trust conflicts are escalated to manual reviewers for adjudication and feedback.
The expression “confidence score”, as used hereinafter in this specification, refers to a probabilistic value generated by the system to indicate the level of certainty associated with the resolved factual data point, based on combined model outputs.
OBJECTS OF THE INVENTION
The primary object of the present invention is to provide a system and method for conflict resolution and truth extraction from multi-source documents using large language models (LLMs), factual web crawling, and statistical modeling.
Another object of the invention is to extract and semantically cluster conflicting factual data points from heterogeneous sources for each identified entity.
A further object of the invention is to resolve factual conflicts using a hybrid approach that combines source trust scoring, LLM-generated reasoning, real-time factual verification, and plausibility estimation.
Another object of the invention is to generate a confidence score, explanation, and source traceability for each resolved data point to ensure transparency and explainability.
Yet another object of the invention is to incorporate a human-in-the-loop workflow for ambiguous or high-trust conflicts that cannot be resolved automatically.
A final object of the invention is to continuously improve resolution accuracy through user feedback and learning loops that tune model parameters and scoring weights.
SUMMARY
Before the present invention is described, it is to be understood that the present invention is not limited to the specific methodologies and materials described, as these may vary, as will be apparent to a person skilled in the art. It is also to be understood that the terminology used in the description is for the purpose of describing the particular embodiments only and is not intended to limit the scope of the present invention.
The present invention provides a system and method for conflict resolution and truth extraction from multi-source document ingestion using large language models (LLMs), factual web crawling, and statistical modeling. The system is designed to extract, reconcile, and validate conflicting data points across heterogeneous sources such as PDFs, HTML pages, Word documents, APIs, and RSS feeds.
According to an aspect of the present invention, the system reads, processes, and reconciles data from diverse documents and sources, using a hybrid approach combining LLMs, deterministic crawlers, statistical modeling, and ML-driven plausibility estimation. The system ingests structured and unstructured data from various document types and extracts discrete factual units using LLMs and NER systems. It constructs semantic clusters of conflicting values for each entity and applies a weighted resolution strategy. The resolution engine factors in source trust scores (e.g., government > blog), recency and citation count, LLM-generated reasoning with confidence, real-time factual verification using crawled data, and plausibility models trained on real-world signals (e.g., GDP trends, biological ranges).
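By way of illustration only, the following sketch shows one way such a weighted resolution strategy could combine these signals. The weight values, field names, and example candidates are assumptions made for demonstration and are not fixed by the invention.

```python
# Illustrative sketch of a weighted resolution strategy; all weights and
# field names are assumptions, not values prescribed by the invention.
from dataclasses import dataclass

@dataclass
class CandidateFact:
    value: str
    source_trust: float      # 0..1, from the Source Scoring Model
    llm_confidence: float    # 0..1, from the LLM Validator
    web_support: float       # 0..1, from the Web Crawler Verifier
    plausibility: float      # 0..1, from the Plausibility Estimator

# Hypothetical signal weights; in practice these could be tuned by the
# Feedback and Learning Loop.
WEIGHTS = {"source_trust": 0.3, "llm_confidence": 0.3,
           "web_support": 0.2, "plausibility": 0.2}

def resolution_score(fact: CandidateFact) -> float:
    """Combine the four evidence signals into a single score in [0, 1]."""
    return (WEIGHTS["source_trust"] * fact.source_trust
            + WEIGHTS["llm_confidence"] * fact.llm_confidence
            + WEIGHTS["web_support"] * fact.web_support
            + WEIGHTS["plausibility"] * fact.plausibility)

candidates = [
    CandidateFact("GDP grew 6.3%", 0.9, 0.8, 0.7, 0.9),
    CandidateFact("GDP grew 7.1%", 0.4, 0.5, 0.3, 0.8),
]
best = max(candidates, key=resolution_score)
print(best.value, round(resolution_score(best), 3))
```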
According to an aspect of the present invention, the system outputs the most likely true data point, a confidence score, supporting rationale, and a traceable source trail. Error handling is robust: if LLM calls fail, fallback rule-based systems engage; crawler timeouts are retried with proxy rotation; and human-in-the-loop workflows are triggered for edge cases. The invention is scalable, explainable, and applicable across industries such as financial reporting, health records, legal discovery, scientific publishing, and more.
BRIEF DESCRIPTION OF DRAWINGS
A complete understanding of the present invention may be gained by reference to the following detailed description, which is to be taken in conjunction with the accompanying drawing. The accompanying drawing, which is incorporated into and constitutes a part of the specification, illustrates one or more embodiments of the present invention and, together with the detailed description, serves to explain the principles and implementations of the invention.
FIG. 1 illustrates the system architecture for conflict resolution and truth extraction from multi-source document ingestion using LLMs, factual crawling, and statistical modeling.
FIG. 2 illustrates the end-to-end process flow from document ingestion and entity extraction to conflict resolution, truth generation, and output delivery.
DETAILED DESCRIPTION OF THE INVENTION:
Before the present invention is described, it is to be understood that this invention is not limited to the methodologies described, as these may vary, as will be apparent to a person skilled in the art. It is also to be understood that the terminology used in the description is for the purpose of describing the particular embodiments only and is not intended to limit the scope of the present invention. Throughout this specification, the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps. The use of the expression “at least” or “at least one” suggests the use of one or more elements or ingredients or quantities, as the use may be in the embodiment of the invention to achieve one or more of the desired objects or results. Various embodiments of the present invention are described below. It is, however, noted that the present invention is not limited to these embodiments; rather, modifications that would be apparent to a person skilled in the art are also included.
The present invention provides a system and method for conflict resolution and truth extraction from multi-source document ingestion using large language models (LLMs), factual web crawling, and statistical modeling. The system is designed to extract, reconcile, and validate conflicting data points across heterogeneous sources such as PDFs, HTML pages, Word documents, APIs, and RSS feeds.
According to an embodiment of the invention, the system comprises an input unit, a processing unit, and an output unit. The processing unit comprises a document normalizer, an entity extractor, a conflict resolver engine, a source scoring model, an LLM validator, a web crawler verifier, a plausibility estimator, and a truth generator module. The input unit is configured to ingest heterogeneous structured and unstructured documents from a variety of sources including but not limited to APIs, web URLs, PDF files, HTML pages, and RSS feeds. The ingested content is normalized into a unified JSON schema through a Document Normalizer, which parses metadata and removes non-content elements. The input unit also incorporates deduplication mechanisms using SHA256 fingerprints to eliminate redundant content. The Entity Extractor employs a dual-pass approach using Named Entity Recognition (NER) and LLM-based zero-shot extraction to identify discrete factual units such as numerical values, dates, and named relationships. These extracted facts are passed to the Conflict Resolver Engine, which clusters them into semantic groups using cosine similarity over contextual embeddings to identify and isolate contradictory claims. Each semantic cluster is evaluated using a hybrid multi-signal resolution approach. The Source Scoring Model computes trust scores for each document based on domain authority (e.g., government > news > blogs), recency, citation frequency, and historical accuracy. The LLM Validator is prompted to compare conflicting claims, generate explanatory reasoning, and provide token-level confidence scores. In parallel, the Web Crawler Verifier conducts real-time entity-specific searches and ranks semantically relevant results from trusted external sources. The Plausibility Estimator evaluates the logical consistency of each candidate fact against domain-aligned statistical thresholds using a trained machine learning model.
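As a non-limiting illustration of the normalization and deduplication behaviour described above, the following Python sketch fingerprints content with SHA256 and emits a unified JSON record; the field names of the schema are assumptions made for this example only.

```python
# Minimal sketch of ingestion-side normalization and deduplication.
# The unified-schema field names are illustrative assumptions.
import hashlib
import json

seen_fingerprints = set()

def normalize_document(raw_text: str, source_url: str, title: str, published: str):
    """Fingerprint the content with SHA256, skip duplicates, and emit a
    unified JSON record containing the parsed metadata."""
    fingerprint = hashlib.sha256(raw_text.encode("utf-8")).hexdigest()
    if fingerprint in seen_fingerprints:
        return None  # duplicate content, discard
    seen_fingerprints.add(fingerprint)
    record = {
        "fingerprint": fingerprint,
        "source": source_url,
        "title": title,
        "published": published,
        "content": raw_text.strip(),
    }
    return json.dumps(record)

print(normalize_document("Revenue was $1.2B in 2024.",
                         "https://example.gov/report",
                         "Annual Report", "2024-03-01"))
```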
According to the embodiment of the present invention, the truth generator module consolidates outputs from all upstream components to produce the final resolved fact, including a confidence score (between 0 and 1), a natural language explanation, and a list of supporting or contradicting sources with timestamps. The Output Unit formats and delivers this structured result via REST APIs, dashboards, or exports (PDF/CSV), and optionally stores it in a SQL or graph database. The system includes robust error handling and resilience mechanisms: LLM API failures are recovered using fallback rule-based extraction; web crawler timeouts are managed with proxy rotation and retries; unresolved high-trust conflicts are escalated to a human-in-the-loop adjudication pipeline.
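The structured result delivered by the Output Unit could, for example, take a shape similar to the following sketch; the field names and values are hypothetical and serve only to illustrate the confidence score, explanation, and traceable source list described above.

```python
# Illustrative shape of the resolved output produced by the Truth Generator
# Module; the field names and values are hypothetical, not a mandated schema.
import json

resolved_fact = {
    "entity": "2024 annual revenue",
    "resolved_value": "$1.2B",
    "confidence": 0.87,  # probabilistic certainty between 0 and 1
    "explanation": ("Two high-trust government filings agree; a blog post "
                    "reporting $1.5B was rated low-trust and implausible."),
    "sources": [
        {"url": "https://example.gov/filing", "stance": "supports",
         "timestamp": "2024-03-01T09:00:00Z"},
        {"url": "https://example-blog.com/post", "stance": "contradicts",
         "timestamp": "2024-02-20T17:30:00Z"},
    ],
}

print(json.dumps(resolved_fact, indent=2))  # e.g., payload of a REST response
```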
An optional Feedback and Learning Loop continuously monitors system performance, embedding drift, and user feedback to dynamically adjust model weights, prompt strategies, and trust score thresholds. This closed loop ensures the system adapts to evolving data patterns and improves accuracy over time, enabling scalable and explainable conflict resolution for use in domains such as finance, governance, legal research, and scientific publishing.
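One minimal way such a feedback loop could adjust a source's trust weight is sketched below; the exponential-moving-average update and the learning rate are assumptions, since the invention does not prescribe a specific tuning rule.

```python
# Minimal sketch of one way the Feedback and Learning Loop could nudge a
# source's trust score after human review; the learning rate is an assumption.
def update_trust(current_trust: float, reviewer_agreed: bool, lr: float = 0.1) -> float:
    """Move the trust score toward 1.0 when the source was confirmed correct,
    toward 0.0 when it was found wrong (simple exponential moving average)."""
    target = 1.0 if reviewer_agreed else 0.0
    return (1 - lr) * current_trust + lr * target

trust = 0.70
trust = update_trust(trust, reviewer_agreed=False)
print(round(trust, 3))  # 0.63
```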
According to the embodiment of the present invention, the method for conflict resolution and truth extraction from multi-source document ingestion using large language models (LLMs), factual web crawling, and statistical modeling, as illustrated in FIGs. 1–2, comprises the following steps:
● Receiving documents and normalizing input via the Ingestion Engine (FIG. 1): The input unit receives structured and unstructured content from sources such as PDF files, Word documents, HTML pages, APIs, and RSS feeds. The ingestion engine deduplicates the content using SHA256 fingerprints and standardizes each document into a unified JSON schema. It extracts metadata including source domain, publication date, and title, and removes non-content elements such as headers, footers, and advertisements to retain relevant factual material.
● Extracting entities and candidate facts via the Entity Extractor (FIG. 1): The normalized documents are processed using a dual-pass approach combining Named Entity Recognition (NER) and LLM-based zero-shot extraction. This step identifies discrete factual units such as numerical values, dates, named entities, measurements, and relationships, producing a structured representation of facts linked to their originating documents and contextual metadata.
● Clustering conflicting values via the Conflict Resolver Engine (FIG. 1): Extracted facts are transformed into contextual embeddings and grouped using cosine similarity into semantic clusters. Each cluster represents conflicting data points associated with the same factual entity and forms the basis for downstream resolution. An illustrative sketch of this clustering step follows this list.
● Scoring source credibility via the Source Scoring Model (FIG. 1): Each document in a conflict cluster is assigned a trust score based on a weighted function that considers domain reputation (e.g., government, news, blog), recency, citation frequency, and historical accuracy. These source trust scores are retained for use during conflict resolution.
● Validating claims via the LLM Validator and Web Crawler Verifier (FIG. 2): The LLM validator is prompted with all conflicting factual variants and returns comparative reasoning, token-level confidence scores, and explanation strings. In parallel, the web crawler verifier performs targeted searches using keyword and entity queries across trusted domains. It retrieves relevant factual snippets, which are ranked by semantic similarity and domain authority.
● Estimating factual plausibility via the Plausibility Estimator (FIG. 2): Each fact is evaluated for logical and statistical consistency using a trained machine learning model that incorporates domain-specific constraints such as acceptable numerical ranges, co-occurring value patterns, and rule-based plausibility thresholds.
● Synthesizing the final output via the Truth Generator Module (FIG. 2): This module consolidates results from the source scoring model, LLM validator, web crawler verifier, and plausibility estimator to select the most probable true value. The output includes the resolved fact, a confidence score, a natural language explanation, and a list of supporting or contradicting sources with citations and timestamps.
● Delivering the resolved output via the output unit (FIG. 2): The verified factual outputs are made available through a REST API, dashboard, or structured export such as CSV or PDF. The outputs may also be stored in a SQL or graph database to support downstream querying, orchestration, and auditing.
● Monitoring runtime outcomes and incorporating feedback (FIG. 2): During or after deployment, the system monitors the runtime behavior of resolved facts. If inconsistencies or anomalies are detected, the system may trigger re-evaluation, adjust confidence scores, or escalate the fact to a human-in-the-loop resolution process. An optional feedback and learning loop captures corrections, usage signals, and embedding drift to continuously update model weights, scoring functions, and prompt strategies for improved resolution accuracy.
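As referenced in the clustering step above, the following sketch illustrates greedy clustering of extracted facts by cosine similarity over their embeddings. The toy embeddings and the 0.8 similarity threshold are assumptions for demonstration; real embeddings would be produced by a contextual encoder.

```python
# Illustrative greedy clustering of extracted facts by cosine similarity over
# contextual embeddings; embeddings and threshold are demonstration values.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def cluster(facts, threshold=0.8):
    """Assign each (text, embedding) pair to the first cluster whose
    representative is similar enough, otherwise start a new cluster."""
    clusters = []  # each cluster: {"rep": embedding, "members": [text, ...]}
    for text, emb in facts:
        for c in clusters:
            if cosine(emb, c["rep"]) >= threshold:
                c["members"].append(text)
                break
        else:
            clusters.append({"rep": emb, "members": [text]})
    return [c["members"] for c in clusters]

facts = [
    ("GDP grew 6.3% in 2023", [0.9, 0.1, 0.0]),
    ("2023 GDP growth was 7.1%", [0.88, 0.12, 0.05]),
    ("Population reached 1.4B", [0.1, 0.9, 0.2]),
]
print(cluster(facts))
```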
The system further includes runtime monitoring mechanisms that track the stability, accuracy, and relevance of previously resolved facts. When conflicts reappear or when new documents introduce contradictory data, the system dynamically re-evaluates prior resolutions by reactivating the validation pipeline. Additionally, user feedback and human-in-the-loop reviews are captured to identify edge cases and tune future model behavior. Over time, the system self-adapts by updating prompt strategies, refining scoring weights, and retraining plausibility estimators, resulting in a robust, explainable, and continuously improving framework for high-fidelity truth extraction from noisy, multi-source environments.
According to the embodiment of the present invention, the system for conflict resolution and truth extraction performs a structured sequence of operations to reconcile conflicting factual information extracted from multi-source document ingestion using large language models (LLMs), factual web crawling, and statistical modeling. The architecture comprises modular components working in coordination to ensure that resolved data points are accurate, explainable, and supported by traceable evidence from credible sources.
1. Ingestion Engine: The system begins by ingesting structured and unstructured content from a variety of sources including APIs, RSS feeds, web URLs, PDF files, HTML documents, and Word files. The ingestion engine performs deduplication using SHA256 fingerprints and standardizes all incoming content into a unified JSON schema. It extracts relevant metadata such as title, date, and domain, and removes non-content elements including footers, headers, and advertisements to retain only the core factual material.
2. Entity Extractor: Normalized documents are processed by the entity extractor using a dual-pass mechanism that combines named entity recognition (NER) with LLM-based zero-shot extraction. This module identifies discrete factual units such as numerical values, dates, named relationships, and measurable indicators. Each fact is tagged with its originating document and positional metadata to enable traceability and contextual interpretation in later stages.
3. Conflict Resolver Engine: The extracted facts are embedded into vector representations and compared using cosine similarity to identify semantically similar yet conflicting values. These are grouped into semantic clusters, each representing multiple factual variants associated with the same real-world entity or attribute. The conflict resolver engine prepares these clusters for downstream evaluation by enabling multi-source, multi-factor comparison.
4. Source Scoring Model: Each source contributing to a conflict cluster is evaluated for trustworthiness. The source scoring model assigns a weighted score to each document based on factors such as domain classification (e.g., government, journalistic, blog), content recency, citation count, and prior historical accuracy. These scores are attached to individual facts and influence prioritization during the resolution phase.
5. LLM Validator and Web Crawler Verifier: The LLM validator is prompted with all conflicting factual variants within a cluster. It performs comparative reasoning and outputs a judgment on the most plausible fact, including an explanation and token-level confidence score. In parallel, the web crawler verifier performs real-time entity-specific searches across trusted external domains and knowledge sources, returning and ranking relevant snippets based on semantic alignment and domain authority.
6. Plausibility Estimator: Each candidate fact is evaluated for logical and statistical coherence using a plausibility estimator trained on domain-specific signals. This component assesses the validity of a fact based on expected numerical ranges, co-occurring values, and logical constraints (e.g., income cannot be negative, biological rates must fall within feasible limits). It provides an independent confidence signal to support or question each claim. A minimal plausibility-check sketch is provided after this list.
7. Truth Generator: The truth generator consolidates outputs from all upstream components (source scoring, LLM reasoning, factual crawling, and plausibility checks) to select the most probable true value for each cluster. The final output includes the resolved factual value, a confidence score ranging between 0 and 1, a natural language explanation string, and a traceable list of supporting or contradicting sources with metadata and timestamps. These results are exposed through REST APIs, query interfaces, and reporting dashboards for downstream applications.
8. Error Handling and Resilience: During operation, the system continuously monitors previously resolved facts for consistency. If contradictions emerge from newly ingested sources or downstream systems report inconsistencies, the system may trigger re-evaluation of the resolution process, update confidence scores, or route the fact to a human-in-the-loop workflow for manual adjudication.
9. Feedback and Learning Loop: All inference activities (LLM prompts, source evaluations, crawler outputs, and plausibility model decisions), together with user feedback, are recorded and processed by a feedback module. This module dynamically tunes model weights, scoring thresholds, and prompt templates while detecting embedding drift and behavioral anomalies. Over time, the Feedback and Learning Loop ensures adaptive learning and continuous improvement in the system’s ability to resolve conflicting factual information accurately and efficiently.
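As referenced in the Plausibility Estimator description above, the sketch below combines hard domain constraints with a learned score; the metric names, acceptable ranges, and stand-in model score are assumptions for illustration.

```python
# Minimal sketch of a plausibility check that combines hard domain constraints
# with a learned score; the ranges and the stand-in model score are assumptions.
def plausibility(metric: str, value: float, model_score: float) -> float:
    """Return 0.0 when a hard constraint is violated, otherwise blend the
    rule-based pass with the learned model's score."""
    ranges = {            # hypothetical acceptable ranges per metric
        "annual_income_usd": (0.0, 1e12),       # income cannot be negative
        "human_body_temp_c": (30.0, 45.0),      # biologically feasible band
    }
    low, high = ranges.get(metric, (float("-inf"), float("inf")))
    if not (low <= value <= high):
        return 0.0
    return 0.5 + 0.5 * model_score   # rule passed; defer half the weight to the model

print(plausibility("human_body_temp_c", 36.8, model_score=0.9))   # 0.95
print(plausibility("annual_income_usd", -5000, model_score=0.9))  # 0.0
```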
Example:
In accordance with the present invention, robust error handling and resilience mechanisms are implemented to ensure system stability under various failure scenarios. When the Large Language Model (LLM) experiences downtime resulting in API timeouts, the system employs a retry mechanism and, if necessary, gracefully falls back to rule-based extraction methods to maintain functionality. In cases where the web crawler encounters access restrictions such as CAPTCHA challenges or bot detection mechanisms, the system utilizes proxy rotation, strategic time delays, and headless browser retries to restore access and continue data retrieval. To address situations where there is no consensus due to conflicting high-trust responses, the system incorporates a human-in-the-loop review queue to validate and resolve discrepancies. Furthermore, when API rate limits are triggered, indicated by HTTP 429 errors, the system applies exponential backoff strategies combined with the use of a local cache to optimize request flow and prevent service disruption.
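A minimal sketch of the retry-with-exponential-backoff and local-cache behaviour described in this example is given below; the function names, delay schedule, and cache structure are assumptions rather than a prescribed implementation.

```python
# Illustrative retry wrapper with exponential backoff and a local cache for
# rate-limited calls (HTTP 429); names, delays, and cache shape are assumptions.
import time

_cache = {}

class RateLimited(Exception):
    """Stand-in for an HTTP 429 response from an external API."""

def call_with_backoff(key, fetch, max_retries=5, base_delay=1.0):
    """Serve from the local cache when possible; otherwise retry the fetch
    with exponentially increasing delays until it succeeds or retries run out."""
    if key in _cache:
        return _cache[key]
    for attempt in range(max_retries):
        try:
            result = fetch()          # user-supplied callable, e.g. an HTTP GET
            _cache[key] = result
            return result
        except RateLimited:
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise RuntimeError(f"giving up on {key!r} after {max_retries} retries")

# Hypothetical usage: call_with_backoff("gdp:IN:2023", fetch_gdp_snippet)
```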
Advantages:
The present invention offers several key advantages that enhance the accuracy, trustworthiness, and contextual relevance of factual data in artificial intelligence systems. It is specifically designed to resolve conflicting information extracted from diverse sources such as PDFs, HTML documents, Word files, APIs, and web feeds by combining large language model (LLM) reasoning, real-time factual crawling, and statistical plausibility modeling. Using a structured conflict resolution pipeline that integrates semantic clustering, source trust scoring, LLM-based validation, and plausibility estimation, the system produces factual outputs that are confidence-scored, explainable, and supported by a traceable source path.
The system continuously adapts to evolving data by monitoring previously resolved facts and triggering re-evaluation when inconsistencies arise. A feedback and learning loop captures inference outcomes, user inputs, and embedding drift, allowing the system to update scoring weights, refine prompt strategies, and recalibrate plausibility models over time. This makes the invention robust, self-improving, and context-aware.
The modular architecture of the invention allows seamless integration into enterprise-scale systems, including LLM pipelines, decision-support tools, and knowledge graphs. By automatically synthesizing reliable and traceable facts from noisy, multi-source inputs, the invention reduces manual validation effort, increases transparency in reasoning workflows, and builds confidence in downstream AI applications. It is particularly effective in domains that require high-integrity factual reconciliation, such as financial analysis, legal research, regulatory reporting, scientific publishing, and public sector intelligence.
Claims: We claim,
1. A system and method for conflict resolution and truth extraction from multi-source document ingestion
characterised in that
the system comprises an input unit, a processing unit, and an output unit, and the processing unit comprises a document normalizer, an entity extractor, a conflict resolver engine, a source scoring model, an LLM validator, a web crawler verifier, a plausibility estimator, and a truth generator module;
and the method for conflict resolution and truth extraction from multi-source document ingestion comprises the steps of
• receiving structured and unstructured content in the input unit and normalizing input via the Ingestion Engine that deduplicates the content and standardizes each document into a unified JSON schema;
• extracting and identifying discrete factual units such as numerical values, dates, named entities, measurements, and relationships, producing a structured representation of facts linked to their originating documents and contextual metadata by Named Entity Recognition and LLM-based zero-shot extraction via the Entity Extractor;
• transforming extracted facts into contextual embeddings and grouping them using cosine similarity into semantic clusters via the Conflict Resolver Engine, such that each cluster represents conflicting data points associated with the same factual entity and forms the basis for downstream resolution;
• scoring source credibility to each document in a conflict cluster based on a weighted function that considers domain reputation, recency, citation frequency, and historical accuracy via the Source Scoring Model;
• validating claims via the LLM Validator, which is prompted with all conflicting factual variants and returns comparative reasoning, token-level confidence scores, and explanation strings, while the Web Crawler Verifier performs targeted searches using keyword and entity queries across trusted domains, retrieving relevant factual snippets that are ranked by semantic similarity and domain authority;
• estimating factual plausibility via the Plausibility Estimator such that each fact is evaluated for logical and statistical consistency using a trained machine learning model that incorporates domain-specific constraints such as acceptable numerical ranges, co-occurring value patterns, and rule-based plausibility thresholds;
• synthesizing the final output via the Truth Generator Module by consolidating results from the source scoring model, LLM validator, web crawler verifier, and plausibility estimator to select the most probable true value wherein the output includes the resolved fact, a confidence score, a natural language explanation, and a list of supporting or contradicting sources with citations and timestamps;
• delivering the resolved output via the output unit, where the verified factual outputs are made available and stored in a database to support downstream querying, orchestration, and auditing;
• monitoring runtime outcomes and incorporating feedback, such that if inconsistencies or anomalies are detected, the system triggers re-evaluation, adjusts confidence scores, or escalates the fact to a human-in-the-loop resolution process.
2. The system and method as claimed in claim 1, wherein the input unit is configured to ingest heterogeneous documents from a variety of structured and unstructured sources including but not limited to APIs, web URLs, PDF files, HTML pages, and RSS feeds.
3. The system and method as claimed in claim 1, wherein the truth generator module consolidates outputs from all upstream components to produce the final resolved fact, including a confidence score (between 0 and 1), a natural language explanation, and a list of supporting or contradicting sources with timestamps.
4. The system and method as claimed in claim 1, wherein the output unit formats and delivers the structured result via REST APIs, dashboards, or exports, and optionally stores it in a SQL or graph database.
5. The system and method as claimed in claim 1, wherein the Feedback and Learning Loop continuously monitors system performance, embedding drift, and user feedback to dynamically adjust model weights, prompt strategies, and trust score thresholds.
6. The system and method as claimed in claim 1, wherein the conflict resolution module uses both LLM-generated reasoning and real-time web validation for enhanced factual confidence.
7. The system and method as claimed in claim 1, wherein the system extracts, reconciles, and validates conflicting data points across heterogeneous sources.
8. The system and method as claimed in claim 1, wherein a machine learning model is trained to estimate the plausibility of numerical or textual facts based on historical ranges and logical consistency.
9. The system and method as claimed in claim 1, wherein the output includes an explanation, confidence score, and list of supporting or contradicting sources.
10. The system and method as claimed in claim 1, wherein the method includes handling failure in resolution by switching between deterministic rules, AI-based inference, and escalation to human reviewers.
| # | Name | Date |
|---|---|---|
| 1 | 202521068259-STATEMENT OF UNDERTAKING (FORM 3) [17-07-2025(online)].pdf | 2025-07-17 |
| 2 | 202521068259-POWER OF AUTHORITY [17-07-2025(online)].pdf | 2025-07-17 |
| 3 | 202521068259-FORM 1 [17-07-2025(online)].pdf | 2025-07-17 |
| 4 | 202521068259-FIGURE OF ABSTRACT [17-07-2025(online)].pdf | 2025-07-17 |
| 5 | 202521068259-DRAWINGS [17-07-2025(online)].pdf | 2025-07-17 |
| 6 | 202521068259-DECLARATION OF INVENTORSHIP (FORM 5) [17-07-2025(online)].pdf | 2025-07-17 |
| 7 | 202521068259-COMPLETE SPECIFICATION [17-07-2025(online)].pdf | 2025-07-17 |
| 8 | Abstract.jpg | 2025-08-04 |
| 9 | 202521068259-FORM-9 [26-09-2025(online)].pdf | 2025-09-26 |
| 10 | 202521068259-FORM 18 [01-10-2025(online)].pdf | 2025-10-01 |