
Systems and Method for Fine-Tuning a Primary Large Language Model (LLM) Using a Secondary LLM

Abstract: A method (300) for fine-tuning a primary LLM using a secondary LLM is disclosed. The secondary LLM determines feedback associated with an output generated by the primary LLM for a user query, the primary LLM having been pre-trained based on first training data related to a first set of domains. The secondary LLM determines feedback context information by determining a context type associated with the feedback, determines a domain associated with the user query and the feedback, and determines deviation information of the output based on an analysis of the feedback, the feedback context information, and the output with respect to predefined attributes. The secondary LLM retrieves data related to the domain from an external database or from the first training data based on a comparison of the domain with the first set of domains, and determines finetuning data by processing the retrieved data based on the deviation information and the comparison, in order to adaptively train the primary LLM periodically. (To be published with FIG. 1)


Patent Information

Filing Date
15 September 2025
Publication Number
40/2025
Publication Type
INA
Invention Field
COMPUTER SCIENCE

Applicants

HCL Technologies Limited
806, Siddharth, 96, Nehru Place, New Delhi, 110019, India

Inventors

1. Navin Sabharwal
N-3A, Jangpura Extension, New Delhi, 110014, India
2. Punith Krishnamurthy
No. 21, 5th Cross, Shakthiganapathinagar, 8th Main, Basaveshwaranagar, Bengaluru, Karnataka, 560079, India

Specification

Description:
FIELD OF INVENTION
[0001] This disclosure relates generally to the field of large language models (LLMs), and particularly to a method and system for fine-tuning an LLM.
BACKGROUND
[0002] In the realm of Natural Language Processing (NLP) and Machine Learning, training Large Language Models (LLMs) plays a vital role in understanding and generating text that resembles human language. In order to understand a user query and generate a human-like response, an LLM should have contextual knowledge corresponding to a specific domain. Further, the LLM should provide a user-centric response. Moreover, the LLM-generated response should be tailored by adhering to domain-specific norms, communication styles, user-centric language, real-world scenarios, cross-domain norms, and emotions and sentiments. However, the LLM may occasionally hallucinate and misinterpret the context of the user query, resulting in irrelevant and inaccurate responses.
[0003] Existing systems rely on manual intervention, where AI trainers manually analyze the feedback provided by users in order to fine-tune the LLM. This creates a dependency on the AI trainers as well as on the feedback provided by the users. As user feedback can vary significantly due to diverse user expectations and preferences, it may lead to inconsistent and biased results. Additionally, LLMs rely entirely on massive training datasets, which incur high computational cost due to iterative training. A massive training dataset may include a high amount of noisy, redundant, and inconsistent data, and cleaning and refining such training datasets is a significant challenge. Therefore, there is a need in the present state of the art to automate the training of LLMs.
SUMMARY
[0004] This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
[0005] In an embodiment, a method for fine-tuning a primary large language model (LLM) using a secondary LLM is disclosed. The method may include determining, by the secondary LLM, at least one user feedback associated with at least one output generated by the primary LLM for at least one user query. In an embodiment, the primary LLM may be pre-trained based on first training data related to a first set of domains. The method may further include determining, by the secondary LLM, feedback context information of the at least one user feedback by determining a context type from a plurality of context types associated with the at least one user feedback. The method may further include determining, by the secondary LLM, at least one domain associated with the at least one user query and the at least one user feedback. The method may further include determining, by the secondary LLM, deviation information of the at least one output based on an analysis of the at least one user feedback, the feedback context information, and the at least one output with respect to a set of predefined attributes. In an embodiment, the secondary LLM is trained to determine the deviation information from the set of predefined attributes. The method may further include retrieving, by the secondary LLM, data related to the at least one domain from one of an external database or the first training data based on a comparison of the at least one domain and the first set of domains. The method may further include determining, by the secondary LLM, finetuning data by processing the data based on the deviation information and the comparison. The method may further include adaptively training, by the secondary LLM, the primary LLM based on the finetuning data periodically.
[0006] In another embodiment, a system for fine-tuning a primary large language model (LLM) using a secondary LLM is disclosed. The system may include a processor enabling the secondary LLM and communicably coupled to a server enabling the primary LLM, and a memory communicably coupled to the processor, wherein the memory stores processor-executable instructions, which, on execution by the processor, may cause the processor to determine, using the secondary LLM, at least one user feedback associated with at least one output generated by the primary LLM for at least one user query. In an embodiment, the primary LLM is pre-trained based on first training data related to a first set of domains. The processor may further determine, using the secondary LLM, feedback context information of the at least one user feedback by determining a context type from a plurality of context types associated with the at least one user feedback. The processor may further determine, using the secondary LLM, at least one domain associated with the at least one user query and the at least one user feedback. The processor may further determine, using the secondary LLM, deviation information of the at least one output based on an analysis of the at least one user feedback, the feedback context information, and the at least one output with respect to a set of predefined attributes. In an embodiment, the secondary LLM may be trained to determine the deviation information from the set of predefined attributes. The processor may further retrieve, using the secondary LLM, data related to the at least one domain from one of: an external database or the first training data based on a comparison of the at least one domain and the first set of domains. The processor may further determine, using the secondary LLM, finetuning data by processing the data based on the deviation information and the comparison. The processor may further adaptively train, using the secondary LLM, the primary LLM based on the finetuning data periodically.
[0006] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
[0007] The foregoing general description of the illustrative embodiments and the following detailed description thereof are merely exemplary aspects of the teachings of this disclosure and are not restrictive.
BRIEF DESCRIPTION OF FIGURES
[0008] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles.
[0009] FIG. 1 illustrates a network system for fine-tuning a primary large language model using a secondary large language model, in accordance with some embodiments of the present disclosure.
[0010] FIG. 2 illustrates a functional block diagram implementing various modules within the memory to enable the secondary LLM to perform comprehensive fine-tuning operations on the primary LLM, in accordance with some embodiments of the present disclosure.
[0011] FIG. 3 illustrates a flow diagram of a methodology for fine-tuning a primary LLM using a secondary LLM, in accordance with an embodiment of the present disclosure.
[0012] FIG. 4 illustrates a flow diagram of a methodology to determine an inclusive state of the at least one domain, in accordance with an embodiment of the present disclosure.
DETAILED DESCRIPTION
[0013] The following description sets forth exemplary aspects of the present disclosure. It should be recognized, however, that such description is not intended as a limitation on the scope of the present disclosure. Rather, the description also encompasses combinations and modifications to those exemplary aspects described herein.
[0014] A detailed description of systems, devices, and methods consistent with embodiments of the present disclosure is provided below. While several embodiments are described, it should be understood that the disclosure is not limited to any one embodiment, but instead encompasses numerous alternatives, modifications, and equivalents. In addition, while numerous specific details are set forth in the following description in order to provide a thorough understanding of the embodiments disclosed herein, some embodiments can be practiced without some or all of these details. Moreover, for the purpose of clarity, certain technical material that is known in the related art has not been described in detail in order to avoid unnecessarily obscuring the disclosure.
[0015] Exemplary embodiments are described with reference to the accompanying drawings. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments. It is intended that the following detailed description be considered exemplary only, with the true scope being indicated by the following claims. Additional illustrative embodiments are listed.
[0016] Further, the phrases “in some embodiments”, “in accordance with some embodiments”, “in the embodiments shown”, “in other embodiments”, and the like mean a particular feature, structure, or characteristic following the phrase is included in at least one embodiment of the present disclosure and may be included in more than one embodiment. In addition, such phrases do not necessarily refer to the same embodiments or different embodiments.
[0017] Referring to FIG. 1, a network system 100 provides a comprehensive architecture for fine-tuning a primary large language model using a secondary large language model. The network system 100 includes a primary LLM server 102 that hosts and operates the primary large language model being pre-trained based on first training data related to a first set of domains. The primary LLM server 102 comprises a first processor 104 and a memory 106 that work together to execute the primary language model operations. The first processor 104 may be configured to handle computational tasks related to language model inference, response generation, and parameter updates during the fine-tuning process.
[0018] The primary LLM may be implemented using various large-scale language models depending on the specific application requirements and computational resources available. In some embodiments, the primary LLM may comprise transformer-based models such as GPT-3, GPT-4, or GPT-4 Turbo that provide general-purpose natural language understanding and generation capabilities across multiple domains. The primary LLM may alternatively include models such as BERT, RoBERTa, or T5 that offer specialized capabilities for natural language understanding, text classification, or text-to-text generation tasks. In some implementations, the primary LLM may be implemented using open-source models such as LLaMA, Alpaca, or Vicuna that provide customizable language processing capabilities, or may include proprietary models such as Claude, PaLM, or Bard that offer advanced conversational and reasoning abilities. The primary LLM may also comprise adaptable foundation models that can be fine-tuned for various domains through transfer learning techniques, including models with cross-domain adaptation capabilities, multi-task learning frameworks, or modular architectures that allow for domain-specific knowledge integration while maintaining general language understanding capabilities.
[0019] The first processor 104 may be implemented using various types of processing units depending on the computational requirements and system architecture. In some embodiments, the first processor 104 may comprise a central processing unit (CPU) such as an Intel Core i7, Intel Xeon, AMD Ryzen, or ARM Cortex processor that provides general-purpose computing capabilities for language model operations. The first processor 104 may alternatively include graphics processing units (GPUs) such as NVIDIA Tesla V100, NVIDIA A100, NVIDIA H100, AMD Radeon Instinct, or Intel Xe processors that offer parallel processing capabilities particularly suited for machine learning computations and neural network operations.
[0020] In some implementations, the first processor 104 may be implemented as specialized artificial intelligence processors including tensor processing units (TPUs) such as Google TPU v4 or TPU v5, neural processing units (NPUs), or field-programmable gate arrays (FPGAs) such as Xilinx Versal or Intel Stratix series that may be optimized for deep learning workloads. The first processor 104 may also comprise application-specific integrated circuits (ASICs) designed specifically for language model processing tasks.
[0021] In some embodiments, the first processor 104 may include distributed processing architectures such as multi-core processors, cluster computing systems, or cloud-based processing instances including Amazon EC2 instances, Google Cloud Compute instances, or Microsoft Azure virtual machines. The first processor 104 may further comprise hybrid processing configurations that combine different processor types, such as CPU-GPU combinations or heterogeneous computing platforms that leverage both traditional processors and specialized AI accelerators to optimize performance for different aspects of the language model fine-tuning process.
[0022] Further, the memory 106 may store the primary language model parameters, training data related to the first set of domains, and various modules that facilitate the fine-tuning operations. In some implementations, the memory 106 may include volatile and non-volatile storage components to support different aspects of the language model operations.
[0023] The memory 106 may be implemented using various types of storage technologies to accommodate the substantial data requirements and performance needs of large language model operations. In some embodiments, the memory 106 may comprise volatile memory components such as dynamic random access memory (DRAM) including DDR4, DDR5, or high bandwidth memory (HBM) such as HBM2 or HBM3 that provide fast access to frequently used model parameters and intermediate computation results during inference and training operations.
[0024] In some implementations, the memory 106 may include non-volatile memory components such as solid-state drives (SSDs) using NAND flash memory, NVMe SSDs, or 3D XPoint memory technologies like Intel Optane that offer persistent storage for model weights, training datasets, and checkpoint data. The memory 106 may also comprise traditional storage devices such as hard disk drives (HDDs) for long-term storage of large training datasets and historical model versions.
[0025] In some embodiments, the memory 106 may be implemented as distributed memory architectures including network-attached storage (NAS) systems, storage area networks (SANs), or cloud-based storage solutions such as Amazon S3, Google Cloud Storage, or Microsoft Azure Blob Storage that provide scalable storage capacity for handling large-scale language model data requirements. The memory 106 may further include memory pooling technologies that allow dynamic allocation of memory resources across multiple processing units.
[0026] The memory 106 may also comprise specialized memory configurations such as persistent memory modules that combine the speed of DRAM with the persistence of storage, or memory-centric computing architectures that integrate processing capabilities directly within memory modules. In some implementations, the memory 106 may include tiered storage systems that automatically move data between different storage tiers based on access patterns and performance requirements, optimizing both cost and performance for language model operations.
[0027] A secondary LLM server 110 may operate within the network system 100 to provide specialized fine-tuning capabilities for the primary language model. The secondary LLM server 110 includes a second processor 108 and memory 109 that work together to execute the secondary language model operations. The secondary LLM server 110 may host a secondary large language model that may function as a specialized domain-oriented language model trained on extensive corpus including domain-specific lexicons, terminologies, and contextual understanding of industry processes and workflows. The secondary LLM server 110 may be configured to analyze user feedback, determine deviation information, and generate fine-tuning data based on predefined attributes and domain-specific requirements. In some implementations, the secondary LLM server 110 may operate multiple instances of the secondary language model to handle different domains or user contexts simultaneously.
[0028] The secondary LLM server 110 may implement various types of specialized language models depending on the specific domain requirements and fine-tuning objectives. In some embodiments, the secondary LLM may comprise a domain-specific variant of models such as BERT, RoBERTa, or DistilBERT that have been fine-tuned on industry-specific corpora, or may include specialized models like BioBERT for healthcare domains, FinBERT for financial services, or LegalBERT for legal applications. The secondary LLM may alternatively be implemented using smaller, more efficient models such as ALBERT, DeBERTa, or T5-small that provide faster processing capabilities while maintaining sufficient analytical depth for feedback analysis and deviation detection. In some implementations, the secondary LLM may comprise custom-trained transformer models or ensemble models that combine multiple specialized language models to provide comprehensive domain coverage and contextual understanding across different industry verticals and user interaction patterns.
[0029] As further shown in FIG. 1, a communication network 112 may serve as the central interconnection medium that enables data exchange between all components of the network system 100. The communication network 112 may facilitate communication between the primary LLM server 102, the secondary LLM server 110, and other system components. The communication network 112 may comprise various networking technologies including local area networks, wide area networks, or cloud-based networking infrastructure to support the distributed nature of the fine-tuning operations. In some implementations, the communication network 112 may implement secure communication protocols to protect sensitive training data and model parameters during transmission.
[0030] The network system 100 also includes a user device 114 that connects to the communication network 112 to enable user interactions with the primary LLM. The user device 114 may submit user queries to the primary LLM enabled by the primary LLM server 102 and provide user feedback on the generated output by the primary LLM. The user device 114 serves as the interface through which users interact with the system and provides the feedback that drives the fine-tuning process. In some cases, multiple user devices 114 may be connected to the network system 100 to support concurrent user interactions and feedback collection from different users or user groups.
[0031] The user device 114 may be implemented as various types of computing devices that enable user interaction with the network system 100. In some embodiments, the user device 114 may comprise desktop computers, laptop computers, tablet devices such as iPad or Android tablets, smartphones including iPhone or Android devices, workstations, or thin client terminals that provide access to the primary LLM through web browsers or dedicated applications. The user device 114 may also include specialized computing devices such as kiosks, point-of-sale terminals, or embedded systems in industrial environments that require domain-specific language model interactions. In some implementations, the user device 114 may be implemented as wearable devices such as smartwatches or augmented reality headsets, voice-activated devices like smart speakers or virtual assistants, or Internet of Things (IoT) devices that can process and transmit user queries and feedback to the communication network 112 for processing by the primary LLM server 102 and secondary LLM server 110.
[0032] With continued reference to FIG. 1, a database 116 connects to the communication network 112 to provide data storage and retrieval capabilities for the fine-tuning process. The database 116 may store various types of data including data related to the first set of domains, external domain-specific data, user feedback records, and historical interaction data. The database 116 may function as an external database that includes industry sites, SharePoint, Google Drive, and other cloud storage or document repositories as sources for updated domain-specific data. In some implementations, the database 116 may be implemented as a distributed storage system with multiple storage nodes to handle large volumes of training data and support scalable data access patterns. The database 116 may also maintain categorized domain-specific information that the secondary LLM server 110 can access during the fine-tuning process to retrieve relevant data based on domain comparisons and deviation analysis.
[0033] The database 116 may be implemented using various database technologies to support the diverse data storage and retrieval requirements of the fine-tuning process. In some embodiments, the database 116 may comprise relational database management systems such as MySQL, PostgreSQL, Oracle Database, or Microsoft SQL Server that provide structured data storage with ACID compliance for maintaining data integrity during concurrent access operations. The database 116 may alternatively include NoSQL databases such as MongoDB, Cassandra, Amazon DynamoDB, or Redis that offer flexible schema designs and horizontal scalability for handling large volumes of unstructured or semi-structured training data.
[0034] In some implementations, the database 116 may be implemented as distributed database systems including Apache Hadoop, Apache Spark, or Elasticsearch clusters that provide parallel processing capabilities for large-scale data analytics and retrieval operations. The database 116 may also comprise graph databases such as Neo4j or Amazon Neptune that may be particularly suited for storing and querying complex relationships between domains, user feedback patterns, and contextual information.
[0035] In some embodiments, the database 116 may include hybrid database architectures that combine multiple database technologies, such as polyglot persistence systems that use different database types for different data categories based on access patterns and performance requirements. The database 116 may further comprise in-memory databases such as SAP HANA or Redis Enterprise that provide high-speed data access for real-time fine-tuning operations.
[0036] Further, the network system 100 may perform various functions to fine-tune the primary LLM using the secondary LLM. By way of an example, the secondary LLM enabled by the secondary LLM server 110 may determine at least one user feedback associated with at least one output generated by the primary LLM for at least one user query received from the user device 114 via the communication network 112. It should be noted that the user query is fed to the primary LLM in order to receive a relevant output corresponding to the user query. In an embodiment, the primary LLM may be pre-trained based on first training data related to a first set of domains. The secondary LLM may determine feedback context information of the at least one user feedback by determining a context type from a plurality of context types associated with the at least one user feedback. The context type may include corrections, clarifications, or specific contextual queries. The secondary LLM may further determine at least one domain associated with the at least one user query and the at least one user feedback.
[0037] The secondary LLM may determine deviation information of the at least one output based on analysis of the at least one user feedback, the feedback context information, and the at least one output with respect to a set of predefined attributes stored in the memory 109. In an embodiment, the set of predefined attributes may include, but are not limited to, lexicon attributes, terminology attributes, contextual attributes, human-interaction attributes, user-centric language attributes, real-world scenario simulation attributes, continuous learning attributes, cross-domain adaptability attributes and emotional, sentiment attributes, and so on. In an embodiment, the secondary LLM may be trained to determine the deviation information from the set of predefined attributes.
[0038] The secondary LLM may retrieve data related to the at least one domain from the database 116 or from first training data stored in the memory 106 of the primary LLM server 102 based on a comparison of the at least one domain and the first set of domains. The secondary LLM may determine finetuning data by processing the retrieved data based on the deviation information and the comparison results of the at least one domain and the first set of domains. The secondary LLM may then adaptively train the primary LLM hosted on the primary LLM server 102 by transmitting the finetuning data through the communication network 112 to update the model parameters stored in the memory 106 periodically. The primary LLM may be trained periodically after every predefined training time period or based on accumulation of user feedback.
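By way of a non-limiting illustration, the end-to-end flow described above may be sketched, under stated assumptions, as the following self-contained Python mock. The function names, keyword heuristics, and data structures are hypothetical stand-ins for the secondary LLM's inference and are not the disclosed implementation.

```python
# Illustrative sketch only: a self-contained mock of the feedback-driven
# fine-tuning flow of FIG. 1. The keyword heuristics below stand in for
# secondary-LLM inference and are purely hypothetical.

FIRST_SET_OF_DOMAINS = {"IT", "HR"}  # domains the primary LLM was pre-trained on

def classify_context(feedback: str) -> str:
    """Determine a context type for the user feedback (see module 204)."""
    text = feedback.lower()
    if "not" in text or "wrong" in text:
        return "correction"
    return "clarification"

def determine_domain(query: str, feedback: str) -> str:
    """Determine a domain for the query and feedback (see module 206)."""
    text = f"{query} {feedback}".lower()
    return "E-commerce" if "buy" in text or "laptop" in text else "General"

def retrieve_domain_data(domain: str) -> tuple[str, bool]:
    """Route retrieval by the domain's inclusive state (see FIG. 4)."""
    inclusive = domain in FIRST_SET_OF_DOMAINS
    source = "first training data" if inclusive else "external database 116"
    return f"<{domain} data from {source}>", inclusive

if __name__ == "__main__":
    query = "Help me buy a laptop"
    feedback = ("Too many options; it should've asked about my budget, "
                "OS and usage type first")
    context_type = classify_context(feedback)
    domain = determine_domain(query, feedback)
    data, inclusive = retrieve_domain_data(domain)
    print(context_type, domain, data,
          "inclusive" if inclusive else "not-inclusive")
```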
[0039] Referring to FIG. 2, a functional block diagram 200 implementing various modules within the memory 109 to enable the secondary LLM to perform comprehensive fine-tuning operations on the primary LLM is illustrated, in accordance with some embodiments of the present disclosure. The functional block diagram 200 may comprise multiple interconnected modules that work together to analyze user feedback, determine contextual information, and generate appropriate fine-tuning data for the primary LLM. Each module within the functional block diagram 200 may be configured to handle specific aspects of the fine-tuning process while maintaining communication pathways with other modules to ensure coordinated operation. The modular design of the functional block diagram 200 may allow for scalable processing of user feedback and domain-specific data while maintaining separation of concerns between different functional components. In some cases, the functional block diagram 200 may be implemented as software modules stored in the memory 109 and executed by the second processor 108 to perform the various fine-tuning operations described herein. The memory 109 may include a feedback determination module 202, a context information determination module 204, a domain determination module 206, a deviation information module 208, a data retrieval module 210, a finetuning data determination module 212, and a training module 214.
[0040] The feedback determination module 202 may serve as the initial processing component for analyzing user interactions and extracting feedback information from user responses to primary LLM outputs. The primary LLM may provide an output for a user query input by the user device 114. It may be noted that the primary LLM may be pre-trained based on first training data related to a first set of domains. The first set of domains may include one or more domains such as, but not limited to, IT, HR, etc. However, the primary LLM may still lack the capability to answer queries related to other domains such as, but not limited to, healthcare, retail, finance, manufacturing, telecommunications, etc. Thus, the primary LLM may process the user query and generate an output according to its training related to the first set of domains. In case the user feels that the generated output is not relevant to the user query or lacks accuracy, clarity, structure, tone, etc., the user may provide input for improvement of the output generated by the primary LLM.
[0041] Thus, the feedback determination module 202 may determine the user feedback data received from the user device 114 via the communication network 112. Further, the context information determination module 204 may determine feedback context information of the at least one user feedback. The feedback context information may be determined by determining a context type from a plurality of context types associated with the at least one user feedback. In an embodiment, the context type may be, but is not limited to, critical, corrections, clarifications, or specific contextual queries related to the user query. The feedback context information is determined in order to understand where the actual output falls short with respect to the context of the feedback. Thus, the at least one user feedback may be processed by the context information determination module 204 to identify specific issues, inaccuracies, or areas for improvement in the primary LLM responses. In some cases, the feedback determination module 202 and the context information determination module 204 may implement natural language processing algorithms to parse user feedback text and extract structured information about user satisfaction, correction requests, or contextual misunderstandings. Accordingly, each of the plurality of context types may be used to categorize feedback based on feedback types such as factual corrections, stylistic preferences, domain-specific terminology issues, or emotional tone mismatches. The context information determination module 204 may further incorporate emotion and sentiment detection capabilities to understand the emotional context of user feedback and identify whether feedback indicates frustration, satisfaction, confusion, or other emotional states that may inform the fine-tuning process.
[0042] In some cases, the context information determination module 204 may analyze the sequence of interactions between users and the primary LLM to understand the conversational flow and identify contextual dependencies that may influence the appropriateness of primary LLM's responses. The context information determination module 204 may also examine metadata associated with user queries such as timestamp information, user location data, device type information, or session history to provide additional contextual insights. The context information determination module 204 may further analyze communication styles, tone preferences, and unwritten rules governing effective human-machine interactions within specific domains to ensure that contextual analysis aligns with domain-specific communication norms.
[0043] By way of an example, for an exemplary user query “Tell me something about Apple”, the corresponding output generated by the primary LLM may be “Apple Inc. is a leading technology company known for its iPhone, MacBook, and innovative design”. The user feedback provided for this output may be “I was asking about the fruit, not the tech company”. Accordingly, the user-provided feedback clearly indicates that the output generated by the primary LLM lacked the contextual information that the query referred to the fruit, and that the primary LLM therefore generated an inaccurate result.
[0044] By way of another example, for an exemplary user query “Help me buy a laptop”, the corresponding output generated by the primary LLM may be “Here are 20 laptop options: Dell XPS 13, MacBook Air, HP Spectre x360, Lenovo ThinkPad X1 Carbon, Asus ZenBook, Acer Swift 5, Microsoft Surface Laptop 5...”. The user feedback provided for this output may be “Too much of options in the list, it should've asked about my preferences like budget, OS, usage type and then suggested a few tailored laptops”. In the given example, the user-provided feedback clearly indicates that the output generated by the primary LLM lacks contextual information about the user's preferences needed to narrow down the list of laptop options, and that the primary LLM therefore generated a vague result.
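As a non-limiting sketch of how the context information determination module 204 could determine a context type, the following Python snippet prompts a secondary LLM to classify the feedback. The prompt wording and the complete() stub are assumptions standing in for an actual call to the secondary LLM; the stub returns a canned answer so the sketch runs offline.

```python
# Hypothetical sketch: classifying feedback into one of the plurality of
# context types via a prompt to the secondary LLM.

CONTEXT_TYPES = ("critical", "correction", "clarification",
                 "specific contextual query")

PROMPT_TEMPLATE = """User query: {query}
Primary LLM output: {output}
User feedback: {feedback}
Classify the feedback as exactly one of: {types}.
Answer with the label only."""

def complete(prompt: str) -> str:
    """Stub standing in for secondary-LLM inference."""
    return "correction"

def determine_context_type(query: str, output: str, feedback: str) -> str:
    prompt = PROMPT_TEMPLATE.format(
        query=query, output=output, feedback=feedback,
        types=", ".join(CONTEXT_TYPES))
    label = complete(prompt).strip().lower()
    return label if label in CONTEXT_TYPES else "clarification"  # safe default

print(determine_context_type(
    "Tell me something about Apple",
    "Apple Inc. is a leading technology company...",
    "I was asking about the fruit, not the tech company"))  # -> correction
```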
[0045] The domain determination module 206 may determine at least one domain associated with the at least one user query and the at least one user feedback. The at least one domain is determined based on the user query and the corresponding user feedback, in order to understand the field or industry domain to which the user query relates and in which the feedback has been provided by the user. Referring to one of the above-mentioned examples, for the user query “Help me buy a laptop”, the domain determination module 206 may determine the domain associated with the exemplary user query as “E-commerce”.
[0046] Thus, the domain determination module 206 may identify and classify domains associated with user queries and user feedback through domain classification algorithms and domain-specific pattern recognition. The domain determination module 206 may analyze the content of user queries and feedback to determine at least one domain from a set of possible domains such as information technology, human resources, healthcare, finance, legal, education, or other specialized fields. In some cases, the domain determination module 206 may implement machine learning classifiers trained on domain-specific vocabularies, terminologies, and linguistic patterns to accurately identify the primary domain and any secondary domains associated with user interactions. The domain determination module 206 may also maintain domain taxonomies that define relationships between different domains and enable cross-domain adaptability by identifying overlapping or related domains that may provide relevant training data. The domain determination module 206 may further categorize domain-specific data based on predefined categories such as domain-specific lexicons, domain-specific terminologies, human interaction values, user-centric language, real-world scenarios, cross-domain relationships, user interactions, emotions and sentiments, contextual data, and industry processes.
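By way of a non-limiting illustration, a lightweight keyword-taxonomy classifier for the domain determination module 206 might look as follows. The taxonomy entries are invented for illustration; a deployed system would instead use trained classifiers or the secondary LLM itself.

```python
# Hypothetical sketch of domain determination via keyword-taxonomy overlap.
# A real module 206 would use trained classifiers over domain-specific
# vocabularies, terminologies, and linguistic patterns.

DOMAIN_TAXONOMY = {
    "E-commerce": {"buy", "laptop", "price", "order", "cart"},
    "Healthcare": {"symptom", "diagnosis", "medicine", "patient"},
    "Finance":    {"loan", "interest", "investment", "budget"},
}

def determine_domains(query: str, feedback: str) -> list[str]:
    """Return candidate domains ranked by keyword overlap (primary first)."""
    tokens = set(f"{query} {feedback}".lower().split())
    scores = {d: len(tokens & kws) for d, kws in DOMAIN_TAXONOMY.items()}
    ranked = sorted((d for d, s in scores.items() if s > 0),
                    key=lambda d: -scores[d])
    return ranked or ["General"]

print(determine_domains("Help me buy a laptop",
                        "it should've asked about my budget first"))
# -> ['E-commerce', 'Finance'] (primary and secondary candidate domains)
```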
[0047] With continued reference to FIG. 2, the deviation information module 208 may determine deviation information of the at least one output based on an analysis of the at least one user feedback, the feedback context information, and the at least one output with respect to a set of predefined attributes. The deviation information is the difference between the actual output and a desirable output from the primary LLM. The deviation information may include information about the identified gaps and areas of improvement of the output generated by the primary LLM based on the set of predefined attributes. In an embodiment, the set of predefined attributes may include, but are not limited to, lexicon attributes, terminology attributes, contextual attributes, human-interaction attributes, user-centric language attributes, real-world scenario simulation attributes, continuous learning attributes, cross-domain adaptability attributes, and emotional and sentiment attributes. Thus, the deviation information module 208 may analyze the outputs generated by the primary LLM in comparison to user expectations and domain-specific standards to identify areas where the primary LLM performance deviates from desired outcomes. The deviation information module 208 may determine deviation information based on analysis of user feedback, feedback context information, and primary LLM outputs with respect to a set of predefined attributes that define quality standards for language model responses. The predefined attributes may include lexicon attributes that evaluate vocabulary usage, terminology attributes that assess domain-specific term accuracy, contextual attributes that measure contextual appropriateness, human-interaction attributes that evaluate communication effectiveness, user-centric language attributes that assess user preference alignment, real-world scenario simulation attributes that measure practical applicability, continuous learning attributes that evaluate adaptability, cross-domain adaptability attributes that assess versatility, and emotional and sentiment attributes that measure emotional intelligence. In some cases, the deviation information module 208 may implement scoring algorithms that quantify the degree of deviation for each attribute and prioritize areas that require the most improvement during fine-tuning operations.
[0048] In an embodiment, the secondary LLM may be trained to determine the deviation information from the set of predefined attributes. The secondary LLM may have contextual information based on historical conversations between the user and the primary LLM. Further, the secondary LLM may be trained using prompt engineering and a Retrieval-Augmented Generation (RAG) framework, which may be used to retrieve the relevant data from the external database 116.
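As a non-limiting illustration, the deviation information might be represented as per-attribute scores. The 0-to-1 scale, the threshold, and the stub scoring rules below are assumptions rather than the disclosed implementation; in practice the secondary LLM, trained on the set of predefined attributes, would emit these scores.

```python
# Hypothetical sketch: quantifying deviation of the primary LLM output per
# predefined attribute, then keeping the attributes that exceed a threshold.

PREDEFINED_ATTRIBUTES = (
    "lexicon", "terminology", "contextual", "human-interaction",
    "user-centric language", "real-world scenario simulation",
    "continuous learning", "cross-domain adaptability",
    "emotional and sentiment",
)

def score_attributes(output: str, feedback: str, context_type: str) -> dict:
    """Stub scorer: 0.0 = no deviation, 1.0 = maximal deviation."""
    scores = dict.fromkeys(PREDEFINED_ATTRIBUTES, 0.0)
    if context_type == "correction":
        scores["contextual"] = 0.9             # e.g. the Apple fruit/company mix-up
    if "preference" in feedback.lower():
        scores["user-centric language"] = 0.8  # e.g. the laptop example
    return scores

def deviation_information(scores: dict, threshold: float = 0.5) -> list[str]:
    """Attributes exceeding the threshold, most severe first."""
    return sorted((a for a, s in scores.items() if s > threshold),
                  key=lambda a: -scores[a])

scores = score_attributes("Apple Inc. is a leading technology company...",
                          "I was asking about the fruit", "correction")
print(deviation_information(scores))  # -> ['contextual']
```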
[0049] The data retrieval module 210 may retrieve data related to the at least one domain from one of an external database or the first training data based on a comparison of the at least one domain and the first set of domains. In order to retrieve the data, the data retrieval module 210 may first determine an inclusive state of the at least one domain as one of inclusive or not-inclusive. It may be noted that the inclusive state of the at least one domain may be determined as inclusive in case the first set of domains is inclusive of the at least one domain. Further, it may be noted that the inclusive state of the at least one domain is determined as not-inclusive in case the first set of domains is not inclusive of the at least one domain.
[0050] In an embodiment, when the inclusive state of the at least one domain is determined as inclusive, the data retrieval module 210 may retrieve the data related to the at least one domain from the first training data. In another embodiment, when the inclusive state of the at least one domain is determined as not-inclusive, the data retrieval module 210 may retrieve the data related to the at least one domain from the external database.
[0051] Thus, the data retrieval module 210 may access and retrieve relevant data from various sources based on domain classifications and deviation analysis results provided by other modules. The data retrieval module 210 may retrieve data related to identified domains associated with the at least one user query and the at least one user feedback from the database 116 when the data related to the identified domains is not included in the first training data. Further, the data retrieval module 210 may retrieve data related to identified domains from the first training data stored in the memory 106 when the data related to the identified domains is already present in the first training data and included in the primary LLM training. In some cases, the data retrieval module 210 may access the external database 116, which includes industry sites, SharePoint repositories, Google Drive storage, and other cloud storage or document repositories as sources for updated domain-specific data. The data retrieval module 210 may also implement one or more re-ranker models to analyze updated knowledge databases and determine relevant domain-specific data corresponding to historical queries based on predefined relevancy parameters comprising domain-specific norms, communication styles, emotions and sentiments, and contextual information. The data retrieval module 210 may further categorize retrieved domain-specific data based on the predefined categories established by the domain determination module 206 and update knowledge databases with the categorized data to maintain organized and accessible training resources.
[0052] Referring to the above-mentioned exemplary user query “Help me buy a laptop”, the data retrieval module 210 may first determine the inclusive state of the determined domain, that is, “E-commerce”, associated with the user query and the corresponding feedback. In case the “E-commerce” domain is inclusive, meaning that the domain-related data is already present in the first training data, the data retrieval module 210 may retrieve the data from the first training data stored in the memory 106. However, in case the “E-commerce” domain is not-inclusive, meaning that the domain-related data is absent from the first training data, the data retrieval module 210 may retrieve the domain-related data from the external database 116.
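As a non-limiting sketch of the re-ranker models mentioned in paragraph [0051], a bag-of-words cosine re-ranker is shown below. An actual deployment would more likely use embedding-based re-ranker models; the documents here are invented for illustration, and the stdlib-only implementation keeps the sketch self-contained.

```python
# Hypothetical sketch: re-ranking retrieved domain documents by relevance
# to a historical query, standing in for module 210's re-ranker models.

import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rerank(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Order retrieved domain documents by relevance to the query."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(d.lower().split())), d) for d in documents]
    return [d for _, d in sorted(scored, reverse=True)[:top_k]]

docs = [
    "guide to laptop budget operating system and usage profiles",
    "quarterly sales report for the retail division",
    "how to ask buyers about preferences before recommending a laptop",
]
print(rerank("help me buy a laptop preferences budget", docs))
# -> the two most relevant domain documents, most relevant first
```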
[0053] As shown in FIG. 2, the finetuning data determination module 212 may determine finetuning data by processing the data based on the deviation information and the comparison. The finetuning data determination module 212 may process the data retrieved by the data retrieval module 210 based on the deviation information determined by the deviation information module 208 to generate finetuning data that addresses identified performance gaps in the primary LLM. The finetuning data determination module 212 may determine finetuning data by processing retrieved domain data based on deviation information and domain comparison results to create targeted training examples that address specific deficiencies. In some cases, the finetuning data determination module 212 may implement contextual feedback content generation where the secondary LLM actively tailors content to address user feedback by integrating industry-specific scenarios and adapting to evolving language patterns within the identified domains. The finetuning data determination module 212 may also determine whether to use a subset of predefined attributes or the complete set of attributes for finetuning based on the deviation information. The finetuning data determination module 212 may further incorporate cross-verification mechanisms to assess and validate the quality of generated finetuning data, enhancing reliability and accuracy while mitigating potential biases that could be introduced during the fine-tuning process.
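By way of a non-limiting illustration, the subset-versus-full-attribute branching described in paragraphs [0050] and [0053] (and in steps 312-314 below) could be expressed as follows. The record layout and field names are assumptions, not the disclosed format.

```python
# Hypothetical sketch of the finetuning data determination (module 212):
# per the disclosure, a subset of the predefined attributes is used when
# the data comes from the first training data (inclusive domain), and the
# full set is used when it comes from the external database.

ALL_ATTRIBUTES = [
    "lexicon", "terminology", "contextual", "human-interaction",
    "user-centric language", "real-world scenario simulation",
    "continuous learning", "cross-domain adaptability",
    "emotional and sentiment",
]

def build_finetuning_data(retrieved_docs: list[str],
                          deviating_attributes: list[str],
                          inclusive: bool) -> list[dict]:
    """Turn retrieved domain data into finetuning records."""
    attributes = deviating_attributes if inclusive else ALL_ATTRIBUTES
    update = "label-finetuning" if inclusive else "parameter-modification"
    return [{"source_text": doc, "target_attributes": attributes,
             "update_type": update} for doc in retrieved_docs]

records = build_finetuning_data(
    ["how to ask buyers about preferences before recommending a laptop"],
    deviating_attributes=["user-centric language"],
    inclusive=False)  # E-commerce absent from the first set of domains
print(records[0]["update_type"], len(records[0]["target_attributes"]))
# -> parameter-modification 9
```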
[0054] Accordingly, the training module 214 may serve as the final component within the functional block diagram 200 to implement the adaptive training of the primary LLM using the finetuning data generated by the finetuning data determination module 212. The training module 214 may periodically modify parameters of the primary LLM using the finetuning data when new domains are being incorporated, or may finetune labels associated with existing training data when working within domains already included in the first set of domains. In some cases, the training module 214 may implement iterative training procedures that continuously refine the primary LLM based on ongoing user feedback and evolving domain requirements. The training module 214 may also coordinate with the primary LLM server 102 via the communication network 112 to ensure that training updates are properly synchronized and applied to the primary LLM hosted on the primary LLM server 102. The training module 214 may further implement comprehensive insights into human interactions within specific domains by incorporating understanding of communication styles, tone preferences, and unwritten rules governing effective human-machine interactions to ensure that the trained primary LLM aligns with domain-specific interaction expectations and user preferences.
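As a non-limiting sketch of the periodic/accumulation-based trigger described for the training module 214 (see also paragraph [0038]), training could be scheduled as below. The thresholds and the class interface are illustrative assumptions.

```python
# Hypothetical sketch: trigger adaptive training after every predefined
# training time period or upon accumulation of user feedback.

import time

class TrainingScheduler:
    def __init__(self, period_s: float = 24 * 3600, batch_size: int = 100):
        self.period_s = period_s      # predefined training time period
        self.batch_size = batch_size  # feedback accumulation threshold
        self.buffer: list[dict] = []
        self.last_run = time.time()

    def submit(self, finetuning_record: dict) -> None:
        self.buffer.append(finetuning_record)
        if len(self.buffer) >= self.batch_size or \
                time.time() - self.last_run >= self.period_s:
            self.run_training()

    def run_training(self) -> None:
        # In a real system this would push self.buffer to the primary LLM
        # server 102 over the communication network 112.
        print(f"training primary LLM on {len(self.buffer)} records")
        self.buffer.clear()
        self.last_run = time.time()

sched = TrainingScheduler(batch_size=2)
sched.submit({"prompt": "example 1"})
sched.submit({"prompt": "example 2"})  # hits the accumulation threshold
```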
[0055] It should be noted that all such aforementioned modules 202-214 may be represented as a single module or a combination of different modules. Further, as will be appreciated by those skilled in the art, each of the modules 202-214 may reside, in whole or in parts, on one device or multiple devices in communication with each other. In some embodiments, each of the modules 202-214 may be implemented as a dedicated hardware circuit comprising a custom application-specific integrated circuit (ASIC) or gate arrays, off-the-shelf semiconductors such as logic chips, transistors, or other discrete components. Each of the modules 202-214 may also be implemented in a programmable hardware device such as a field-programmable gate array (FPGA), programmable array logic, programmable logic device, and so forth. Alternatively, each of the modules 202-214 may be implemented in software for execution by various types of processors (e.g., the first processor 104, the second processor 108). An identified module of executable code may, for instance, include one or more physical or logical blocks of computer instructions, which may, for instance, be organized as an object, procedure, function, or other construct. Nevertheless, the executables of an identified module or component need not be physically located together but may include disparate instructions stored in different locations which, when joined logically together, include the module and achieve the stated purpose of the module. Indeed, a module of executable code could be a single instruction, or many instructions, and may even be distributed over several different code segments, among different applications, and across several memory devices.
[0056] Referring now to FIG. 3, a flow diagram 300 of a methodology for fine-tuning a primary LLM using a secondary LLM is illustrated, in accordance with an embodiment of the present disclosure. FIG. 3 is explained in conjunction with FIG. 1 and FIG. 2. In an embodiment, the flow diagram 300 may include a plurality of steps that may be performed by various modules of the secondary LLM server 110 so as to fine-tune the primary LLM using the secondary LLM.
[0057] At step 302, the secondary LLM may determine at least one user feedback associated with at least one output generated by the primary LLM for at least one user query. A user query may be asked in relation to a specific domain based on the user's interest. Further, the user query is fed to the primary LLM through the user device 114. Accordingly, the primary LLM may process the user query and generate an output for the user. Further, the user may provide their input as one or more feedbacks for improvement of the output generated by the primary LLM. It may be noted that the primary LLM may be pre-trained based on first training data related to a first set of domains. Thus, the primary LLM may be proficient in generating an output for queries related to the first set of domains. In case the user query includes matter related to other domains different from the first set of domains, the generated output may lack accuracy, clarity, structure, tone, etc. Thus, the feedback provided by the user may be indicative of a deficiency in the output generated by the primary LLM.
[0058] At step 304, the secondary LLM may determine feedback context information of the at least one user feedback by determining a context type from a plurality of context types associated with the at least one user feedback. In an embodiment, the plurality of context types may include, but is not limited to, corrections, clarifications, or specific contextual queries related to the one or more feedbacks. The feedback context information is determined in order to understand where the actual output falls short with respect to the context of the user query, and how the output misaligns with the user expectations.
[0059] Further, at step 306, the secondary LLM may determine one or more domains associated with the at least one user query and the at least one user feedback. Further, at step 308, the secondary LLM may determine deviation information of the at least one output based on an analysis of the at least one user feedback, the feedback context information, and the at least one output with respect to a set of predefined attributes. In an embodiment, the set of predefined attributes may include lexicon attributes, terminology attributes, contextual attributes, human-interaction attributes, user-centric language attributes, real-world scenario simulation attributes, continuous learning attributes, cross-domain adaptability attributes, and emotional and sentiment attributes. Thus, the deviation information may include information about the identified gaps and areas of improvement of the output generated by the primary LLM based on the set of predefined attributes. Further, it may be noted that the secondary LLM may be trained to determine the deviation information based on the set of predefined attributes.
[0060] Further, at step 310, the secondary LLM may retrieve data related to the one or more domains from one of an external database or the first training data based on a comparison of the one or more domains and the first set of domains. Retrieval based on a comparison of the one or more domains and the first set of domains is described in greater detail with respect to FIG. 4 below. Further, at step 312, the secondary LLM may determine finetuning data by processing the retrieved data based on the deviation information and the comparison. In case the data is retrieved from the first training data, the secondary LLM may determine a subset of attributes from the set of predefined attributes based on the deviation information. Accordingly, the finetuning data may be determined based on the subset of attributes. Further, in case the data is retrieved from the external database, the secondary LLM may determine the finetuning data based on each of the set of predefined attributes.
[0061] Further, at step 314, the secondary LLM may adaptively train the primary LLM based on the finetuning data in a periodic manner. In case the finetuning data is determined based on the subset of attributes, the adaptive training of the primary LLM may be performed by finetuning a set of labels associated with the data based on the subset of attributes. Further, in case the finetuning data is determined based on each of the set of predefined attributes, the adaptive training of the primary LLM may be performed periodically by modifying a set of parameters of the primary LLM using the finetuning data.
[0062] Referring now to FIG. 4, a flow diagram 400 of a methodology to determine an inclusive state of the one or more domains is illustrated, in accordance with an embodiment of the present disclosure. At step 402, an inclusive state of the one or more domains is determined based on the comparison of the one or more domains and the first set of domains. It may be noted that the one or more domains associated with the user query and the one or more feedbacks may be determined, and that the primary LLM may not be trained in the one or more domains, or may be hallucinating while generating outputs with respect to user queries pertaining to the one or more domains. Based on the presence of each of the one or more domains in the first set of domains on which the primary LLM has been pre-trained, the secondary LLM may determine the inclusive state of the one or more domains as one of inclusive or not-inclusive.
[0063] Further, at step 404, in case the inclusive state of the one or more domains is determined as inclusive, the secondary LLM may retrieve the data from the first training data saved in the memory 106. In case, at step 404, the inclusive state of the one or more domains is determined as not-inclusive, the secondary LLM may retrieve the data from the external database. Accordingly, at step 406, the secondary LLM may determine a subset of attributes from the set of predefined attributes based on the deviation information. The primary LLM may then be trained based on the finetuning data as discussed above in step 312 of FIG. 3.
[0064] Thus, the disclosed method 300 and system 100 overcome the challenges associated with manual finetuning of LLMs by providing an approach to finetune a primary LLM using a secondary LLM. The disclosed method 300 and system 100 continuously fine-tune the primary LLM using insights extracted by the secondary LLM. This continuous fine-tuning loop allows the system to keep up to date with the contextual knowledge of the specific domain. The disclosed method 300 and system 100 are time- and cost-effective, saving the time and cost consumed by manual human intervention to finetune the primary LLM. Further, the invention provides a user-centric output with the right communication style and the right tone based on the user expectations.
[0065] As will be appreciated by those skilled in the art, the techniques described in the various embodiments discussed above are not routine, or conventional, or well-understood in the art. The techniques discussed above provide for fine-tuning a primary LLM using a secondary LLM based on user feedback.
[0066] In light of the above-mentioned advantages and the technical advancements provided by the disclosed method and system, the claimed steps as discussed above are not routine, conventional, or well understood in the art, as the claimed steps enable the following solutions to the existing problems in conventional technologies. Further, the claimed steps clearly bring an improvement in the functioning of the device itself as the claimed steps provide a technical solution to a technical problem.
[0067] Machine-readable storage may include machine-readable instructions which, when executed, implement a method or realize a system as described in any of the examples of the present application. The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[0068] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, non-volatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[0069] Various techniques, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, a non-transitory computer-readable storage medium, or any other machine-readable storage medium wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the various techniques. In the case of program code execution on programmable computers, the computing device may include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. The volatile and non-volatile memory and/or storage elements may be a RAM, an EPROM, a flash drive, an optical drive, a magnetic hard drive, or another medium for storing electronic data. One or more programs that may implement or utilize the various techniques described herein may use an application programming interface (API), reusable controls, and the like. Such programs may be implemented in a high-level procedural or an object-oriented programming language to communicate with a computer system. However, the program(s) may be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or an interpreted language, and combined with hardware implementations.
[0070] It is intended that the disclosure and examples be considered as exemplary only, with a true scope and spirit of disclosed embodiments being indicated by the following claims.

CLAIMS
I/We Claim:
1. A method (300) for fine-tuning a primary large language model (LLM) using a secondary LLM, the method (300) comprising:
determining (302), by the secondary LLM, at least one user feedback associated with at least one output generated by the primary LLM for at least one user query,
wherein the primary LLM is pre-trained based on first training data related to a first set of domains;
determining (304), by the secondary LLM, feedback context information of the at least one user feedback by determining a context type from a plurality of context types associated with the at least one user feedback;
determining (306), by the secondary LLM, at least one domain associated with the at least one user query and the at least one user feedback;
determining (308), by the secondary LLM, deviation information of the at least one output based on an analysis of the at least one user feedback, the feedback context information, and the at least one output with respect to a set of predefined attributes,
wherein the secondary LLM is trained to determine the deviation information from the set of predefined attributes;
retrieving (310), by the secondary LLM, data related to the at least one domain from one of: an external database or the first training data based on a comparison of the at least one domain and the first set of domains;
determining (312), by the secondary LLM, finetuning data by processing the data based on the deviation information and the comparison; and
adaptively training (314), by the secondary LLM, the primary LLM based on the finetuning data periodically.

2. The method (300) as claimed in claim 1, wherein the set of predefined attributes comprises lexicon attributes, terminology attributes, contextual attributes, human-interaction attributes, user-centric language attributes, real-world scenario simulation attributes, continuous learning attributes, cross-domain adaptability attributes, and emotional and sentiment attributes.

3. The method (300) as claimed in claim 1, comprising:
determining an inclusive state of the at least one domain as one of inclusive or non-inclusive based on the comparison,
wherein the inclusive state of the at least one domain is determined as inclusive in case the first set of domains is inclusive of the at least one domain, and
wherein the inclusive state of the at least one domain is determined as non-inclusive in case the first set of domains is not inclusive of the at least one domain.

4. The method (300) as claimed in claim 3, comprising:
retrieving, by the secondary LLM, the data related to the at least one domain from the first training data in case the inclusive state of the at least one domain is determined as inclusive; and
determining, by the secondary LLM, a subset of attributes from the set of predefined attributes based on the deviation information,
wherein the finetuning data is determined based on the subset of attributes, and
wherein the adaptive training of the primary LLM comprises finetuning a set of labels associated with the data based on the subset of attributes.

5. The method (300) as claimed in claim 3, comprising:
retrieving, by the secondary LLM, the data related to the at least one domain from the external database in case the inclusive state of the at least one domain is determined as non-inclusive,
wherein the finetuning data is determined based on each attribute of the set of predefined attributes, and
wherein the adaptive training of the primary LLM comprises periodically modifying a set of parameters of the primary LLM using the finetuning data.

6. A system (100) for fine-tuning a primary large language model (LLM) using a secondary LLM, the system (100) comprising:
a processor (108) enabling the secondary LLM and communicably coupled to a server (102) enabling the primary LLM;
a memory (109) communicably coupled to the processor (108), wherein the memory (109) stores processor-executable instructions, which, on execution by the processor (108), cause the processor (108) to:
determine, using the secondary LLM, at least one user feedback associated with at least one output generated by the primary LLM for at least one user query,
wherein the primary LLM is pre-trained based on first training data related to a first set of domains;
determine, using the secondary LLM, feedback context information of the at least one user feedback,
wherein determining the feedback context information comprises determining a context type from a plurality of context types associated with the at least one user feedback;
determine, using the secondary LLM, at least one domain associated with the at least one user query and the at least one user feedback;
determine, using the secondary LLM, deviation information of the at least one output based on an analysis of the at least one user feedback, the feedback context information, and the at least one output with respect to a set of predefined attributes,
wherein the secondary LLM is trained to determine the deviation information from the set of predefined attributes;
retrieve, using the secondary LLM, data related to the at least one domain from one of: an external database or the first training data based on a comparison of the at least one domain and the first set of domains;
determine, using the secondary LLM, finetuning data by processing the data based on the deviation information and the comparison; and
adaptively train, using the secondary LLM, the primary LLM based on the finetuning data periodically.

7. The system (100) as claimed in claim 6, wherein the set of predefined attributes comprises lexicon attributes, terminology attributes, contextual attributes, human-interaction attributes, user-centric language attributes, real-world scenario simulation attributes, continuous learning attributes, cross-domain adaptability attributes, and emotional and sentiment attributes.

8. The system (100) as claimed in claim 6, wherein the processor (108) is configured to:
determine an inclusive state of the at least one domain as one of inclusive or non-inclusive based on the comparison,
wherein the inclusive state of the at least one domain is determined as inclusive in case the first set of domains is inclusive of the at least one domain, and
wherein the inclusive state of the at least one domain is determined as non-inclusive in case the first set of domains is not inclusive of the at least one domain.

9. The system (100) as claimed in claim 8, wherein the processor (108) is configured to:
retrieve the data related to the at least one domain from the first training data in case the inclusive state of the at least one domain is determined as inclusive; and
determine a subset of attributes from the set of predefined attributes based on the deviation information,
wherein the finetuning data is determined based on the subset of attributes, and
wherein the adaptive training of the primary LLM comprises finetuning a set of labels associated with the data based on the subset of attributes.

10. The system (100) as claimed in claim 8, wherein the processor (108) is configured to:
retrieve the data related to the at least one domain from the external database in case the inclusive state of the at least one domain is determined as non-inclusive,
wherein the finetuning data is determined based on each attribute of the set of predefined attributes, and
wherein the adaptive training of the primary LLM comprises periodically modifying a set of parameters of the primary LLM using the finetuning data.

Documents

Application Documents

# Name Date
1 202511087630-STATEMENT OF UNDERTAKING (FORM 3) [15-09-2025(online)].pdf 2025-09-15
2 202511087630-REQUEST FOR EXAMINATION (FORM-18) [15-09-2025(online)].pdf 2025-09-15
3 202511087630-REQUEST FOR EARLY PUBLICATION(FORM-9) [15-09-2025(online)].pdf 2025-09-15
4 202511087630-PROOF OF RIGHT [15-09-2025(online)].pdf 2025-09-15
5 202511087630-POWER OF AUTHORITY [15-09-2025(online)].pdf 2025-09-15
6 202511087630-FORM-9 [15-09-2025(online)].pdf 2025-09-15
7 202511087630-FORM 18 [15-09-2025(online)].pdf 2025-09-15
8 202511087630-FORM 1 [15-09-2025(online)].pdf 2025-09-15
9 202511087630-FIGURE OF ABSTRACT [15-09-2025(online)].pdf 2025-09-15
10 202511087630-DRAWINGS [15-09-2025(online)].pdf 2025-09-15
11 202511087630-DECLARATION OF INVENTORSHIP (FORM 5) [15-09-2025(online)].pdf 2025-09-15
12 202511087630-COMPLETE SPECIFICATION [15-09-2025(online)].pdf 2025-09-15