Abstract: SYSTEM AND METHOD FOR TRAINING A PLURALITY OF AI MODELS IN A SEQUENTIAL ORDER. The present invention relates to a system (120) and a method (500) for training a plurality of AI models in a sequential order. The system (120) includes a receiving unit (220) configured to receive data required to perform a task from data sources. The system (120) includes a training unit (225) to train a plurality of AI models (310) in the sequential order. Each of the plurality of AI models (310) is configured to learn from interactions and transformations as the data is utilized by each of the plurality of AI models (310). The system (120) includes a comparing unit (230) to compare a final output generated by a final AI model as per the sequence of the plurality of AI models (310) with a task specific objective function. The system (120) includes an updating unit (235) to update one or more parameters of the plurality of AI models (310) to minimize the task specific objective function. Ref. Fig. 2
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENTS RULES, 2003
COMPLETE SPECIFICATION
(See section 10 and rule 13)
1. TITLE OF THE INVENTION
SYSTEM AND METHOD FOR TRAINING A PLURALITY OF AI MODELS IN A SEQUENTIAL ORDER
2. APPLICANT(S)
NAME NATIONALITY ADDRESS
JIO PLATFORMS LIMITED INDIAN OFFICE-101, SAFFRON, NR. CENTRE POINT, PANCHWATI 5 RASTA, AMBAWADI, AHMEDABAD 380006, GUJARAT, INDIA
3.PREAMBLE TO THE DESCRIPTION
THE FOLLOWING SPECIFICATION PARTICULARLY DESCRIBES THE NATURE OF THIS INVENTION AND THE MANNER IN WHICH IT IS TO BE PERFORMED.
FIELD OF THE INVENTION
[0001] The present invention relates to the field of wireless communication networks, more particularly to a system and a method for training a plurality of Artificial Intelligence (AI) models arranged in a sequential order.
BACKGROUND OF THE INVENTION
[0002] With the increase in the number of users, network service provisions have been upgraded to enhance service quality so as to keep pace with such high demand. With the advancement of technology, there is a demand for telecommunication services to incorporate up-to-date features so as to enhance user experience. For this purpose, integrating Artificial Intelligence (AI) and Machine Learning (ML) into various network practices, such as estimating network performance, tracking the health of a network, enhancing user-interactive features, and monitoring security, has become essential. Incorporating advanced AI/ML methodology has become a priority to keep up with the rapidly evolving telecom sector. AI/ML incorporation is usually performed by training models on specific data sets to enable them to recognize patterns and trends and, based on these, to predict the required output. ML training on data extracted from a data source is performed by a specifically constructed system.
[0003] Traditional machine learning models, which depend on predefined input features, face limitations in their capacity for end-to-end learning. This hinders their ability to leverage valuable insights that could arise from interactions within a sequential model pipeline. In addition, the complexity and multimodality of data in real-world applications pose a significant challenge for effective processing, and current approaches struggle to seamlessly integrate various data types or sources, hindering the development of robust and adaptable solutions.
[0004] In contemporary machine learning approaches, achieving hierarchical learning, where higher-level concepts are built upon lower-level ones, remains a challenge. There is a lack of appropriate mechanisms to effectively propagate and integrate information across different levels of abstraction, limiting a model's capacity to grasp complex relationships within the data. Some tasks, like time series prediction or natural language processing, require models to understand and process sequences of data. A pipeline approach may naturally handle such sequential data.
[0005] Considering the above, there is a requirement for a method which would effectively propagate and integrate data to recognize patterns clearly and refine the learning process by employing complex machine learning methodologies. There is also a need for a system that may adapt to changing data over time with progressive learning, e.g., evolving to provide recommendations for improvement of the system or for anomaly detection, and for continuous training by incrementally updating the models in the pipeline.
[0006] There is a need for a method and a system thereof to effectively propagate and integrate information across different levels of abstraction. The system and method thereof may be further configured to perform sequential AI model training through output-to-input pipelining.
SUMMARY OF THE INVENTION
[0007] One or more embodiments of the present disclosure provide a method and a system for training of a plurality of Artificial Intelligence (AI) models arranged in a sequential order.
[0008] In one aspect of the present invention, the method for training of a plurality of AI models arranged in a sequential order is disclosed. The method includes the step of receiving, by one or more processors, data required to perform a task from one or more data sources. The method includes the step of training, by the one or more processors, the plurality of AI models in the sequential order utilizing the received data. Each of the plurality of AI models is configured to learn from interactions and transformations as the data is utilized by each of the plurality of AI models. The method includes the step of comparing, by the one or more processors, a final output generated by a final AI model as per the sequence of the plurality of AI models with a task specific objective function. The method includes the step of updating, by the one or more processors, one or more parameters of each AI model of the plurality of AI models to minimize the task specific objective function.
[0009] In one embodiment, the one or more data sources is at least one of a Network Management System (NMS) and a Fulfilment Management System (FMS), and wherein a type of the data is one of a text type data, an image type data, and a numerical type data.
[0010] In another embodiment, a first AI model in the plurality of AI models extracts one or more features of the received data. An output of the first AI model is an input for a subsequent AI model as per the sequence of the plurality of AI models. An output of each AI model in the plurality of AI models is an input for the subsequent AI model.
[0011] In another aspect of the present invention, the system for training of a plurality of Artificial Intelligence (AI) models arranged in a sequential order is disclosed. The system includes a receiving unit configured to receive data required to perform a task from one or more data sources. The system includes a training unit configured to train a plurality of AI models in the sequential order utilizing the received data. Each of the plurality of AI models is configured to learn from interactions and transformations as the data is utilized by each of the plurality of AI models. The system includes a comparing unit configured to compare a final output generated by a final AI model as per the sequence of the plurality of AI models with a task specific objective function. The system includes an updating unit configured to update one or more parameters of each AI model of the plurality of AI models to minimize the task specific objective function.
[0012] In another aspect of the present invention, a non-transitory computer-readable medium having stored thereon computer-readable instructions that, when executed by a processor, cause the processor to perform operations is disclosed. The processor is configured to receive data required to perform a task from one or more data sources. The processor is configured to train a plurality of AI models in a sequential order utilizing the received data. Each of the plurality of AI models is configured to learn from interactions and transformations as the data is utilized by each of the plurality of AI models. The processor is configured to compare a final output generated by a final AI model as per the sequence of the plurality of AI models with a task specific objective function. The processor is configured to update one or more parameters of each AI model of the plurality of AI models to minimize the task specific objective function.
[0013] Other features and aspects of this invention will be apparent from the following description and the accompanying drawings. The features and advantages described in this summary and in the following detailed description are not all-inclusive, and particularly, many additional features and advantages will be apparent to one of ordinary skill in the relevant art, in view of the drawings, specification, and claims hereof. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes and may not have been selected to delineate or circumscribe the inventive subject matter, resort to the claims being necessary to determine such inventive subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS
[0014] The accompanying drawings, which are incorporated herein, and constitute a part of this disclosure, illustrate exemplary embodiments of the disclosed methods and systems in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Some drawings may indicate the components using block diagrams and may not represent the internal circuitry of each component. It will be appreciated by those skilled in the art that disclosure of such drawings includes disclosure of electrical components, electronic components or circuitry commonly used to implement such components.
[0015] FIG. 1 is an exemplary block diagram of an environment for training of a plurality of Artificial Intelligence (AI) models arranged in a sequential order, according to one or more embodiments of the present disclosure;
[0016] FIG. 2 is an exemplary block diagram of a system for training of the plurality of AI models arranged in the sequential order, according to the one or more embodiments of the present disclosure;
[0017] FIG. 3 is a block diagram of an architecture that can be implemented in the system of FIG.2, according to the one or more embodiments of the present disclosure;
[0018] FIG. 4 is a flow chart illustrating a method for training of the plurality of AI models arranged in the sequential order, according to the one or more embodiments of the present disclosure; and
[0019] FIG. 5 is a flow diagram illustrating the method for training of the plurality of AI models arranged in the sequential order, according to the one or more embodiments of the present disclosure.
[0020] The foregoing shall be more apparent from the following detailed description of the invention.
DETAILED DESCRIPTION OF THE INVENTION
[0021] Some embodiments of the present disclosure, illustrating all its features, will now be discussed in detail. It must also be noted that as used herein and in the appended claims, the singular forms "a", "an" and "the" include plural references unless the context clearly dictates otherwise.
[0022] Various modifications to the embodiment will be readily apparent to those skilled in the art and the generic principles herein may be applied to other embodiments. However, one of ordinary skill in the art will readily recognize that the present disclosure including the definitions listed here below are not intended to be limited to the embodiments illustrated but is to be accorded the widest scope consistent with the principles and features described herein.
[0023] A person of ordinary skill in the art will readily ascertain that the illustrated steps detailed in the figures and here below are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[0024] Referring to FIG. 1, FIG. 1 illustrates an exemplary block diagram of an environment 100 for training of a plurality of Artificial Intelligence (AI) models 310 (as shown in FIG.3) arranged in a sequential order, according to one or more embodiments of the present invention. The environment 100 includes a network 105, a User Equipment (UE) 110, a server 115, and a system 120. The UE 110 aids a user to interact with the system 120 for training of the plurality of AI models 310 arranged in a sequential order. In an embodiment, the user is at least one of, a network operator, and a service provider. Training the plurality of AI models 310 arranged in the sequential order typically involves a multi-stage process where each AI model builds upon the outputs or features learned by the previous AI model. The sequential training approach can enhance performance, especially in complex tasks like natural language processing or computer vision.
[0025] For the purpose of description and explanation, the description will be explained with respect to the UE 110, or to be more specific will be explained with respect to a first UE 110a, a second UE 110b, and a third UE 110c, and should nowhere be construed as limiting the scope of the present disclosure. Each of the UE 110 from the first UE 110a, the second UE 110b, and the third UE 110c is configured to connect to the server 115 via the network 105. In an embodiment, each of the first UE 110a, the second UE 110b, and the third UE 110c is one of, but not limited to, any electrical, electronic, electro-mechanical or an equipment and a combination of one or more of the above devices such as smartphones, virtual reality (VR) devices, augmented reality (AR) devices, laptop, a general-purpose computer, desktop, personal digital assistant, tablet computer, mainframe computer, or any other computing device.
[0026] The network 105 includes, by way of example but not limitation, one or more of a wireless network, a wired network, an internet, an intranet, a public network, a private network, a packet-switched network, a circuit-switched network, an ad hoc network, an infrastructure network, a Public-Switched Telephone Network (PSTN), a cable network, a cellular network, a satellite network, a fiber optic network, or some combination thereof. The network 105 may include, but is not limited to, a Third Generation (3G), a Fourth Generation (4G), a Fifth Generation (5G), a Sixth Generation (6G), a New Radio (NR), a Narrow Band Internet of Things (NB-IoT), an Open Radio Access Network (O-RAN), and the like.
[0027] The server 115 may include, by way of example but not limitation, one or more of a standalone server, a server blade, a server rack, a bank of servers, a server farm, hardware supporting a part of a cloud service or system, a home server, hardware running a virtualized server, one or more processors executing code to function as a server, one or more machines performing server-side functionality as described herein, at least a portion of any of the above, or some combination thereof. In an embodiment, an entity operating the server 115 may include, but is not limited to, a vendor, a network operator, a company, an organization, a university, a lab facility, a business enterprise, a defense facility, or any other facility that provides content.
[0028] The environment 100 further includes the system 120 communicably coupled to the server 115 and each of the first UE 110a, the second UE 110b, and the third UE 110c via the network 105. The system 120 is configured for training of the plurality of AI models 310 arranged in the sequential order. The system 120 is adapted to be embedded within the server 115 or is embedded as the individual entity, as per multiple embodiments of the present invention.
[0029] Operational and construction features of the system 120 will be explained in detail with respect to the following figures.
[0030] FIG. 2 is an exemplary block diagram of the system 120 for training of the plurality of AI models 310 arranged in the sequential order, according to one or more embodiments of the present disclosure.
[0031] The system 120 includes a processor 205, a memory 210, a user interface 215, and a database 240. For the purpose of description and explanation, the description will be explained with respect to one or more processors 205, or to be more specific will be explained with respect to the processor 205, and should nowhere be construed as limiting the scope of the present disclosure. The one or more processors 205, hereinafter referred to as the processor 205, may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, single board computers, and/or any devices that manipulate signals based on operational instructions.
[0032] As per the illustrated embodiment, the processor 205 is configured to fetch and execute computer-readable instructions stored in the memory 210. The memory 210 may be configured to store one or more computer-readable instructions or routines in a non-transitory computer-readable storage medium, which may be fetched and executed to create or share data packets over a network service. The memory 210 may include any non-transitory storage device including, for example, volatile memory such as RAM, or non-volatile memory such as EPROM, flash memory, and the like.
[0033] The User Interface (UI) 215 includes a variety of interfaces, for example, interfaces for a Graphical User Interface (GUI), a web user interface, a Command Line Interface (CLI), and the like. The user interface 215 facilitates communication of the system 120. In one embodiment, the user interface 215 provides a communication pathway for one or more components of the system 120. Examples of the one or more components include, but are not limited to, the UE 110, and the database 240.
[0034] The database 240 is one of, but not limited to, a centralized database, a cloud-based database, a commercial database, an open-source database, a distributed database, an end-user database, a graphical database, a Non-Structured Query Language (NoSQL) database, an object-oriented database, a personal database, an in-memory database, a document-based database, a time series database, a wide column database, a key value database, a search database, a cache database, and so forth. The foregoing examples of database types are non-limiting and may not be mutually exclusive, e.g., a database can be both commercial and cloud-based, or both relational and open-source, etc.
[0035] Further, the processor 205, in an embodiment, may be implemented as a combination of hardware and programming (for example, programmable instructions) to implement one or more functionalities of the processor 205. In the examples described herein, such combinations of hardware and programming may be implemented in several different ways. For example, the programming for the processor 205 may be processor-executable instructions stored on a non-transitory machine-readable storage medium and the hardware for processor 205 may comprise a processing resource (for example, one or more processors), to execute such instructions. In the present examples, the memory 210 may store instructions that, when executed by the processing resource, implement the processor 205. In such examples, the system 120 may comprise the memory 210 storing the instructions and the processing resource to execute the instructions, or the memory 210 may be separate but accessible to the system 120 and the processing resource. In other examples, the processor 205 may be implemented by electronic circuitry.
[0036] In order for the system 120 to train the plurality of AI models 310 arranged in the sequential order, the processor 205 includes a receiving unit 220, a training unit 225, a comparing unit 230, and an updating unit 235 communicably coupled to each other. In an embodiment, the operations and functionalities of the receiving unit 220, the training unit 225, the comparing unit 230, and the updating unit 235 can be used in combination or interchangeably.
[0037] The receiving unit 220 is configured to receive data required to perform a task from one or more data sources. In an embodiment, the one or more data sources are at least one of a Network Management System (NMS) and a Fulfilment Management System (FMS). In an embodiment, a type of the data received from the one or more data sources is one of a text type data, an image type data, and a numerical type data. The text type data involves natural language data or other textual information. The image type data involves visual data, which is relevant for tasks like image recognition or processing. The numerical type data involves quantitative data that could be used for statistical analysis or modeling. In an embodiment, the task is at least one of predicting future values in a time series and generating natural language responses. In another embodiment, the task includes, but is not limited to, a classification task, a clustering task, and a regression task. Predicting future values in the time series involves forecasting trends or values based on historical data. Generating natural language responses involves creating responses or dialogues in natural language. The task can provide an effective way to process and understand the data.
[0038] Upon receiving the data required to perform the task from the one or more data sources, the training unit 225 is configured to train the plurality of AI models 310 in the sequential order. The training unit 225 trains the plurality of AI models 310 in a specific sequence. Each model is trained one after the other, rather than simultaneously. The output of one model may serve as input for the next, allowing for cumulative learning. Each of the plurality of AI models 310 is configured to learn from interactions and transformations as the data is utilized by each of the plurality of AI models 310. Each model is designed to learn from its interactions with the data, which can involve understanding patterns, relationships, and nuances within the data. The interactions refer to how each AI model engages with the data it receives. The data from the one or more data sources is analyzed to identify patterns or features. The output is generated based on the data, which may inform subsequent models in the sequence. The outputs from previous AI models, or validation data, are utilized to adjust the model parameters and improve future performance.
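For illustration only, the output-to-input pipelining described above may be sketched as follows. The stage structure and the toy transformations are hypothetical and are not part of the claimed invention; in practice each stage would be a trained AI model.

```python
# A minimal sketch of sequential output-to-input pipelining: the output
# of each stage becomes the input of the next stage in the sequence.

class Stage:
    """One model in the sequence: transforms its input and passes it on."""
    def __init__(self, transform):
        self.transform = transform

    def forward(self, data):
        return self.transform(data)

def run_pipeline(stages, data):
    """Feed the output of each stage as the input of the next stage."""
    for stage in stages:
        data = stage.forward(data)
    return data

# Example: three toy stages (scale, shift, square) chained in order.
pipeline = [
    Stage(lambda x: [v * 2 for v in x]),   # first model: scale features
    Stage(lambda x: [v + 1 for v in x]),   # second model: shift
    Stage(lambda x: [v ** 2 for v in x]),  # final model: non-linear map
]
final_output = run_pipeline(pipeline, [1.0, 2.0])
```

The same chaining applies regardless of how many models the sequence contains; each stage only needs to accept the representation produced by its predecessor.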
[0039] As per the above embodiment, the transformations involve altering the data passing through each AI model. The data undergoes various transformations, which include feature extraction, normalization or scaling, dimensionality reduction, or other preprocessing techniques. The feature extraction involves identifying and isolating relevant features from the raw data to simplify the input for subsequent models. The normalization or scaling refers to adjusting the data range or distribution to ensure consistency across the inputs. The dimensionality reduction refers to reducing the number of variables under consideration to focus on the significant features. Continuous learning implies that each AI model adapts and improves over time based on the data it processes and the feedback it receives, which allows for a more robust learning framework. Owing to the continuous learning process, each of the plurality of AI models 310 refines its capabilities through iterative training. Together, the interactions and transformations enable each AI model of the plurality of AI models 310 to learn effectively from the data, refining its outputs and improving overall performance as each AI model processes information in a sequential manner.
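As one concrete example of the transformations mentioned above, min-max normalization adjusts the data range for consistency across inputs. The pure-Python form and the [0, 1] target range here are illustrative assumptions, not a limitation of the described system.

```python
# Illustrative min-max normalization: scale numeric data into [0, 1]
# so that inputs to subsequent models share a consistent range.

def min_max_normalize(values):
    """Rescale each value relative to the observed minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # avoid division by zero on flat data
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

normalized = min_max_normalize([10.0, 20.0, 30.0])
```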
[0040] As per the above embodiment, a first AI model 310a (as shown in FIG. 3) in the plurality of AI models 310 extracts one or more features of the received data. The first AI model 310a analyzes the received data to identify and extract the one or more features. Feature extraction involves transforming the raw data into a set of characteristics that are more informative for the task at hand. In an exemplary embodiment, in image processing, the features of the image type data include edges, colors, or textures, while in text analysis, the features of the text type data involve keywords or sentiment indicators. The one or more features extracted by the first AI model 310a become the output that serves as input for the next model in the sequence. This creates a flow of information, where each model builds upon the previous one's findings. An output of the first AI model 310a is an input for a subsequent AI model as per the sequence of the plurality of AI models 310. The output of each AI model in the plurality of AI models 310 is an input for the subsequent AI model. Each subsequent AI model in the sequence takes the output from the preceding model as its input. A second AI model 310b processes the features extracted by the first AI model 310a. A third AI model (not shown in the figure) uses the output of the second AI model 310b, and so on. By structuring the plurality of AI models 310 in this way, each AI model focuses on refining the information it receives, allowing for more complex analyses and improved predictions as the data is transformed through the sequence.
[0041] Upon the sequential training of the AI models, the final AI model produces a final output. The final output represents the culmination of the data processing and transformations applied throughout the entire sequence. Upon training the plurality of AI models 310 in the sequential order, the comparing unit 230 is configured to compare the final output generated by the final AI model as per the sequence of the plurality of AI models 310 with a task specific objective function. The task specific objective function serves as a benchmark or criterion that defines the desired outcome for the specific task. The task specific objective function corresponds to a loss function if the task is a classification task or a regression task. The task specific objective function quantifies how well the final output meets the criterion. In an exemplary embodiment, the task specific objective function might measure accuracy in a classification task, error in a regression task, or relevancy in a natural language processing task. By comparing the final output to the task specific objective function, the comparing unit 230 assesses the effectiveness of the entire model sequence. The comparing unit 230 further helps to determine if the plurality of AI models 310 have successfully learned from the data and are producing useful, accurate results.
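For illustration, the comparison against a task specific objective function can be sketched with mean squared error, a common loss for regression tasks. The function name and the sample values are assumptions made for the example.

```python
# A sketch of quantifying how far a final model output is from the
# desired output, using mean squared error as the objective function.

def mean_squared_error(predicted, target):
    """Average of squared differences between prediction and target."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

# Example: compare a final output against the desired target values.
loss = mean_squared_error([2.5, 0.0], [3.0, -0.5])
```

A lower loss value indicates that the final output of the model sequence better satisfies the task specific criterion.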
[0042] Upon comparing the final output generated by the final AI model with the task specific objective function, the updating unit 235 is configured to update one or more parameters of each AI model of the plurality of AI models 310 to minimize the task specific objective function. In an embodiment, the one or more parameters are updated via gradient descent. The final model output is compared to the desired output using the task-specific objective function (loss function). The gradients of the loss function with respect to the one or more parameters are computed using techniques like backpropagation. The gradients indicate the direction and rate of change needed to minimize the loss. The one or more parameters are then updated using gradient descent, which involves adjusting each parameter in the direction opposite to the gradient by a certain learning rate. The updating process is repeated iteratively, continually refining the one or more parameters until the loss function converges to an acceptable level.
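A single gradient-descent update as described above may be sketched as follows. The parameter values, gradients, and learning rate are hypothetical; in practice the gradients would come from backpropagation through the model sequence.

```python
# One gradient-descent step: each parameter moves in the direction
# opposite to its gradient, scaled by the learning rate.

def gradient_descent_step(params, grads, learning_rate=0.1):
    """Return parameters updated to reduce the loss function."""
    return [p - learning_rate * g for p, g in zip(params, grads)]

params = [1.0, -2.0]
grads = [0.5, -1.0]        # hypothetical gradients from backpropagation
updated = gradient_descent_step(params, grads)
```

Repeating this step over many iterations is what "continually refining the one or more parameters until the loss function converges" amounts to in practice.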
[0043] Upon updating the one or more parameters of each AI model of the plurality of AI models 310 to minimize the task specific objective function, the system 120 is further configured to validate the performance of the trained plurality of AI models 310 utilizing a validation dataset. The validation dataset is unseen data received from the one or more data sources. Upon validation, the plurality of AI models 310, as per the sequence, is deployed for performing the task on new data. The sequence of the plurality of AI models 310 is determined based on the performance of each AI model during validation. Performing the task on new data enhances the reliability and robustness of the plurality of AI models 310.
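The hold-out of unseen data for validation can be sketched as below. The 80/20 split ratio and the helper name are assumptions for illustration; the described system may obtain its validation dataset in any manner from the one or more data sources.

```python
# A minimal hold-out split: reserve a fraction of records as unseen
# validation data for assessing the trained model sequence.

def train_validation_split(records, validation_fraction=0.2):
    """Split records into a training portion and an unseen validation portion."""
    cut = int(len(records) * (1 - validation_fraction))
    return records[:cut], records[cut:]

train, validation = train_validation_split(list(range(10)))
```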
[0044] FIG. 3 is a block diagram of an architecture 300 that can be implemented in the system of FIG.2, according to one or more embodiments of the present disclosure. The architecture 300 of the system 120 includes a data collection and integration unit 305, the plurality of AI models 310, a workflow manager 315, the database 240, and the UI 215.
[0045] Initially, the user transmits the request to the data collection and integration unit 305 for collecting the data. The data collection and integration unit 305 is configured to collect and integrate the data from the one or more data sources. In an embodiment, the one or more data sources are at least one of the Network Management System (NMS) and the Fulfilment Management System (FMS). The type of the data received from the one or more data sources is one of the text type data, the image type data, and the numerical type data. In another embodiment, the system 120 is configured to analyze various data inputs. The data includes, but is not limited to, a file input, a source path, an input stream, a Hypertext Transfer Protocol (HTTP/2) input, a Hadoop Distributed File System (HDFS), a Network Data Analytics Function (NWDAF), the NMS, servers, and Network Attached Storage (NAS).
[0046] Upon collection and integration of the data from the one or more data sources, the processor 205, which includes the plurality of AI models 310, is configured to train the plurality of AI models 310 arranged in the sequential order. The first AI model 310a extracts the one or more features of the received data. The first AI model 310a analyzes the received data to identify and extract the one or more features. The one or more features extracted by the first AI model 310a become the output that serves as input for the next model in the sequence; the output is also fed back to the same model as feedback and to the next model in the sequence as input. This creates the flow of information where each model builds upon the previous one's findings.
[0047] As per the above embodiment, the output of the first AI model 310a is the input for the subsequent AI model as per the sequence of the plurality of AI models 310. The output of each AI model in the plurality of AI models 310 is the input for the subsequent AI model. Each subsequent AI model in the sequence takes the output from the preceding model as its input. The second AI model 310b processes the one or more features extracted by the first AI model 310a. The third AI model uses the output of the second AI model 310b, and so on. By structuring the plurality of AI models 310 in this way, each AI model focuses on refining the information it receives, allowing for more complex analyses and improved predictions as the data is transformed through the sequence.
[0048] The system 120 may further include the UI 215 to interact with the user and receive instructions, and the workflow manager 315 configured to relay information to the user via the UI 215. The system 120 is further configured to interact with the database 240 in the network 105. The data is collected from the database 240 or the server 115 across the network 105 for further preprocessing, such as data definition, data normalization, or data cleaning. The UI 215 sends requests to the workflow manager 315, which in turn sends the output back to the system 120. The system 120 is further connected to an Integrated Performance Management (IPM) that helps system operators interrelate a set of activities, connecting the metrics, processes, and systems used to monitor and manage business performance while continuously monitoring the performance counters and Key Performance Indicators (KPIs) of the network elements.
[0049] The database 240 is configured to store past data, dynamic data, and trained models for future use. The system 120 may also allow operators to initiate the data cleaning and normalization process manually if required, and the system 120 transmits generated notifications or reports to operators (configurable) via the UI 215.
[0050] The present system 120 is configured to interact with the application servers and the Integrated Performance Management (IPM) in the network 105 via an Application Programming Interface (API) as the medium of communication, and may perform the process using various formats such as JavaScript Object Notation (JSON), Python, or any other compatible format.
[0051] FIG. 4 is a flow chart illustrating a method 400 for training of the plurality of AI models 310 arranged in the sequential order, according to one or more embodiments of the present disclosure.
[0052] At step 405, the data collection and integration unit 305 is configured to receive data required to perform the task from one or more data sources. In an embodiment, the one or more data sources are at least one of the NMS and the FMS. The type of the data received from the one or more data sources is one of the text type data, the image type data, and the numerical type data. In an embodiment, the task is at least one of predicting future values in a time series and generating natural language responses.
[0053] At step 410, the training unit 225 is configured to train the plurality of AI models 310 in the sequential order upon receiving the collected and integrated data from the one or more data sources. The training unit 225 trains the plurality of AI models 310 in a specific sequence; each model is trained one after the other, rather than simultaneously. The output of one model may serve as input for the next, allowing for cumulative learning. Each of the plurality of AI models 310 is configured to learn from interactions and transformations as the data is utilized by each of the plurality of AI models 310. Each model is designed to learn from its interactions with the data, which can involve understanding patterns, relationships, and nuances within the data. The interactions refer to how each AI model engages with the data it receives. The data from the one or more data sources is analyzed to identify patterns or features, and the output generated based on the data may then inform subsequent models in the sequence. The outputs from previous models or validation data are utilized to adjust the model parameters and improve future performance.
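The one-after-the-other training described above can be sketched as a loop in which each model is fitted on the output of the previously trained model. The `MeanShiftModel` and its scikit-learn-style `fit`/`predict` interface are illustrative assumptions, not the disclosed models.

```python
class MeanShiftModel:
    """Toy model: learns the mean of its training inputs and outputs
    the inputs shifted by that mean (a placeholder for a real AI model)."""
    def fit(self, xs):
        self.mean = sum(xs) / len(xs)
        return self

    def predict(self, xs):
        return [x - self.mean for x in xs]


def train_sequentially(models, data):
    """Train each model in order, not simultaneously; the output of
    model k becomes the training data for model k + 1, allowing the
    cumulative learning described above."""
    current = data
    for model in models:
        model.fit(current)
        current = model.predict(current)
    return current  # output of the final model in the sequence


final_output = train_sequentially(
    [MeanShiftModel(), MeanShiftModel()], [1.0, 3.0, 5.0]
)
```

The first model centers the data; the second model then trains on that already-centered output, illustrating how each stage builds on the previous stage's result.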
[0054] At step 415, in an exemplary embodiment, the output of the first AI model 310a is the input for the subsequent AI model as per the sequence of the plurality of AI models 310. The output of each AI model in the plurality of AI models 310 is the input for the subsequent AI model; each subsequent AI model in the sequence takes the output from the preceding model as its input. The second AI model 310b processes the features extracted by the first AI model 310a, the third model uses the output of the second AI model 310b, and so on. By structuring the plurality of AI models 310 in this way, each AI model focuses on refining the information it receives, allowing for more complex analyses and improved predictions as the data is transformed through the sequence.
[0055] At step 420, each of the plurality of AI models 310 is configured to learn from interactions and transformations as the data is utilized. The transformations involve altering the data passing through each AI model: upon generating the output of each AI model, the data undergoes various transformations, which include feature extraction, normalization or scaling, dimensionality reduction, or other preprocessing techniques that prepare the data for the next model in the sequence. The feature extraction involves identifying and isolating relevant features from the raw data to simplify the input for subsequent models. The normalization or scaling refers to adjusting the data range or distribution to ensure consistency across the inputs. The continuous learning implies that each AI model adapts and improves over time based on the data it processes and the feedback it receives, which allows for a more robust learning framework. Owing to the continuous learning process, each of the plurality of AI models 310 refines its capabilities through iterative training. Together, interactions and transformations enable each AI model to learn effectively from the data, refining its outputs and improving overall performance as it processes information in a sequential manner.
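The transformations named above (feature extraction, normalization or scaling, and dimensionality reduction) can be illustrated with small stand-alone helpers. These helpers are illustrative assumptions; a production pipeline would typically rely on a library such as scikit-learn.

```python
def extract_features(records):
    """Feature extraction: reduce each raw numeric record to a pair of
    informative characteristics (its length and its mean)."""
    return [(len(r), sum(r) / len(r)) for r in records]


def reduce_dimensions(features):
    """Toy dimensionality reduction: collapse each feature tuple to a
    single summary value, focusing on fewer variables."""
    return [sum(f) / len(f) for f in features]


def normalize(values):
    """Min-max normalization: rescale values into the [0, 1] range so
    inputs are consistent for the next model in the sequence."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


raw = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
feats = extract_features(raw)       # per-record (length, mean) features
reduced = reduce_dimensions(feats)  # one summary value per record
scaled = normalize(reduced)         # consistent [0, 1] range for the next model
```

Each helper prepares the data for the next stage, mirroring how the transformations between AI models prepare the output of one model to serve as input for the next.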
[0056] At step 425, the comparing unit 230 is configured to compare the final output generated by the final AI model as per the sequence of the plurality of AI models 310 with the task specific objective function. The task specific objective function corresponds to a loss function if the task corresponds to one of classification and regression tasks. By comparing the final output to the task specific objective function, the comparing unit 230 assesses the effectiveness of the entire model sequence and helps to determine whether the AI models have successfully learned from the data and are producing useful, accurate results. The final model output is compared to the desired output using the task specific objective function (loss function). The updating process is repeated iteratively, continually refining the one or more parameters until the loss function converges to an acceptable level.
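For a classification task, the comparison described above can be sketched with a cross-entropy loss between the final model's predicted probabilities and the desired labels. The predictions and labels below are illustrative assumptions, not data from the disclosed system.

```python
import math


def cross_entropy_loss(predicted_probs, true_labels):
    """Average negative log-likelihood of the true class under the
    final model's predicted probabilities (lower is better)."""
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for p, y in zip(predicted_probs, true_labels):
        prob_of_true = p if y == 1 else 1.0 - p
        total -= math.log(max(prob_of_true, eps))
    return total / len(true_labels)


# Final outputs of the model sequence (probability of class 1) vs. labels.
loss = cross_entropy_loss([0.9, 0.2, 0.8], [1, 0, 1])
```

A loss near zero indicates the model sequence reproduces the desired output closely; a large loss signals that further parameter updates are needed before the function converges to an acceptable level.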
[0057] At step 430, upon updating the one or more parameters of each AI model of the plurality of AI models 310 to minimize the task specific objective function, the system 120 is further configured to validate the performance of the trained plurality of AI models 310 utilizing the validation dataset. The validation dataset is unseen data received from the one or more data sources. At step 435, upon validation, the plurality of AI models 310 as per the sequence is deployed for performing the task on new data. The sequence of the plurality of AI models 310 is determined based on the performance of each AI model during validation. Performing the task on new data enhances the reliability and robustness of the plurality of AI models 310.
[0058] FIG. 5 is a flow diagram illustrating the method 500 for training of the plurality of AI models 310 arranged in the sequential order, according to one or more embodiments of the present disclosure.
[0059] At step 505, the method 500 includes the step of receiving the data required to perform the task from the one or more data sources by the receiving unit 220. In an embodiment, the one or more data sources are at least one of the Network Management System (NMS) and the Fulfilment Management System (FMS). The type of the data received from the one or more data sources is one of the text type data, the image type data, and the numerical type data. In an embodiment, the task is at least one of predicting future values in a time series and generating natural language responses. In another embodiment, the task includes, but is not limited to, a classification task, a clustering task, and a regression task. Predicting future values in a time series involves forecasting trends or values based on historical data, while generating natural language responses entails creating responses or dialogues in natural language. The task can provide an effective way to process and understand the data.
[0060] At step 510, the method 500 includes the step of training the plurality of AI models 310 in the sequential order by the training unit 225. The training unit 225 trains the plurality of AI models 310 in a specific sequence; each model is trained one after the other, rather than simultaneously. The output of one model may serve as input for the next, allowing for cumulative learning. Each of the plurality of AI models 310 is configured to learn from interactions and transformations as the data is utilized by each of the plurality of AI models 310. Each model is designed to learn from its interactions with the data, which can involve understanding patterns, relationships, and nuances within the data. The interactions refer to how each AI model engages with the data it receives. The data from the one or more data sources is analyzed to identify patterns or features, and the output generated based on the data may then inform subsequent models in the sequence. The outputs from previous models or validation data are utilized to adjust the model parameters and improve future performance.
[0061] The transformations involve altering the data passing through each AI model: the data undergoes various transformations, which include feature extraction, normalization or scaling, dimensionality reduction, or other preprocessing techniques that prepare the data for the next model in the sequence. The feature extraction involves identifying and isolating relevant features from the raw data to simplify the input for subsequent models. The normalization or scaling refers to adjusting the data range or distribution to ensure consistency across the inputs. The dimensionality reduction refers to reducing the number of variables under consideration to focus on the significant features. The continuous learning implies that each AI model adapts and improves over time based on the data it processes and the feedback it receives, which allows for a more robust learning framework. Owing to the continuous learning process, each of the plurality of AI models 310 refines its capabilities through iterative training. Together, interactions and transformations enable each AI model to learn effectively from the data, refining its outputs and improving overall performance as it processes information in a sequential manner.
[0062] As per the above embodiment, the first AI model 310a in the plurality of AI models 310 extracts one or more features of the received data. The first AI model 310a analyzes the received data to identify and extract the one or more features, transforming the raw data into a set of characteristics that are more informative for the task at hand. In an exemplary embodiment, in image processing, the features extracted from the image type data include edges, colors, or textures, while in text analysis, the features extracted from the text type data involve keywords or sentiment indicators. The one or more features extracted by the first AI model 310a become the output that serves as input for the next model in the sequence. This creates a flow of information where each model builds upon the previous one's findings. An output of the first AI model 310a is an input for a subsequent AI model as per the sequence of the plurality of AI models 310, and the output of each AI model in the plurality of AI models 310 is an input for the subsequent AI model. Each subsequent AI model in the sequence takes the output from the preceding model as its input: the second AI model 310b processes the features extracted by the first AI model 310a, the third model uses the output of the second AI model 310b, and so on. By structuring the plurality of AI models 310 in this way, each AI model focuses on refining the information it receives, allowing for more complex analyses and improved predictions as the data is transformed through the sequence.
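A first-stage feature extractor for text type data, producing keyword counts and sentiment indicators as described above, might look like the following sketch. The sentiment word lists and the sample sentence are illustrative assumptions.

```python
# Hypothetical sentiment lexicon (an assumption for illustration only).
POSITIVE = {"good", "excellent", "stable"}
NEGATIVE = {"slow", "outage", "error"}


def extract_text_features(document):
    """Turn raw text into a compact feature dictionary (keyword and
    sentiment-indicator counts) that the next model in the sequence
    can consume as its input."""
    words = document.lower().split()
    return {
        "num_words": len(words),
        "positive_hits": sum(1 for w in words if w in POSITIVE),
        "negative_hits": sum(1 for w in words if w in NEGATIVE),
    }


features = extract_text_features(
    "Network outage resolved, service stable and good"
)
```

The resulting dictionary is far more informative for a downstream classifier than the raw string, which is the point of placing feature extraction first in the sequence.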
[0063] At step 515, the method 500 includes the step of comparing the final output generated by the final AI model as per the sequence of the plurality of AI models with the task specific objective function by the comparing unit 230. The task specific objective function serves as a benchmark or criterion that defines the desired outcome for the specific task. The task specific objective function corresponds to the loss function if the task corresponds to one of classification and regression tasks. The task specific objective function quantifies how well the final output meets the criterion. In an exemplary embodiment, the task specific objective function might measure accuracy in the classification task, error in the regression task, or relevancy in the natural language processing task. By comparing the final output to the task specific objective function, the comparing unit 230 assesses the effectiveness of the entire model sequence. The comparing unit 230 further helps to determine if the AI models have successfully learned from the data and are producing useful, accurate results.
[0064] At step 520, the method 500 includes the step of updating the one or more parameters of each AI model of the plurality of AI models 310 to minimize the task specific objective function by the updating unit 235. In an embodiment, the one or more parameters are updated via a technique such as gradient descent. The final model output is compared to the desired output using the task specific objective function (loss function). The gradients of the loss function with respect to the one or more parameters are computed using techniques like backpropagation. The gradients indicate the direction and rate of change needed to minimize the loss. The one or more parameters are then updated using gradient descent, which involves adjusting each parameter in the direction opposite to the gradient, scaled by a certain learning rate. The updating process is repeated iteratively, continually refining the one or more parameters until the loss function converges to an acceptable level.
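The update rule described above can be made concrete with a minimal gradient-descent sketch for a single-parameter linear model under a mean-squared-error loss. The model, learning rate, and data are illustrative assumptions; real systems would compute gradients via backpropagation in a framework such as PyTorch.

```python
def mse_loss(w, xs, ys):
    """Task specific objective (loss) function for a regression task."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grad(w, xs, ys):
    """Analytic gradient of the MSE loss with respect to parameter w;
    it gives the direction and rate of change of the loss."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, lr=0.1, steps=200):
    """Repeatedly adjust w in the direction opposite to the gradient,
    scaled by the learning rate, until the loss converges."""
    w = 0.0
    for _ in range(steps):
        w -= lr * grad(w, xs, ys)
    return w


# Data generated by y = 2x; the fitted parameter should approach 2.
w = gradient_descent([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Each iteration moves the parameter against the gradient, so the loss shrinks step by step, which is exactly the iterative refinement until convergence described in the paragraph above.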
[0065] Upon updating the one or more parameters of each AI model of the plurality of AI models 310 to minimize the task specific objective function, the method 500 further includes the step of validating the performance of the trained plurality of AI models 310 utilizing the validation dataset. The validation dataset is unseen data received from the one or more data sources. Upon validation, the plurality of AI models 310 as per the sequence is deployed for performing the task on new data. The sequence of the plurality of AI models 310 is determined based on the performance of each AI model during validation. Performing the task on new data enhances the reliability and robustness of the plurality of AI models 310.
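The validate-then-deploy gate described above can be sketched as follows. The accuracy metric, the 0.8 threshold, and the toy classifier standing in for the trained model sequence are all illustrative assumptions.

```python
def accuracy(model, inputs, labels):
    """Fraction of validation examples (unseen data) that the model
    predicts correctly."""
    correct = sum(1 for x, y in zip(inputs, labels) if model(x) == y)
    return correct / len(labels)


def validate_and_deploy(model, val_inputs, val_labels, threshold=0.8):
    """Deploy only if performance on the held-out validation set meets
    the threshold; return the decision and the measured score."""
    score = accuracy(model, val_inputs, val_labels)
    return score >= threshold, score


# Toy threshold classifier standing in for the trained model sequence.
model = lambda x: 1 if x > 0.5 else 0
ok, score = validate_and_deploy(model, [0.1, 0.6, 0.9, 0.4], [0, 1, 1, 0])
```

Because the validation data was never seen during training, the measured score estimates how the model sequence will behave on genuinely new data, which is what makes the deployment decision reliable.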
[0066] In another aspect of the embodiment, a non-transitory computer-readable medium having stored thereon computer-readable instructions that, when executed by a processor 205, cause the processor 205 to perform the following operations is disclosed. The processor 205 is configured to receive data required to perform a task from one or more data sources. The processor 205 is configured to train a plurality of AI models in the sequential order utilizing the received data. Each of the plurality of AI models is configured to learn from interactions and transformations as the data is utilized by each of the plurality of AI models. The processor 205 is configured to compare a final output generated by a final AI model as per the sequence of the plurality of AI models with a task specific objective function. The processor 205 is configured to update one or more parameters of each AI model of the plurality of AI models to minimize the task specific objective function.
[0067] A person of ordinary skill in the art will readily ascertain that the illustrated embodiments and steps in description and drawings (FIGS.1-5) are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[0068] The present disclosure provides a technical advancement for training AI models in a sequential manner through output-to-input pipelining. The plurality of AI models is trained by creating an unbroken chain of interconnected stages. The present invention enables the seamless transfer of information and updates between these stages, fostering efficiency and effectiveness in model training. The present invention improves the machine learning training process, offering a novel way to enhance model performance and streamline workflows across a broad spectrum of applications, and transforms the conventional linear training into a dynamic and interconnected process, unlocking new possibilities in the field of machine learning.
[0069] The present disclosure offers multiple advantages such as:
- Efficient integration of multimodal data: In real-world applications, the data is often received from one or more data sources in different modalities such as text, images, and audio. The trained AI model pipeline approach could seamlessly integrate these different types of data for more comprehensive analysis.
- End-to-End learning: The plurality of AI models can learn not only from the initial input data but also from the interactions between the models in the pipeline, which enables more holistic learning and better adaptation to the specific task.
- Effective Handling of sequential data: For the tasks that involve temporal or sequential data (e.g., time series prediction, natural language processing), the sequential data approach can provide a more natural and effective way to process and understand the data.
- Flexibility in model integration: Different types of models (e.g., CNNs, RNNs, transformers) can be integrated in a more flexible and coherent manner, which allows for a more versatile approach to solving complex problems.
- Continuous Learning and Adaptation: The trained AI model pipeline approach may facilitate continuous training and adaptation to changing data over time. This is critical for applications where the model needs to evolve and stay relevant in dynamic environments.
- Enhanced Model Accuracy and Performance: By allowing the plurality of AI models to learn from each other in a sequential manner, it may lead to improved accuracy and performance, especially in tasks that require complex reasoning or multi-step processing.
[0070] The present invention offers multiple advantages over the prior art and the above listed are a few examples to emphasize on some of the advantageous features. The listed advantages are to be read in a non-limiting manner.
REFERENCE NUMERALS
[0071] Environment - 100
[0072] Network-105
[0073] User equipment- 110
[0074] Server - 115
[0075] System -120
[0076] Processor - 205
[0077] Memory - 210
[0078] User interface-215
[0079] Receiving unit – 220
[0080] Training unit– 225
[0081] Comparing unit- 230
[0082] Updating unit– 235
[0083] Database- 240
[0084] Architecture- 300
[0085] Data collection and integration unit-305
[0086] Plurality of AI models- 310
[0087] First AI model- 310a
[0088] Second AI model- 310b
[0089] Number of AI models- 310n
[0090] Data preprocessing module-315
[0091] Workflow manager- 315
CLAIMS
We Claim:
1. A method (500) of training of a plurality of AI models (310) arranged in a sequential order, the method (500) comprising the steps of:
receiving, by one or more processors (205), data required to perform a task from one or more data sources;
training, by the one or more processors (205), a plurality of AI models in the sequential order utilizing the received data, wherein each of the plurality of AI models is configured to learn from interactions and transformations as the data is utilized by each of the plurality of AI models (310);
comparing, by the one or more processors (205), a final output generated by a final AI model as per the sequence of the plurality of AI models (310) with a task specific objective function; and
updating, by the one or more processors (205), one or more parameters of each AI model of the plurality of AI models (310) to minimize the task specific objective function.
2. The method (500) as claimed in claim 1, wherein the one or more data sources is at least one of a Network Management System (NMS) and a Fulfilment Management System (FMS), and wherein a type of the data is one of a text type data, an image type data, and a numerical type data.
3. The method (500) as claimed in claim 1, wherein a first AI model (310a) in the plurality of AI models (310) extracts one or more features of the received data, wherein an output of the first AI model (310a) is an input for a subsequent AI model as per the sequence of the plurality of AI models (310), and wherein an output of each AI model in the plurality of AI models (310) is an input for a subsequent AI model.
4. A system (120) for training of a plurality of AI models arranged in a sequential order, the system (120) comprising:
a receiving unit configured to receive data required to perform a task from one or more data sources;
a training unit configured to train a plurality of AI models in the sequential order utilizing the received data, wherein each of the plurality of AI models is configured to learn from interactions and transformations as the data is utilized by each of the plurality of AI models;
a comparing unit configured to compare a final output generated by a final AI model as per the sequence of the plurality of AI models with a task specific objective function; and
an updating unit configured to update one or more parameters of each AI model of the plurality of AI models to minimize the task specific objective function.
5. The system (120) as claimed in claim 4, wherein the one or more data sources is at least one of a Network Management System (NMS) and a Fulfilment Management System (FMS), and wherein a type of the data is one of a text type data, an image type data, and a numerical type data.
6. The system (120) as claimed in claim 4, wherein a first AI model (310a) in the plurality of AI models (310) extracts one or more features of the received data, wherein an output of the first AI model (310a) is an input for a subsequent AI model as per the sequence of the plurality of AI models (310), and wherein an output of each AI model in the plurality of AI models (310) is an input for a subsequent AI model.
| # | Name | Date |
|---|---|---|
| 1 | 202321067395-STATEMENT OF UNDERTAKING (FORM 3) [07-10-2023(online)].pdf | 2023-10-07 |
| 2 | 202321067395-PROVISIONAL SPECIFICATION [07-10-2023(online)].pdf | 2023-10-07 |
| 3 | 202321067395-FORM 1 [07-10-2023(online)].pdf | 2023-10-07 |
| 4 | 202321067395-FIGURE OF ABSTRACT [07-10-2023(online)].pdf | 2023-10-07 |
| 5 | 202321067395-DRAWINGS [07-10-2023(online)].pdf | 2023-10-07 |
| 6 | 202321067395-DECLARATION OF INVENTORSHIP (FORM 5) [07-10-2023(online)].pdf | 2023-10-07 |
| 7 | 202321067395-FORM-26 [27-11-2023(online)].pdf | 2023-11-27 |
| 8 | 202321067395-Proof of Right [12-02-2024(online)].pdf | 2024-02-12 |
| 9 | 202321067395-DRAWING [07-10-2024(online)].pdf | 2024-10-07 |
| 10 | 202321067395-COMPLETE SPECIFICATION [07-10-2024(online)].pdf | 2024-10-07 |
| 11 | Abstract.jpg | 2024-12-20 |
| 12 | 202321067395-Power of Attorney [24-01-2025(online)].pdf | 2025-01-24 |
| 13 | 202321067395-Form 1 (Submitted on date of filing) [24-01-2025(online)].pdf | 2025-01-24 |
| 14 | 202321067395-Covering Letter [24-01-2025(online)].pdf | 2025-01-24 |
| 15 | 202321067395-CERTIFIED COPIES TRANSMISSION TO IB [24-01-2025(online)].pdf | 2025-01-24 |
| 16 | 202321067395-FORM 3 [31-01-2025(online)].pdf | 2025-01-31 |