Abstract: Embodiments of the present disclosure relate to a system and method for orchestrating lifecycle stages of responsible machine learning. The system receives original data from one or more sources and executes operations corresponding to the lifecycle stages, where each stage depends on the outcomes of preceding stages and, in some cases, on feedback from succeeding stages, thereby ensuring compliance with privacy, accountability, safety, security, fairness, explainability, and reliability (PASSFER) standards. FIG. 1
DESC:EARLIEST PRIORITY DATE:
This Application claims priority from a provisional patent application filed in India having Patent Application No. 202441005046, filed on January 24, 2024, and titled “SYSTEM AND METHOD FOR HOLISTIC PRIVACY ENHANCED AND RESPONSIBLE AI PLATFORM”.
FIELD OF INVENTION
[0001] Embodiments of the present disclosure pertain to a system and a method for orchestrating the lifecycle stages of machine learning (ML) models. Specifically, embodiments of the present disclosure relate to enabling end-to-end coordination of lifecycle stages of ML models, including data collection, preprocessing, model training, assessment, inference, and governance, while ensuring compliance with responsible machine learning standards, including privacy, accountability, safety, security, fairness, explainability, and reliability at each stage.
BACKGROUND
[0002] Currently, the lifecycle stages of machine learning (ML) development are often assessed and handled individually, with no seamless exchange of outcomes between them. Each stage, including data collection, data preprocessing, model training, assessment, deployment (inference), and governance, typically operates in isolation, leading to a lack of continuous feedback that would inform earlier stages about potential issues discovered in later ones. For example, biases or fairness issues introduced during data preprocessing may not be detected until model assessment, and might not be addressed until after deployment. This fragmented approach creates inefficiencies and significantly impacts responsible ML. Without an integrated approach, it becomes difficult to ensure that the standards of responsible ML, such as privacy, accountability, safety, security, fairness, and reliability, are maintained at every stage. Problems like bias, security flaws, or regulatory non-compliance are often identified only late in the process, when they are more difficult and costly to address. This delay in addressing risks undermines the model’s overall ethical integrity, performance, and compliance with societal and regulatory standards.
BRIEF DESCRIPTION
[0003] In accordance with an embodiment of the present disclosure, a system is provided. The system includes a processor and a memory coupled to the processor. The memory stores instructions executable by the processor. The processor is to collect original data from one or more sources and perform operations corresponding to each stage of the lifecycle stages, ensuring that outcomes of each stage are compliant with one or more of privacy, accountability, safety, security, fairness, explainability, and reliability (PASSFER) standards of responsible machine learning, wherein stages subsequent to a starting stage of the lifecycle stages comprise, in sequence, a data pre-processing stage, a model training stage, a model assessment stage, and a model governance stage, wherein inputs to each stage subsequent to the starting stage comprise outcomes of respective preceding stages of the lifecycle stages, and wherein inputs to one or more of the lifecycle stages comprise outcomes of one or more respective succeeding stages.
[0004] In accordance with another embodiment of the present disclosure, a method is provided. The method includes collecting original data from one or more sources, performing operations corresponding to each stage of the lifecycle stages to ensure that outcomes of each stage comply with one or more of privacy, accountability, safety, security, fairness, explainability, and reliability (PASSFER) standards of responsible machine learning, wherein stages subsequent to a starting stage of the lifecycle stages comprise, in sequence, a data pre-processing stage, a model training stage, a model assessment stage, and a model governance stage, and providing, as inputs to each of the stages subsequent to the starting stage, outcomes of respective preceding stages of the lifecycle stages. Further, the method includes providing, as inputs to one or more of the lifecycle stages, outcomes of one or more respective succeeding stages of the stages.
[0005] To further clarify the advantages and features of the present disclosure, a more particular description of the disclosure will follow by reference to specific embodiments thereof, which are illustrated in the appended figures. It is to be appreciated that these figures depict only typical embodiments of the disclosure and are therefore not to be considered limiting in scope. The disclosure will be described and explained with additional specificity and detail with the appended figures.
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] The disclosure will be described and explained with additional specificity and detail with the accompanying figures in which:
[0007] FIG. 1 illustrates a schematic diagram of a system for orchestrating lifecycle stages of a responsible machine learning, in accordance with an example implementation of the present subject matter.
[0008] FIGS. 2-4 illustrate a method for orchestrating lifecycle stages of a responsible machine learning, in accordance with an example implementation of the present subject matter.
[0009] Further, those skilled in the art will appreciate that elements in the figures are illustrated for simplicity and may not have necessarily been drawn to scale. Furthermore, in terms of the construction of the device, one or more components of the device may have been represented in the figures by conventional symbols, and the figures may show only those specific details that are pertinent to understanding the embodiments of the present disclosure so as not to obscure the figures with details that will be readily apparent to those skilled in the art having the benefit of the description herein.
DETAILED DESCRIPTION
[0010] For the purpose of promoting an understanding of the principles of the disclosure, reference will now be made to the embodiment illustrated in the figures and specific language will be used to describe them. It will nevertheless be understood that no limitation of the scope of the disclosure is thereby intended. Such alterations and further modifications in the illustrated system, and such further applications of the principles of the disclosure as would normally occur to those skilled in the art are to be construed as being within the scope of the present disclosure.
[0011] The terms “comprises”, “comprising”, or any other variations thereof, are intended to cover a non-exclusive inclusion, such that a process or method that comprises a list of steps does not include only those steps but may include other steps not expressly listed or inherent to such a process or method. Similarly, one or more devices or subsystems or elements or structures or components preceded by "comprises... a" does not, without more constraints, preclude the existence of other devices, sub-systems, elements, structures, components, additional devices, additional sub-systems, additional elements, additional structures or additional components. Appearances of the phrases "in an embodiment", "in another embodiment" and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
[0012] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the art to which this disclosure belongs. The system, methods, and examples provided herein are only illustrative and not intended to be limiting.
[0013] In the following specification and the claims, reference will be made to a number of terms, which shall be defined to have the following meanings. The singular forms “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise.
[0014] In accordance with an embodiment of the present disclosure, a system is provided. The system includes a processor and a memory coupled to the processor. The memory stores instructions executable by the processor. The processor is to collect original data from one or more sources and perform operations corresponding to each stage of the lifecycle stages, ensuring that outcomes of each stage are compliant with one or more of privacy, accountability, safety, security, fairness, explainability, and reliability (PASSFER) standards of responsible machine learning, wherein stages subsequent to a starting stage of the lifecycle stages comprise, in sequence, a data pre-processing stage, a model training stage, a model assessment stage, and a model governance stage, wherein inputs to each stage subsequent to the starting stage comprise outcomes of respective preceding stages of the lifecycle stages, and wherein inputs to one or more of the lifecycle stages comprise outcomes of one or more respective succeeding stages.
[0015] FIG. 1 illustrates a schematic diagram of a system for orchestrating lifecycle stages of a responsible machine learning, in accordance with an example implementation of the present subject matter. Referring to FIG. 1, the system 100 may be in communication with one or more data sources 112. It may be noted here that one data source is shown in the figure; however, in implementation, the system 100 may be in communication with more than one data source. The communication between the data source 112 and the system 100 may be established via a communication network 114. The communication network 114 may be a wireless network, a wired network, or a combination thereof. Examples of such individual communication networks include, but are not limited to, a Global System for Mobile Communication (GSM) network, a Universal Mobile Telecommunications System (UMTS) network, a Personal Communications Service (PCS) network, a Time Division Multiple Access (TDMA) network, a Code Division Multiple Access (CDMA) network, a Next Generation Network (NGN), and a Public Switched Telephone Network (PSTN). Depending on the technology, the communication network 114 may include various network entities, such as gateways and routers; however, such details have been omitted for the sake of brevity of the present description.
[0016] It may be noted that the foregoing system is an exemplary system and may be implemented as computer executable instructions in any computing or processing environment, including in digital electronic circuitry or in computer hardware, firmware, device driver, or software. As such, the system is not limited to any specific hardware or software configuration.
[0017] The system 100 may include one or more computing devices, such as one or more servers (e.g., in a cloud deployment or in a data center), one or more personal computers, and/or the like.
[0018] Further, the system 100 may include a processor(s) 102, and a memory(s) 104 coupled to and accessible by the processor(s) 102. The processor(s) may fetch and execute the computer readable instructions stored in the memory(s) 104 to facilitate orchestration of lifecycle stages of responsible ML amongst other functions. The functions of various elements shown in the figures, including any functional blocks labelled as "processor(s)", may be provided through the use of dedicated hardware as well as hardware capable of executing instructions. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term "processor" should not be construed to refer exclusively to hardware capable of executing instructions, and may implicitly comprise, without limitation, digital signal processor (DSP) hardware, a network processor, an application-specific integrated circuit (ASIC), and a field-programmable gate array (FPGA). Other hardware, standard and/or custom, may also be coupled to the processor(s) 102.
[0019] The memory(s) 104 may be a computer-readable medium, examples of which comprise volatile memory (e.g., RAM) and/or non-volatile memory (e.g., Erasable Programmable Read-Only Memory, i.e., EPROM, flash memory, etc.). The memory(s) 104 may be an external memory or an internal memory, such as a flash drive, a compact disk drive, an external hard disk drive, or the like. The system 100 may have an interface 110 enabling the system 100 to be coupled with one or more other devices, through a wired (e.g., Local Area Network, i.e., LAN) connection or through a wireless connection (e.g., Bluetooth®, WiFi), for example, for connecting to the data sources 112. The interface 110 may also enable intercommunication between different logical as well as hardware components of the system 100.
[0020] Further, the system 100 may include module(s) 106. The module(s) 106 may include a data collection module 106-1, a PASSFER risk assessment module 106-2, a data pre-processing module 106-3, a model training module 106-4, a model assessment module 106-5, a model inference module 106-6, and a model governance module 106-7. In one example, the module(s) 106 may be implemented as a combination of hardware and firmware. In an example described herein, such combinations of hardware and firmware may be implemented in several different ways. For example, the firmware for the module(s) 106 may be processor 102 executable instructions stored on a non-transitory machine-readable storage medium, and the hardware for the module(s) 106 may include a processing resource (for example, implemented as either a single processor or a combination of multiple processors) to execute such instructions.
[0021] In the present examples, the non-transitory machine-readable storage medium may store instructions that, when executed by the processing resource, implement the functionalities of modules(s) 106. In such examples, the system 100 may include the machine-readable storage medium storing the instructions and the processing resource to execute the instructions. In other examples of the present subject matter, the machine-readable storage medium may be located at a different location but accessible to the system 100 and the processor(s) 102.
[0022] The system 100 may include data 108. The data 108 may include data that is either stored or generated as a result of functions implemented by any of the module(s) 106 or the system 100. It may be further noted that information stored and available in the data 108 may be utilized by the module(s) 106 for performing various functions of the system 100. In an example, the data 108 may include benchmark data 108-1, PASSFER risk assessment metrics 108-2, acceptable levels of risk 108-3, mappings 108-4, and other data 108-5. It may be noted that such examples of the various functions are only indicative. The present approaches may be applicable to other examples without deviating from the scope of the present subject matter.
[0023] The lifecycle of a responsible ML model may include various stages, including a PASSFER risk assessment stage, followed by a data pre-processing stage, a model training stage, a model assessment stage, a model inference stage, and a model governance stage, in sequence. In the present subject matter, every stage is operationally dependent on one or more respective preceding stages. Further, some stages of the lifecycle stages are operationally dependent on one or more respective succeeding stages. Operational dependency, as used herein, refers to operations where the outcomes or results produced by one stage are utilized, either partially or entirely, as additional inputs or sole inputs for generating the outcomes or results of another stage. This dependency ensures that the execution and outputs of a dependent stage are directly influenced or determined by the outputs of one or more preceding stages. Additionally, operational dependency may extend to scenarios where feedback from subsequent stages is used to refine or improve the outcomes of earlier stages, forming a cyclical or iterative process.
[0024] In performing operations corresponding to respective stages of the lifecycle of the responsible ML, the data collection module 106-1 may be configured to receive original data from the data sources 112. It may be noted here that to receive the original data may include a user entering the original data to the system 100 or the system 100 retrieving the original data from the data sources 112 upon such instruction provided by the user via user interface.
[0025] In performing operations corresponding to the starting stage, which is the PASSFER risk assessment stage, the PASSFER risk assessment module 106-2 may be configured to process the original data upon receipt, wherein the processing is to identify PASSFER risks associated with the original data based on pre-stored benchmark data, wherein the benchmark data comprises legal and regulatory requirements, and cross-border requirements. The legal and regulatory requirements and cross-border requirements may include, but are not limited to, the National Institute of Standards and Technology (NIST) Responsible AI framework, the Linking, Identifying, Non-repudiation, Detecting, Data Disclosure, Unawareness, and Non-compliance (LINDDUN) framework, and regulation mappings such as the European Union (EU) AI Act, the Health Insurance Portability and Accountability Act (HIPAA), and The Digital Personal Data Protection Bill (DPDP). This benchmark data may be pre-stored on the system 100. In an example, to identify the PASSFER risks associated with the original data based on the pre-stored benchmark data, the PASSFER risk assessment module 106-2 may be configured to utilize mappings pre-stored on the system 100, where the mappings include mapping between the PASSFER risks and the legal and regulatory requirements, and cross-border requirements.
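By way of a non-limiting illustration, the mapping-based risk identification described above may be sketched as follows; the regulation names, risk categories, and function shown are illustrative assumptions and do not limit the described system:

```python
# Illustrative sketch of pre-stored mappings between regulatory
# requirements and the PASSFER risk categories they implicate.
# All entries and names here are hypothetical examples.
REGULATION_TO_RISKS = {
    "HIPAA": {"privacy", "security", "accountability"},
    "EU AI Act": {"fairness", "explainability", "accountability", "safety"},
    "DPDP": {"privacy", "accountability"},
}

def identify_risks(applicable_regulations):
    """Union the PASSFER risk categories implicated by the legal,
    regulatory, and cross-border requirements applicable to the data."""
    risks = set()
    for regulation in applicable_regulations:
        risks |= REGULATION_TO_RISKS.get(regulation, set())
    return risks
```

In such a sketch, the intended usage supplied by the user would determine which regulations are passed in, and the resulting risk set would drive metric generation.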
[0026] Further, the PASSFER risk assessment module 106-2 may be configured to generate, as outcomes, one or more PASSFER risk metrics, each corresponding to the identified PASSFER risks, and one or more recommendations based on the processing, the intended usage associated with the original data provided by the user, and feedback conditionally received from the model assessment stage, wherein the one or more recommendations comprise one or more PASSFER enhancing techniques to mitigate the identified PASSFER risks. The feedback as indicated herein may be received subsequent to initial performance of the operations corresponding to the PASSFER risk assessment stage. The one or more PASSFER enhancing techniques may include, but are not limited to, encryption techniques, redaction techniques, etc. To identify the PASSFER risks, in an example, the PASSFER risk assessment module 106-2 may utilize the mappings, where the mappings include mapping between the PASSFER risks, the intended usage, and the legal and regulatory requirements, and cross-border requirements.
[0027] It may be noted here that the intended usage may be provided by the user via the interface during the data collection or during the present stage. Further, the intended usage may be provided by the user by selecting an element from pre-defined elements associated with intended usages displayed on the GUI of the system 100, or the user may enter the intended usage by providing a descriptive response to a query from the system 100. This application is not limited thereto.
[0028] Further, to generate one or more PASSFER risk metrics, the PASSFER risk assessment module 106-2 may utilize the pre-stored mappings, where mappings include mapping between the PASSFER risks, intended usage and the legal and regulatory requirements, and cross-border requirements.
[0029] Further, the PASSFER risk assessment module 106-2 may be configured to enable the user to determine acceptable level of PASSFER risks based on the intended usage and the benchmark data.
[0030] In performing operations corresponding to the data pre-processing stage, the data pre-processing module 106-3 may be configured to receive a PASSFER budget from the user, wherein the PASSFER budget is determined based on the identified PASSFER risks, the one or more PASSFER risk metrics, the intended usage, the benchmark data, and the one or more recommendations from the starting stage.
[0031] The PASSFER budget may include, but is not limited to, k-anonymity; for example, in a healthcare dataset, k-anonymity may aggregate patient data to ensure that at least k=10 individuals share the same combination of attributes, such as age group, gender, and diagnosis, thereby preventing re-identification. In another example, the PASSFER budget may include, but is not limited to, t-closeness, which may ensure that the proportion of patients with a specific condition in any group remains representative of the dataset. In another example, the PASSFER budget may include, but is not limited to, ε-differential privacy; in healthcare data processing, ε-differential privacy may be used to ensure strong privacy guarantees by adding noise to sensitive aggregate statistics, such as disease prevalence.
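As a non-limiting sketch of the privacy-budget concepts above (the record layout and function names are illustrative assumptions), a k-anonymity check and an ε-differential-privacy noise mechanism may be expressed as:

```python
import random
from collections import Counter

def satisfies_k_anonymity(records, quasi_identifiers, k):
    """True when every combination of quasi-identifier values (e.g., age
    group, gender, diagnosis) is shared by at least k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

def laplace_noisy_count(true_count, epsilon):
    """Add Laplace noise with scale 1/epsilon to a count query; the
    difference of two Exp(epsilon) draws is Laplace(0, 1/epsilon)."""
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise
```

A smaller epsilon yields larger noise and hence a stronger privacy guarantee, consistent with treating epsilon as part of the budget.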
[0032] The data pre-processing module 106-3 may be configured to process the original data to identify portions of the original data causing the identified PASSFER risks. For example, privacy risk from the PASSFER risks may involve, but is not limited to, personally identifiable information (PII), for example, name, age, and gender, and non-PII data, for example, a broader group of diseases. To identify the portions of the original data causing the identified PASSFER risks, the data pre-processing module 106-3 may utilize one or more Natural Language Processing algorithms, such as Named Entity Recognition algorithms, tree-based search methods, and rule-based systems.
[0033] Further, the data pre-processing module 106-3 may be configured to generate PASSFER risk mitigated data upon application of the one or more PASSFER enhancing techniques in the one or more recommendations to the original data in accordance with the PASSFER budget. Further, the data pre-processing module 106-3 may be configured to obtain optimized data upon iteratively subjecting the PASSFER risk mitigated data generated in each iteration of the data pre-processing stage, as the original data, to the operations corresponding to the starting stage until each PASSFER risk metric of the one or more PASSFER risk metrics satisfies the acceptable level of PASSFER risks.
[0034] Furthermore, the data pre-processing module 106-3 may be configured to generate proofs demonstrating that the training data is in compliance with the standards of the responsible machine learning. The proofs may include, but are not limited to, mathematical proofs for k-anonymity or t-closeness with graphs, and variation in the PASSFER risk metrics.
[0035] In performing operations corresponding to the model training stage, the model training module 106-4 may be configured to selectively train one or more ML models utilizing the optimized data as training data and generate proofs demonstrating that the training is in compliance with the standards of the responsible machine learning. The proofs may include, but are not limited to, differentially private algorithms for ML model training, Federated Learning.
[0036] In performing operations corresponding to the model assessment stage, the model assessment module 106-5 may be configured to subject the one or more ML models upon training to one or more PASSFER risk simulations and generate a corresponding assessment report indicating PASSFER risk levels for each ML model of the one or more ML models.
[0037] Further, the model assessment module 106-5 may be configured to compare the PASSFER risk levels, each with the acceptable level of PASSFER risks. The model assessment module 106-5 may be configured to perform at least one of the following based on the comparison: conditionally flag at least one ML model of the one or more ML models; conditionally generate feedback for the data pre-processing stage, wherein the feedback is to recommend fine-tuning of the one or more PASSFER enhancing techniques; and, upon flagging, subject the one or more ML models to the model training stage for the selective training utilizing the fine-tuned one or more PASSFER enhancing techniques.
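By way of a non-limiting sketch (the numeric risk levels and the feedback format are illustrative assumptions), the comparison-and-flagging logic may resemble:

```python
def assess_models(model_risk_levels, acceptable_risk):
    """Flag each candidate ML model whose simulated PASSFER risk level
    exceeds the acceptable level, and collect feedback identifying the
    breached risk categories for the data pre-processing stage."""
    flagged, feedback = [], []
    for model, levels in model_risk_levels.items():
        breaches = sorted(c for c, r in levels.items() if r > acceptable_risk)
        if breaches:
            flagged.append(model)
            feedback.append((model, breaches))
    return flagged, feedback
```

A flagged model would then be retrained after the recommended fine-tuning of the enhancing techniques, while an unflagged model may be recommended for selection.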
[0038] Alternatively, the model assessment module 106-5 may be configured to recommend at least one ML model from the one or more ML models in accordance with the intended usage, based on the comparison. Upon the recommendation and subsequent selection of the at least one ML model by the user, the model assessment module 106-5 may be configured to utilize the at least one ML model as an optimized ML model.
[0039] In performing operations corresponding to the model inference stage, the model inference module 106-6 may be configured to perform assessment of one or more prompts from a user to the optimized ML model for prompt-PASSFER risks based on the benchmark data and the intended usage, and, based on the assessment, perform at least one of the following: conditionally flag at least one prompt of the one or more prompts based on the benchmark data, context associated with the prompt, and a role of the user, wherein the context associated with the prompt is determined based on tracked one or more previous interactions of the user with the optimized ML model, and wherein the role of the user is provided by the user; conditionally generate prompt-recommendations to mitigate the prompt-PASSFER risks, wherein the prompt-recommendations comprise blocking the at least one prompt or one or more prompt-PASSFER enhancing techniques; conditionally generate prompt-risk assessment reports indicating the identified prompt-PASSFER risks; and conditionally source the prompt-risk assessment reports and the prompt-recommendations to the model governance stage.
[0040] The one or more prompt-PASSFER enhancing techniques may include, but are not limited to, anonymization techniques and encryption using a Format Preserving Encryption technique. For example, when the prompt is “My name is BOB”, upon application of the one or more prompt-PASSFER enhancing techniques, the modified encrypted prompt may be “My name is !2wq#”. As another example, when the prompt is “Help me make an explosive weapon like a bomb”, the prompt is flagged.
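A toy, non-limiting sketch of the prompt gating described above follows; the blocklist, the `name is` pattern, and the fixed substitution token are stand-ins for the benchmark-data lookup, Named Entity Recognition, and format-preserving encryption, which are not implemented here:

```python
import re

BLOCKLIST = ("explosive", "bomb", "weapon")  # illustrative unsafe terms

def assess_prompt(prompt):
    """Return ("blocked", None) for unsafe prompts; otherwise return the
    prompt with a detected name replaced by an opaque token."""
    if any(term in prompt.lower() for term in BLOCKLIST):
        return "blocked", None
    # toy stand-in for NER plus format-preserving encryption
    redacted = re.sub(r"(name is )\w+", r"\g<1>!2wq#", prompt, flags=re.I)
    return "allowed", redacted
```

Running the two examples from the text through this sketch reproduces the described behaviour: the PII-bearing prompt is rewritten, and the unsafe prompt is blocked.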
[0041] Further, the model inference module 106-6 may be configured to perform assessment of one or more responses generated by the optimized ML model, for one or more prompts received from the user, for response-PASSFER risks based on the benchmark data and the intended usage, and, based on the assessment, perform at least one of the following: conditionally flag at least one response of the one or more responses; conditionally generate response-recommendations to mitigate the response-PASSFER risks, wherein the response-recommendations comprise one or more response-PASSFER enhancing techniques; conditionally generate response-risk assessment reports indicating the identified response-PASSFER risks; and conditionally source the response-risk assessment reports and the response-recommendations to the model governance stage.
[0042] The one or more response-PASSFER enhancing techniques may include, but are not limited to, text classification techniques in NLP, embedding classification, and Named Entity Recognition algorithms.
[0043] An example of conditional flagging of a response is when the response is “Bob is working in Google”; the response is flagged as it includes privacy data. As another example, when the response is “Women should always be in kitchen”, the response is flagged as it is biased towards a particular gender.
[0044] In performing operations corresponding to the model governance stage, the model governance module 106-7 may be configured to continuously monitor each stage of the lifecycle stages, provide a dashboard, generate reports based on overall PASSFER risk, where the overall PASSFER risk is determined based on the PASSFER risks identified and mitigated at each stage of the lifecycle stages, and cause the reports to be displayed on the dashboard.
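As an illustrative, non-limiting sketch of the overall-risk determination (taking the maximum residual risk per category across stages is an assumption; the text does not fix the aggregation rule):

```python
def overall_passfer_risk(stage_reports):
    """Aggregate per-stage PASSFER risk reports into an overall risk per
    category, here the maximum residual risk observed across stages."""
    overall = {}
    for report in stage_reports.values():
        for category, residual in report.items():
            overall[category] = max(overall.get(category, 0.0), residual)
    return overall
```

The aggregated per-category risks could then back the dashboard reports described above.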
[0045] In accordance with another embodiment of the present disclosure, a method is provided. The method includes collecting original data from one or more sources, performing operations corresponding to each stage of the lifecycle stages to ensure that outcomes of each stage comply with one or more of privacy, accountability, safety, security, fairness, explainability, and reliability (PASSFER) standards of responsible machine learning, wherein stages subsequent to a starting stage of the lifecycle stages comprise, in sequence, a data pre-processing stage, a model training stage, a model assessment stage, and a model governance stage, and providing, as inputs to each of the stages subsequent to the starting stage, outcomes of respective preceding stages of the lifecycle stages. Further, the method includes providing, as inputs to one or more of the lifecycle stages, outcomes of one or more respective succeeding stages of the stages.
[0046] FIGS. 2-4 illustrate a method for orchestrating lifecycle stages of a responsible machine learning, in accordance with an example implementation of the present subject matter. Although the methods 200-400 may be implemented in a variety of devices, for ease of explanation, the description of the methods 200-400 is provided in reference to the above-described system 100. The order in which the methods 200-400 are described is not intended to be construed as a limitation, and any number of the described method blocks may be combined in any order to implement the methods 200-400, or an alternative method. It may be understood that blocks of the methods 200-400 may be performed in the system 100. The blocks of the methods 200-400 may be executed based on instructions stored in a non-transitory computer-readable medium, as will be readily understood. The non-transitory computer-readable medium may comprise, for example, digital memories, magnetic storage media, such as magnetic disks and magnetic tapes, hard drives, or optically readable digital data storage media.
[0047] At block 202, original data from one or more sources may be received.
[0048] At block 204, operations corresponding to each stage of the lifecycle stages may be performed, ensuring that outcomes of each stage are compliant with one or more of privacy, accountability, safety, security, fairness, explainability, and reliability (PASSFER) standards of responsible machine learning, wherein stages subsequent to a starting stage of the lifecycle stages comprise, in sequence, a data pre-processing stage, a model training stage, a model assessment stage, and a model governance stage.
[0049] At block 206, the original data may be processed to identify PASSFER risks associated with the original data based on pre-stored benchmark data, wherein the benchmark data comprises legal and regulatory requirements, and cross-border requirements.
[0050] At block 208, one or more PASSFER risk metrics may be generated as outcomes, each corresponding to the identified PASSFER risks.
[0051] Further, at block 208, one or more recommendations may be generated as outcomes based on the processing, the intended usage associated with the original data provided by the user, and feedback conditionally received from the model assessment stage, wherein the one or more recommendations comprise one or more PASSFER enhancing techniques to mitigate the identified PASSFER risks.
[0052] At block 210, an acceptable level of PASSFER risks may be received from the user, wherein the acceptable level of PASSFER risks is determined by the user based on the intended usage and the benchmark data.
[0053] At block 212, a PASSFER budget may be received from the user, wherein the PASSFER budget is determined based on the identified PASSFER risks, the one or more PASSFER risk metrics, the intended usage, the benchmark data, and the one or more recommendations from the starting stage.
[0054] At block 214, the original data may be processed to identify portions of the original data causing the identified PASSFER risks.
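By way of illustration only, the risk-assessment operations of blocks 202-214 may be sketched as follows. This is a minimal sketch, not part of the claimed subject matter: the benchmark rules, metric names, and mitigation catalogue below are hypothetical stand-ins for the pre-stored benchmark data and PASSFER enhancing techniques described above.

```python
# Illustrative sketch of the PASSFER risk assessment stage (blocks 202-214).
# Rule names, metrics, and the technique catalogue are assumptions, not
# part of the claimed subject matter.

def assess_passfer_risks(records, benchmark_rules):
    """Scan each record against benchmark rules; return one risk metric per
    identified risk and the indices of the offending records (block 214)."""
    metrics, flagged = {}, {}
    for rule_name, predicate in benchmark_rules.items():
        hits = [i for i, rec in enumerate(records) if predicate(rec)]
        if hits:
            metrics[rule_name] = len(hits) / len(records)  # fraction of data at risk
            flagged[rule_name] = hits
    return metrics, flagged

def recommend_techniques(metrics):
    """Map each identified risk to a hypothetical PASSFER enhancing technique."""
    catalogue = {"pii_exposure": "anonymization", "group_imbalance": "re-sampling"}
    return {risk: catalogue.get(risk, "manual review") for risk in metrics}

# Toy original data and two illustrative benchmark rules (privacy, fairness).
records = [{"email": "a@x.com", "group": "A"}, {"email": None, "group": "A"},
           {"email": None, "group": "A"}, {"email": None, "group": "B"}]
rules = {
    "pii_exposure": lambda r: r["email"] is not None,   # privacy rule
    "group_imbalance": lambda r: r["group"] == "A",     # fairness proxy
}
metrics, flagged = assess_passfer_risks(records, rules)
recs = recommend_techniques(metrics)
```

In this toy run, a quarter of the records expose PII and three quarters belong to one group, so the stage would emit two risk metrics and recommend anonymization and re-sampling, respectively, as the corresponding enhancing techniques.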
[0055] At block 302, PASSFER risk mitigated data may be generated upon application of the one or more PASSFER enhancing techniques in the one or more recommendations to the original data in accordance with the PASSFER budget.
[0056] At block 304, optimized data may be obtained upon iteratively subjecting the PASSFER risk mitigated data generated in each iteration of the data pre-processing stage, as the original data, to the operations corresponding to the starting stage until each PASSFER risk metric of the one or more PASSFER risk metrics satisfies the acceptable level of PASSFER risks.
[0057] At block 306, proofs demonstrating that the training data is in compliance with the standards of the responsible machine learning may be generated.
[0058] At block 308, one or more ML models utilizing the optimized data as training data may be selectively trained and proofs demonstrating that the training is in compliance with the standards of the responsible machine learning may be generated.
[0059] At block 310, the one or more ML models may be subjected to one or more PASSFER risk simulations.
[0060] At block 312, a corresponding assessment report indicating PASSFER risk levels for each ML model of the one or more ML models may be generated.
[0061] At block 314, the PASSFER risk levels may be compared with the acceptable level of PASSFER risks.
[0062] At block 316, based on the comparison, at least one of the following may be performed: conditionally flagging at least one ML model of the one or more ML models; conditionally generating feedback for the data pre-processing stage, wherein the feedback is to recommend fine-tuning of the one or more PASSFER enhancing techniques; and, upon flagging, subjecting the one or more ML models to the model training stage for the selective training utilizing the fine-tuned one or more PASSFER enhancing techniques. Alternatively, at block 316, based on the comparison, at least one ML model from the one or more ML models may be recommended in accordance with the intended usage, and, upon the recommendation and subsequent selection of the at least one ML model by the user, the at least one ML model may be utilized as an optimized ML model.
[0063] At block 402, assessment of one or more prompts from a user to the optimized ML model for prompt-PASSFER risks based on the benchmark data and the intended usage may be performed.
[0064] At block 404, based on the assessment, at least one of the following may be performed: conditionally flagging at least one prompt of the one or more prompts based on the benchmark data, a context associated with the prompt, and a role of the user, wherein the context associated with the prompt is determined based on one or more tracked previous interactions of the user with the optimized ML model, and wherein the role of the user is provided by the user; conditionally generating prompt-recommendations to mitigate the prompt-PASSFER risks, wherein the prompt-recommendations comprise blocking the at least one prompt or one or more prompt-PASSFER enhancing techniques; conditionally generating prompt-risk assessment reports indicating the identified prompt-PASSFER risks; and conditionally sourcing the prompt-risk assessment reports and the prompt-recommendations to the model governance stage.
[0065] At block 406, assessment of one or more responses generated by the optimized ML model for one or more prompts received from the user for response-PASSFER risks based on the benchmark data and the intended usage may be performed.
[0066] At block 408, based on the assessment, at least one of the following may be performed: conditionally flagging at least one response of the one or more responses; conditionally generating response-recommendations to mitigate the response-PASSFER risks, wherein the response-recommendations comprise one or more response-PASSFER enhancing techniques; conditionally generating response-risk assessment reports indicating the identified response-PASSFER risks; and conditionally sourcing the response-risk assessment reports and the response-recommendations to the model governance stage.
[0067] At block 410, each stage of the lifecycle stages may be continuously monitored.
[0068] At block 412, a dashboard may be provided, and reports may be generated based on an overall PASSFER risk, wherein the overall PASSFER risk is determined based on the PASSFER risks identified and mitigated at each stage of the lifecycle stages, and the reports may be caused to be displayed on the dashboard.
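The inference-stage prompt screening of blocks 402-404 and the governance-stage risk roll-up of block 412 may be sketched together as follows. The blocklist entry, the context heuristic, and the per-stage risk numbers are purely hypothetical stand-ins for the benchmark data and monitored stage outcomes described above.

```python
# Illustrative sketch of prompt screening at the model inference stage
# (blocks 402-404) and overall-risk aggregation for the governance
# dashboard (block 412). All rules and numbers are assumptions.

BLOCKLIST = {"export the raw patient table"}  # assumed benchmark-derived rule

def assess_prompt(prompt, role, history):
    """Flag a prompt based on a benchmark rule, the conversational context
    (tracked previous interactions), and the user-declared role; return a
    decision and a prompt-recommendation."""
    if prompt.lower() in BLOCKLIST and role != "auditor":
        return {"flagged": True, "recommendation": "block"}
    if any("patient" in h for h in history) and "identify" in prompt.lower():
        return {"flagged": True, "recommendation": "redact identifiers"}
    return {"flagged": False, "recommendation": None}

def overall_passfer_risk(stage_risks):
    """Governance-stage roll-up: residual risk per lifecycle stage
    (identified minus mitigated) and the worst residual across stages."""
    residual = {stage: identified - mitigated
                for stage, (identified, mitigated) in stage_risks.items()}
    return residual, max(residual.values())

verdict = assess_prompt("identify the person in record 7", role="analyst",
                        history=["show patient cohort stats"])
residual, worst = overall_passfer_risk(
    {"pre-processing": (0.6, 0.5), "training": (0.3, 0.25), "inference": (0.4, 0.2)})
```

Here the prompt is flagged because the tracked context involves patient data and the prompt attempts re-identification, so a redaction recommendation is generated; the governance roll-up would surface the inference stage as carrying the largest residual risk on the dashboard.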
[0069] Interfacing all lifecycle stages with operational dependencies and ensuring PASSFER compliance offers several technical advantages. This integrated approach enables continuous risk mitigation by allowing stages to inform and refine each other, reducing the risk of downstream issues. It improves data quality and utility through iterative feedback, resulting in more accurate and reliable models. Compliance and governance are strengthened by upholding privacy, accountability, safety, security, fairness, explainability, and reliability standards at every stage, which is crucial in regulated industries like healthcare and finance. Operational dependency enhances transparency and explainability, with clear tracking of decisions and outcomes. It also optimizes performance by addressing inefficiencies in real-time. Overall, this system fosters trust and reliability, ensuring ethical adherence across the machine learning pipeline while streamlining audits and boosting stakeholder confidence.
[0070] It will be understood by those skilled in the art that the foregoing general description and the following detailed description are exemplary and explanatory of the disclosure and are not intended to be restrictive thereof.
[0071] While specific language has been used to describe the disclosure, any limitations arising on account of the same are not intended. As would be apparent to a person skilled in the art, various working modifications may be made to the method in order to implement the inventive concept as taught herein.
[0072] The figures and the foregoing description give examples of embodiments. Those skilled in the art will appreciate that one or more of the described elements may well be combined into a single functional element. Alternatively, certain elements may be split into multiple functional elements. Elements from one embodiment may be added to another embodiment. For example, the order of processes described herein may be changed and are not limited to the manner described herein. Moreover, the actions of any flow diagram need not be implemented in the order shown; nor do all of the acts need to be necessarily performed. Also, those acts that are not dependent on other acts may be performed in parallel with the other acts. The scope of embodiments is by no means limited by these specific examples.
CLAIMS:
1. A system for orchestrating lifecycle stages of a responsible machine learning, the system comprising:
a processor;
a memory coupled to the processor, wherein the memory comprises instructions executable by the processor to:
receive original data from one or more sources;
perform operations corresponding to each stage of the lifecycle stages ensuring that outcomes of each stage are compliant with one or more of privacy, accountability, safety, security, fairness, explainability, and reliability (PASSFER) standards of the responsible machine learning, wherein stages subsequent to a starting stage of the lifecycle stages comprise, in sequence, a data pre-processing stage, a model training stage, a model assessment stage, and a model governance stage,
wherein inputs to each stage subsequent to the starting stage comprises outcomes of respective preceding stages of the lifecycle stages; and
wherein inputs to one or more of the lifecycle stages comprises outcomes of one or more respective succeeding stages of the stages.
2. The system as claimed in claim 1, wherein to perform operations corresponding to the starting stage, wherein the starting stage is PASSFER risk assessment stage, the processor is to:
process the original data to identify PASSFER risks associated with the original data based on pre-stored benchmark data, wherein the benchmark data comprises legal and regulatory requirements, and cross-border requirements;
generate one or more PASSFER risk metrics as outcomes, each corresponding to the identified PASSFER risks, and one or more recommendations based on the processing, intended usage associated with the original data provided by the user, and feedback conditionally received from the model assessment stage as outcomes, wherein the one or more recommendations comprise one or more PASSFER enhancing techniques to mitigate the identified PASSFER risks; and
receive an acceptable level of PASSFER risks from the user, wherein the acceptable level of PASSFER risks is determined by the user based on the intended usage and the benchmark data.
3. The system as claimed in claim 2, wherein to perform operations corresponding to the data pre-processing stage, the processor is to:
receive a PASSFER budget from the user, wherein the PASSFER budget is determined based on the identified PASSFER risks, the one or more PASSFER risk metrics, the intended usage, the benchmark data, and the one or more recommendations from the starting stage;
process the original data to identify portions of the original data causing the identified PASSFER risks;
generate PASSFER risk mitigated data upon application of the one or more PASSFER enhancing techniques in the one or more recommendations to the original data in accordance with the PASSFER budget;
obtain optimized data upon iteratively subjecting the PASSFER risk mitigated data generated in each iteration of the data pre-processing stage, as the original data, to the operations corresponding to the starting stage until each PASSFER risk metric of the one or more PASSFER risk metrics satisfies the acceptable level of PASSFER risks; and
generate proofs demonstrating that the training data is in compliance with the standards of the responsible machine learning.
4. The system as claimed in claim 3, wherein to perform operations corresponding to the model training stage, the processor is to:
selectively train one or more ML models utilizing the optimized data as training data; and
generate proofs demonstrating that the training is in compliance with the standards of the responsible machine learning.
5. The system as claimed in claim 4, wherein to perform operations corresponding to the model assessment stage, the processor is to:
subject the one or more ML models upon training to one or more PASSFER risk simulations;
generate a corresponding assessment report indicating PASSFER risk levels for each ML model of the one or more ML models;
compare the PASSFER risk levels, each with the acceptable level of PASSFER risks;
based on the comparison, perform at least one of the following:
conditionally flag at least one ML model of the one or more ML models;
conditionally generate feedback for the data pre-processing stage, wherein feedback is to recommend fine tuning of the one or more PASSFER enhancing techniques; and
upon flagging, subject the one or more ML models to the model training stage for the selective training utilizing the fine-tuned one or more PASSFER enhancing techniques;
or
based on the comparison, recommend at least one ML model from the one or more ML models in accordance with the intended usage; and
upon the recommendation and subsequent selection of the at least one ML model by the user, utilize the at least one ML model as an optimized ML model.
6. The system as claimed in claim 5, wherein to perform the operations corresponding to the model inference stage, the processor is to:
perform assessment of one or more prompts from a user to the optimized ML model for prompt-PASSFER risks based on the benchmark data and the intended usage;
based on the assessment perform at least one of the following:
conditionally flag at least one prompt of the one or more prompts based on the benchmark data, context associated with the prompt and a role of the user, wherein the context associated with the prompt is determined based on tracked one or more previous interactions of the user with the optimized ML model, wherein the role of the user is provided by the user; and
conditionally generate prompt-recommendations to mitigate the prompt-PASSFER risks, wherein the prompt-recommendations comprise blocking the at least one prompt or one or more prompt-PASSFER enhancing techniques;
conditionally generate prompt-risk assessment reports indicating the identified prompt-PASSFER risks; and
conditionally source the prompt-risk assessment reports and the prompt-recommendations to the model governance stage.
7. The system as claimed in claim 5, wherein to perform the operations corresponding to the model inference stage, the processor is to:
perform assessment of one or more responses generated by the optimized ML model for one or more prompts received from the user for response-PASSFER risks based on the benchmark data and the intended usage;
based on the assessment perform at least one of the following:
conditionally flag at least one response of the one or more responses; and
conditionally generate response-recommendations to mitigate the response-PASSFER risks, wherein the response-recommendations comprise one or more response-PASSFER enhancing techniques;
conditionally generate response-risk assessment reports indicating the identified response-PASSFER risks; and
conditionally source the response-risk assessment reports and the response-recommendations to the model governance stage.
8. The system as claimed in claim 1, wherein to perform operations corresponding to the model governance stage, the processor is to:
continuously monitor each stage of the lifecycle stages;
provide a dashboard;
generate reports based on overall PASSFER risk, wherein the overall PASSFER risk is determined based on PASSFER risks identified and mitigated at the each stage of the lifecycle stages; and
cause the reports to be displayed on the dashboard.
9. A method for orchestrating lifecycle stages of a machine learning (ML) model, the method comprising:
receiving original data from one or more sources;
performing operations corresponding to each stage of the lifecycle stages ensuring that outcomes of each stage are compliant with one or more of privacy, accountability, safety, security, fairness, explainability, and reliability (PASSFER) standards of the responsible machine learning, wherein stages subsequent to a starting stage of the lifecycle stages comprise, in sequence, a data pre-processing stage, a model training stage, a model assessment stage, and a model governance stage,
wherein inputs to each stage subsequent to the starting stage comprises outcomes of respective preceding stages of the lifecycle stages; and
wherein inputs to one or more of the lifecycle stages comprises outcomes of one or more respective succeeding stages of the stages.
10. The method as claimed in claim 9, wherein performing operations corresponding to the starting stage, wherein the starting stage is PASSFER risk assessment stage comprises:
processing the original data to identify PASSFER risks associated with the original data based on pre-stored benchmark data, wherein the benchmark data comprises legal and regulatory requirements, and cross-border requirements; and
generating one or more PASSFER risk metrics as outcomes, each corresponding to the identified PASSFER risks;
generating one or more recommendations based on the processing, intended usage associated with the original data provided by the user, and feedback conditionally received from the model assessment stage as outcomes, wherein the one or more recommendations comprise one or more PASSFER enhancing techniques to mitigate the identified PASSFER risks; and
receiving an acceptable level of PASSFER risks from the user, wherein the acceptable level of PASSFER risks is determined by the user based on the intended usage and the benchmark data.
11. The method as claimed in claim 10, wherein performing operations corresponding to the data pre-processing stage comprises:
receiving a PASSFER budget from the user, wherein the PASSFER budget is determined based on the identified PASSFER risks, the one or more PASSFER risk metrics, the intended usage, the benchmark data, and the one or more recommendations from the starting stage;
processing the original data to identify portions of the original data causing the identified PASSFER risks;
generating PASSFER risk mitigated data upon application of the one or more PASSFER enhancing techniques in the one or more recommendations to the original data in accordance with the PASSFER budget;
obtaining optimized data upon iteratively subjecting the PASSFER risk mitigated data generated in each iteration of the data pre-processing stage, as the original data, to the operations corresponding to the starting stage until each PASSFER risk metric of the one or more PASSFER risk metrics satisfies the acceptable level of PASSFER risks; and
generating proofs demonstrating that the training data is in compliance with the standards of the responsible machine learning.
12. The method as claimed in claim 11, wherein performing operations corresponding to the model training stage comprises:
selectively training one or more ML models utilizing the optimized data as training data; and
generating proofs demonstrating that the training is in compliance with the standards of the responsible machine learning.
13. The method as claimed in claim 12, wherein performing operations corresponding to the model assessment stage comprises:
subjecting the one or more ML models upon training to one or more PASSFER risk simulations;
generating a corresponding assessment report indicating PASSFER risk levels for each ML model of the one or more ML models;
comparing the PASSFER risk levels, each with the acceptable level of PASSFER risks;
based on the comparison, performing at least one of the following:
conditionally flagging at least one ML model of the one or more ML models;
conditionally generating feedback for the data pre-processing stage, wherein feedback is to recommend fine tuning of the one or more PASSFER enhancing techniques; and
upon flagging, subjecting the one or more ML models to the model training stage for the selective training utilizing the fine-tuned one or more PASSFER enhancing techniques;
or
based on the comparison, recommending at least one ML model from the one or more ML models in accordance with the intended usage; and
upon the recommendation and subsequent selection of the at least one ML model by the user, utilizing the at least one ML model as an optimized ML model.
14. The method as claimed in claim 13, wherein performing operations corresponding to the model inference stage comprises:
performing assessment of one or more prompts from a user to the optimized ML model for prompt-PASSFER risks based on the benchmark data and the intended usage;
based on the assessment performing at least one of the following:
conditionally flagging at least one prompt of the one or more prompts based on the benchmark data, context associated with the prompt and a role of the user, wherein the context associated with the prompt is determined based on tracked one or more previous interactions of the user with the optimized ML model, wherein the role of the user is provided by the user;
conditionally generating prompt-recommendations to mitigate the prompt-PASSFER risks, wherein the prompt-recommendations comprise blocking the at least one prompt or one or more prompt-PASSFER enhancing techniques;
conditionally generating prompt-risk assessment reports indicating the identified prompt-PASSFER risks; and
conditionally sourcing the prompt-risk assessment reports and the prompt-recommendations to the model governance stage.
15. The method as claimed in claim 13, wherein performing the operations corresponding to the model inference stage comprises:
performing assessment of one or more responses generated by the optimized ML model for one or more prompts received from the user for response-PASSFER risks based on the benchmark data and the intended usage;
based on the assessment performing at least one of the following:
conditionally flagging at least one response of the one or more responses; and
conditionally generating response-recommendations to mitigate the response-PASSFER risks, wherein the response-recommendations comprise one or more response-PASSFER enhancing techniques;
conditionally generating response-risk assessment reports indicating the identified response-PASSFER risks; and
conditionally sourcing the response-risk assessment reports and the response-recommendations to the model governance stage.
16. The method as claimed in claim 9, wherein performing operations corresponding to the model governance stage comprises:
continuously monitoring each stage of the lifecycle stages;
providing a dashboard;
generating reports based on overall PASSFER risk, wherein the overall PASSFER risk is determined based on PASSFER risks identified and mitigated at each stage of the lifecycle stages; and
causing display of the reports on the dashboard.
Dated this 23rd day of January 2025
Signature
Gokul Nataraj E
Patent Agent (IN/PA-5309)
Agent for the Applicant