Abstract: A system to alter queries provided to an artificial intelligence platform for preserving privacy is disclosed. The system includes a query assessment module to identify sensitive attributes in prompts, a query annotation module to annotate them, and a context and sensitivity based query paraphrasing module to generate synthetic attributes, along with a privacy budget module to alter identified numerical values for creating a synthetic prompt. The system includes a query output module to receive the synthetic privacy preserved query from upstream and forward it to a downstream neural network with an attention based language model, and a query response origin module to receive an output generated by the artificial intelligence platform upon receiving the query. The system includes a query response assessment module to identify the sensitive synthetic attributes, a response privacy and context preservation module to modify the output to restore original entities, and an output module to render the output. FIG. 1
DESC:EARLIEST PRIORITY DATE:
This Application claims priority from a provisional patent application filed in India having Patent Application No. 202341026661, filed on April 11, 2023, and titled “SYSTEM AND METHOD FOR PRIVACY PRESERVING QUERY AND RESPONSE FOR DATA & AI GOVERNANCE”.
FIELD OF INVENTION
[0001] Embodiments of the present disclosure relate to an artificial intelligence governance platform and, more particularly, to a system and a method to alter queries provided to an artificial intelligence platform for preserving privacy.
BACKGROUND
[0002] Querying and response in an artificial intelligence (AI) platform involves the interaction of a user and the artificial intelligence system. Users interact with the AI platform by providing queries, requests, or input in natural language or structured formats. The AI platform employs natural language processing (NLP) techniques to understand and interpret the user's input. Once the user query is understood, the AI platform processes the query to determine the appropriate action or response. Based on the processed query and the AI's understanding of the user's intent, the platform generates a response. Finally, the generated response is delivered back to the user through the user interface of the AI platform. Although AI platforms have made significant advancements in querying and response capabilities, there are several challenges that are faced.
[0003] A few challenges include the complexity of human language in user prompts, the ambiguity of user prompts, the privacy and security of user prompts, and so on. Specifically, users tend to disclose sensitive attributes pertaining to personal information (such as email addresses, health information, credit card numbers and the like). Such sensitive attributes in user prompts are important considerations for AI platforms to ensure user privacy, confidentiality, and security. It is imperative that such personal information does not reach either public or private AI platforms. AI systems must be designed to handle sensitive information responsibly, following privacy regulations and best practices to protect user data from unauthorized access or misuse. The sensitive attributes can be learned from an unintentional leak of sensitive information in the user prompts or in the responses to the user prompts. This makes both querying and data sharing with AI platforms difficult. Adhering to data protection regulations, preventing unauthorized access, and safeguarding user information add complexity to the querying and response process.
[0004] Hence, there is a need for an improved system to alter queries provided to an artificial intelligence platform for preserving privacy to address the aforementioned issue(s).
OBJECTIVE OF THE INVENTION
[0005] An objective of the present invention is to provide a dedicated system that understands the nature, scope, context, purpose, sensitivity and privacy budget of various types of queries and responses and accordingly preserves the privacy of data in both the query and the response.
[0006] Another objective of the present disclosure is to proactively help in privacy preservation of query and response for data and AI governance.
BRIEF DESCRIPTION
[0007] In accordance with an embodiment of the present disclosure, a system to alter queries provided to an artificial intelligence platform for preserving privacy is provided. The system includes at least one processor in communication with a client processor. The system also includes at least one memory includes a set of program instructions in the form of a processing subsystem, configured to be executed by the at least one processor. The processing subsystem is hosted on a server and configured to execute on a network to control bidirectional communications among a plurality of modules. The processing subsystem includes a querying input module configured to receive one or more prompts from a user. The processing subsystem includes a query assessment module operatively coupled to the querying input module, wherein the query assessment module is configured to analyze the one or more prompts using a plurality of natural language processing techniques to identify one or more sensitive attributes present in the one or more prompts. The processing subsystem includes a query annotation module operatively coupled to the querying input module, wherein the query annotation module is configured to annotate one or more sensitive attributes identified by the query assessment module for privacy preservation with synthetic prompt. The processing subsystem includes a context and sensitivity based query paraphrasing module operatively coupled to the query annotation module. The context and sensitivity based query paraphrasing module is configured to generate one or more synthetic attributes based on one or more sensitive attributes identified to prevent leakage of sensitive personal and organizational information. 
Further, the context and sensitivity based query paraphrasing module is configured to paraphrase the input prompt by converting the input prompt in case of first person to third person when the subject is exposing sensitive information, by making necessary grammatical and synthetic changes. Furthermore, the context and sensitivity based query paraphrasing module includes a privacy budget module, wherein the privacy budget module is configured to alter the identified numerical values in sensitive data by performing a mathematical operation. The processing subsystem includes a query output module operatively coupled to the context and sensitivity based query paraphrasing module, wherein the query output module is configured to receive the synthetic privacy preserved query from upstream and send the query to a downstream neural network with an attention based language model. The processing subsystem includes a query response origin module operatively coupled to the query output module, wherein the query response origin module is configured to receive an output generated by the downstream neural network with the attention based language model from the artificial intelligence platform upon receiving the query. The processing subsystem includes a query response assessment module operatively coupled to the query response origin module, wherein the query response assessment module is configured to identify the corresponding sensitive synthetic attributes from the output. 
The processing subsystem includes a response privacy and context preservation module operatively coupled to the query response assessment module, wherein the response privacy and context preservation module is configured to modify the output by replacing the corresponding sensitive synthetic attributes with the corresponding one or more sensitive attributes to generate the response to the user. Furthermore, the processing subsystem includes an output module operatively coupled to the response privacy and context preservation module, wherein the output module is configured to render the output in a user interface associated with the user.
[0008] In accordance with an embodiment of the present disclosure, a method to alter queries provided to an artificial intelligence platform for preserving privacy is provided. The method includes receiving, by a querying input module, one or more prompts from a user. The method includes analyzing, by a query assessment module, the one or more prompts using a plurality of natural language processing techniques to identify one or more sensitive attributes present in the one or more prompts. The method includes annotating, by a query annotation module, one or more sensitive attributes identified by the query assessment module for privacy preservation with a synthetic prompt. The method includes generating, by a context and sensitivity based query paraphrasing module, one or more synthetic attributes based on the one or more sensitive attributes identified to prevent leakage of sensitive personal information. Furthermore, the method includes altering, by a privacy budget module, at least one of a date, a currency and a number in the sensitive data by performing a mathematical operation on the at least one of the date, the currency and the number using an offset value provided by the user, wherein the mathematical operation comprises addition, subtraction, multiplication, and division. The method includes paraphrasing, by the context and sensitivity based query paraphrasing module, the input prompt by converting the input prompt in case of first person to third person when the subject is exposing sensitive information, by making necessary grammatical and synthetic changes. The method includes receiving, by a query output module, the synthetic privacy preserved query from upstream and sending the query to a downstream neural network with an attention based language model. The method includes receiving, by a query response origin module, an output generated by the artificial intelligence platform upon receiving the query. 
The method includes identifying, by a query response assessment module, the corresponding sensitive synthetic attributes from the output. The method includes modifying, by a response privacy and context preservation module, the output by replacing the corresponding sensitive synthetic attributes with the corresponding one or more sensitive attributes to generate the response to the user. Moreover, the method includes rendering, by an output module, the output in a user interface associated with the user.
[0009] To further clarify the advantages and features of the present disclosure, a more particular description of the disclosure will follow by reference to specific embodiments thereof, which are illustrated in the appended figures. It is to be appreciated that these figures depict only typical embodiments of the disclosure and are therefore not to be considered limiting in scope. The disclosure will be described and explained with additional specificity and detail with the appended figures.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] The disclosure will be described and explained with additional specificity and detail with the accompanying figures in which:
[0011] FIG. 1 is a block diagram representation of a system to alter queries provided to an artificial intelligence platform for preserving privacy in accordance with an embodiment of the present disclosure;
[0012] FIG. 2 is a block diagram of a computer or a server in accordance with an embodiment of the present disclosure;
[0013] FIG. 3(a) illustrates a flow chart representing the steps involved in a method to alter queries provided to an artificial intelligence platform for preserving privacy in accordance with an embodiment of the present disclosure; and
[0014] FIG. 3(b) illustrates continued steps of the method to alter queries provided to an artificial intelligence platform for preserving privacy of FIG. 3(a) in accordance with an embodiment of the present disclosure.
[0015] Further, those skilled in the art will appreciate that elements in the figures are illustrated for simplicity and may not have necessarily been drawn to scale. Furthermore, in terms of the construction of the device, one or more components of the device may have been represented in the figures by conventional symbols, and the figures may show only those specific details that are pertinent to understanding the embodiments of the present disclosure so as not to obscure the figures with details that will be readily apparent to those skilled in the art having the benefit of the description herein.
DETAILED DESCRIPTION
[0016] For the purpose of promoting an understanding of the principles of the disclosure, reference will now be made to the embodiment illustrated in the figures and specific language will be used to describe them. It will nevertheless be understood that no limitation of the scope of the disclosure is thereby intended. Such alterations and further modifications in the illustrated computer-implemented system, and such further applications of the principles of the disclosure as would normally occur to those skilled in the art are to be construed as being within the scope of the present disclosure.
[0017] The terms “comprises”, “comprising”, or any other variations thereof, are intended to cover a non-exclusive inclusion, such that a process or method that comprises a list of steps does not include only those steps but may include other steps not expressly listed or inherent to such a process or method. Similarly, one or more devices or subsystems or elements or structures or components preceded by "comprises... a" does not, without more constraints, preclude the existence of other devices, sub-systems, elements, structures, components, additional devices, additional sub-systems, additional elements, additional structures, or additional components. Appearances of the phrase "in an embodiment", "in another embodiment" and similar language throughout this specification may, but not necessarily do, all refer to the same embodiment.
[0018] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the art to which this disclosure belongs. The system, methods, and examples provided herein are only illustrative and not intended to be limiting.
[0019] Embodiments of the present disclosure relate to a system and a method to alter queries provided to an artificial intelligence platform for preserving privacy. The system includes at least one processor in communication with a client processor. The system also includes at least one memory that includes a set of program instructions in the form of a processing subsystem, configured to be executed by the at least one processor. The processing subsystem is hosted on a server and configured to execute on a network to control bidirectional communications among a plurality of modules. The processing subsystem includes a querying input module configured to receive one or more prompts from a user. The processing subsystem includes a query assessment module operatively coupled to the querying input module, wherein the query assessment module is configured to analyze the one or more prompts using a plurality of natural language processing techniques to identify one or more sensitive attributes present in the one or more prompts. The processing subsystem includes a query annotation module operatively coupled to the querying input module, wherein the query annotation module is configured to annotate one or more sensitive attributes identified by the query assessment module for privacy preservation with a synthetic prompt. The processing subsystem includes a context and sensitivity based query paraphrasing module operatively coupled to the query annotation module. The context and sensitivity based query paraphrasing module is configured to generate one or more synthetic attributes based on the one or more sensitive attributes identified to prevent leakage of sensitive personal and organizational information. Further, the context and sensitivity based query paraphrasing module includes a privacy budget module, wherein the privacy budget module is configured to alter identified numerical values in sensitive data by performing a mathematical operation. 
Further, the context and sensitivity based query paraphrasing module is configured to paraphrase the input prompt by converting the input prompt in case of first person to third person when the subject is exposing sensitive information, by making necessary grammatical and synthetic changes. The processing subsystem includes a query output module operatively coupled to the context and sensitivity based query paraphrasing module, wherein the query output module is configured to receive the synthetic privacy preserved query from upstream and send the query to downstream neural network with attention based language model. The processing subsystem includes a query response origin module operatively coupled to the query output module, wherein the query response origin module is configured to receive an output generated by the artificial intelligence platform upon receiving the query. The processing subsystem includes a query response assessment module operatively coupled to the query response origin module, wherein the query response assessment module is configured to identify the corresponding sensitive synthetic attributes from the output. The processing subsystem includes a response privacy and context preservation module operatively coupled to the query response module, wherein the response privacy and context preservation module is configured to modify the output by replacing the corresponding sensitive synthetic attributes with the corresponding one or more sensitive attributes to generate the response to the user. Furthermore, the processing subsystem includes an output module operatively coupled to the response privacy and context preservation module, wherein the output module is configured to render the output in a user interface associated with the user.
[0020] FIG. 1 is a block diagram representation of a system to alter one or more queries provided to an artificial intelligence platform for preserving privacy in accordance with an embodiment of the present disclosure. The system (10) includes at least one processor (20) in communication with a client processor (30). The processor (20) generally refers to a computational unit or central processing unit (CPU) responsible for executing instructions in a computer system. The phrase "in communication with a client processor" implies that there is a relationship or interaction between the at least one processor (20) and a specific type of processor referred to as a "client processor." Here, the term "client processor" refers to a processor that initiates requests or tasks and interacts with another processor (which may be a server processor) to fulfil those requests.
[0021] The system (10) also includes at least one memory (40) that includes a set of program instructions in the form of a processing subsystem (50), configured to be executed by the at least one processor (20). The processing subsystem (50) is hosted on a server (55) and configured to execute on a network (not shown in FIG. 1) to control bidirectional communications among a plurality of modules. As used herein, the memory (40) is a storage component within the system used for storing data and instructions that can be accessed by the processor. The set of program instructions is a sequence of commands or directions written in a programming language that can be executed by a computer. In one embodiment, the server (55) may include a cloud server. In another embodiment, the server (55) may include a local server. In one embodiment, the network may include a wired network such as a local area network (LAN). In another embodiment, the network may include a wireless network such as Wi-Fi, Bluetooth, Zigbee, near field communication (NFC), infra-red communication, radio frequency identification (RFID) or the like.
[0022] The processing subsystem (50) includes a querying input module (60) configured to receive one or more user prompts from a user. The one or more user prompts refer to queries, commands or instructions provided by the user in order to interact with an artificial intelligence (AI) platform. In one embodiment, the one or more user prompts are in the form of text input. The said one or more user prompts convey the user’s intent or requirements and prompt the artificial intelligence platform to generate corresponding responses with relevant information. Further, the one or more user prompts can manifest in various forms based on the interface capabilities of the artificial intelligence platform. Examples of the forms include, but are not limited to, text input, voice commands and gestures. Further, the one or more user prompts are received from the user via a prompt interface. The prompt interface is configured on a user device (not shown in FIG. 1) and is configured to display a field that receives the one or more user prompts from the user via a graphical user interface of the user device. Examples of the user device include, but are not limited to, a personal computer (PC), a mobile phone, a tablet device, a personal digital assistant (PDA), a smart phone, a laptop, and a pager.
[0023] It will be appreciated by those skilled in the art that the artificial intelligence platform is an infrastructure that provides tools, libraries and Application Programming Interfaces (APIs) for deploying AI-powered applications and services. Examples of the AI platform include, but are not limited to, TensorFlow, PyTorch, Microsoft Azure AI Platform, Google Cloud AI Platform, Amazon SageMaker and IBM Watson. In one embodiment, the artificial intelligence platform is configured to integrate the above system (10) with any downstream large language model (LLM), which can be either private or public, to provide privacy preserved prompt engineering services for users while interacting with LLMs. The downstream LLM is pre-trained and fine-tuned.
[0024] In one embodiment, the one or more prompts comprise at least one of a normal prompt and chain of thought prompts. For chain of thought prompts, the synthetic prompt is created by reusing the synthetic attributes that replaced the identified sensitive information in the earliest prompt. These synthetic attributes are stored so that the response can be reversed to the original names consistently across the context, for user consumption within the user's chain of thought.
[0025] The processing subsystem (50) includes a query assessment module (70) operatively coupled to the querying input module (60). The query assessment module (70) is configured to analyze the one or more prompts using a plurality of natural language processing techniques to identify one or more sensitive attributes present in the one or more prompts. The one or more prompts are received from the querying input module (60). The one or more sensitive attributes refers to specific information or characteristics that the user may unintentionally disclose or include in the user prompts. Such specific information or characteristics could potentially be considered as sensitive or personal. In one embodiment, the one or more sensitive attributes includes, name, number, address, email address, password, location, date, time and the like. Further, the sensitive attributes can include personal identity information (PII), health information, financial information, location information, demographic information and the like.
[0026] In one embodiment, the query assessment module (70) is configured to identify the one or more sensitive attributes using the artificial intelligence platform comprising named entity recognition and part of speech tagging.
[0027] Typically, the query assessment module (70) is configured to classify the one or more user prompts into multiple risk categories by comprehensively understanding the context, including the tense (past, present or future), and by identifying any entities present in the given one or more user prompts.
[0028] Further, a suitable natural language processing (NLP) technique is used to analyze the one or more user prompts. Typically, NLP is used to interpret the one or more user prompts thereby bridging the gap between human language and machine understanding. In one embodiment, NLP can be configured to detect language, parse text, determine proper part-of-speech for various words, and identify semantic relationships. In some embodiments, NLP can include statistical methods, machine learning methods, or rules-based and algorithmic methods. In such an embodiment, the NLP can be a machine learning model that is trained to extract entities from the one or more user prompts. The model training can involve feeding a training algorithm with sample questions that include the entities for that specific utterance. The machine learning algorithm can generalize the question based on the language models provided to the algorithm.
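By way of a non-limiting illustration, the identification of sensitive attributes may be sketched as follows. The sketch substitutes simple regular-expression patterns for the named entity recognition and part of speech tagging described above, purely so that it runs without a trained model; the names `Finding`, `PATTERNS` and `find_sensitive_attributes`, the category labels, and the patterns themselves are assumptions introduced for illustration and are not part of the disclosure.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    label: str  # hypothetical category name, e.g. "EMAIL"
    start: int  # index of the first character of the match
    end: int    # index one past the last character of the match
    text: str   # the matched sensitive attribute


# Hypothetical, deliberately simple patterns. A real query assessment
# module would rely on trained NER and part-of-speech models instead.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "CARD_NUMBER": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
    "CURRENCY": re.compile(r"\$\d+(?:\.\d+)?"),
}


def find_sensitive_attributes(prompt: str) -> List[Finding]:
    """Return every pattern match in the prompt, ordered by position."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(prompt):
            findings.append(
                Finding(label, match.start(), match.end(), match.group())
            )
    return sorted(findings, key=lambda f: f.start)
```

Each finding carries its character offsets so that a later stage can annotate or replace the exact span in the prompt.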
[0029] The processing subsystem (50) includes a query annotation module (80) operatively coupled to the querying input module (60), wherein the query annotation module (80) is configured to annotate one or more sensitive attributes identified by the query assessment module (70) for privacy preservation with a synthetic prompt. As used herein, ‘annotating’ helps to visualize the parts of the one or more user prompts that contain risk entities. Consequently, the user is able to understand the annotation and can provide feedback if it is not correct. Further, ‘annotating’ helps to exhibit the potential PII present in the one or more user prompts to the user.
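By way of a non-limiting illustration, annotation of the flagged parts of a prompt may be sketched as follows. The bracketed `[LABEL: text]` rendering, the `Span` tuple format and the function name `annotate` are assumptions chosen for illustration; the disclosure does not prescribe a particular annotation format.

```python
from typing import List, Tuple

# (start, end, label); a hypothetical span format for this sketch.
Span = Tuple[int, int, str]


def annotate(prompt: str, spans: List[Span]) -> str:
    """Wrap each flagged span as [LABEL: text] so a user can review it."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            continue  # skip a span that overlaps one already annotated
        pieces.append(prompt[cursor:start])
        pieces.append(f"[{label}: {prompt[start:end]}]")
        cursor = end
    pieces.append(prompt[cursor:])
    return "".join(pieces)
```

Showing the annotated prompt to the user gives the user an opportunity to confirm or correct the flagged entities before any replacement takes place.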
[0030] In one embodiment, the one or more prompts and the synthetic prompt with privacy preservation are stored for future references and audit purposes.
[0031] The processing subsystem (50) includes a context and sensitivity based query paraphrasing module (90) operatively coupled to the query annotation module (80). The context and sensitivity based query paraphrasing module (90) is configured to generate one or more synthetic attributes based on one or more sensitive attributes identified to prevent leakage of sensitive personal and organizational information. Further, the context and sensitivity based query paraphrasing module is configured to paraphrase the input prompt by converting the input prompt in case of first person to third person when the subject is exposing sensitive information, by making necessary grammatical and synthetic changes. For instance, PII information is removed from the original user prompt without changing the context.
[0032] The context and sensitivity based query paraphrasing module (90) includes a privacy budget module (100), wherein the privacy budget module (100) is configured to alter identified numerical values in sensitive data by performing a mathematical operation. In one embodiment, the numerical values in sensitive data include dates, currency and cardinal numbers. In such an embodiment, the ‘budget’ is set to a numerical value. Subsequently, for each of the numerical values in sensitive data, a real number is sampled from a Gaussian distribution whose mean is zero and whose standard deviation is a function of the budget. This sampled number is then multiplied by the original value to generate a noise value, termed epsilon. Epsilon is then added to the numerical value in sensitive data to generate the altered value used for creating the synthetic prompt. For instance, consider that the sensitive numerical value is a currency amount, $10, and that the ‘budget’ is 0.8. Taking the standard deviation as 1 - budget, the Gaussian distribution has a mean of 0 and a standard deviation of 0.2. Suppose the value sampled from this distribution is 0.12. The noise, epsilon, is then 0.12 × 10 = 1.2, and the synthetic currency entity becomes $11.2.
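By way of a non-limiting illustration, the alteration of a numerical value described above may be sketched as follows. The sketch assumes, as in the worked example, that the standard deviation equals 1 - budget; the function names `noise_std`, `apply_noise` and `perturb`, and the restriction of the budget to the range 0 to 1, are assumptions introduced for illustration.

```python
import random
from typing import Optional


def noise_std(budget: float) -> float:
    """Standard deviation as a function of the budget.

    Assumption taken from the worked example: std = 1 - budget.
    The 0..1 range check is an additional assumption of this sketch.
    """
    if not 0.0 <= budget <= 1.0:
        raise ValueError("budget is assumed to lie between 0 and 1")
    return 1.0 - budget


def apply_noise(value: float, sample: float) -> float:
    """Multiply the sample by the original value and add the result."""
    epsilon = sample * value  # the noise value termed epsilon
    return value + epsilon


def perturb(value: float, budget: float,
            rng: Optional[random.Random] = None) -> float:
    """Draw a zero-mean Gaussian sample and alter the value with it."""
    rng = rng or random.Random()
    sample = rng.gauss(0.0, noise_std(budget))
    return apply_noise(value, sample)
```

With a budget of 0.8 the standard deviation is 0.2, and a sample of 0.12 applied to the value 10 yields 11.2, matching the example above. Because the noise is multiplicative, larger original values receive proportionally larger alterations.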
[0033] In one embodiment, the privacy budget module (100) is configured to change sensitive numerical attributes based on required configuration for privacy preservation of numerical values, like the date, currency, GPS coordinates and other numerical values in sensitive data, using Differential Privacy.
[0034] In one embodiment, the context and sensitivity based query paraphrasing module (90) is configured to maintain the one or more sensitive attributes and the corresponding one or more sensitive synthetic attributes in the platform.
[0035] The processing subsystem (50) includes a query output module (110) operatively coupled to the context and sensitivity based query paraphrasing module (90). The query output module (110) is configured to receive the synthetic privacy preserved query from upstream and send the query to a downstream neural network with an attention based language model. In one embodiment, the query output module (110) is configured to query the artificial intelligence platform by using the altered versions of the one or more prompts, in which the one or more sensitive attributes are replaced with the corresponding sensitive synthetic attributes generated by the context and sensitivity based query paraphrasing module (90), thereby retaining the corresponding context of the one or more prompts.
[0036] The processing subsystem (50) includes a query response origin module (120) operatively coupled to the query output module (110). The query response origin module (120) is configured to receive an output generated by the artificial intelligence platform upon receiving the query.
[0037] The processing subsystem (50) includes a query response assessment module (130) operatively coupled to the query response origin module (120). The query response assessment module (130) is configured to identify the corresponding sensitive synthetic attributes from the output. The sensitive synthetic attributes are generated by a sensitive synthetic attributes word generation technique. Examples of the sensitive synthetic attributes word generation technique include, but are not limited to, a pseudo-random number generation technique, a list-based generation technique, a Markov chain based generation technique, neural network based generation techniques, combination techniques, a cryptographically secure randomness technique, and word embedding techniques.
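By way of a non-limiting illustration, one of the generation techniques listed above, list-based generation combined with cryptographically secure randomness, may be sketched as follows. The candidate pools, the category labels and the function name `generate_synthetic` are hypothetical and introduced only for illustration.

```python
import secrets

# Hypothetical candidate pools, keyed by attribute category.
SYNTHETIC_POOLS = {
    "NAME": ["Alex Morgan", "Sam Rivera", "Jordan Lee", "Casey Patel"],
    "CITY": ["Springfield", "Riverton", "Lakeside", "Fairview"],
}


def generate_synthetic(label: str, original: str) -> str:
    """Pick a replacement of the same category that differs from the original."""
    candidates = [
        candidate
        for candidate in SYNTHETIC_POOLS.get(label, [])
        if candidate.lower() != original.lower()
    ]
    if not candidates:
        raise ValueError(f"no synthetic candidates available for {label!r}")
    # secrets.choice draws from the operating system's secure random source.
    return secrets.choice(candidates)
```

Drawing the replacement from the same category as the original (a name for a name, a city for a city) is what allows the altered prompt to retain its context.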
[0038] In one embodiment, the context and sensitivity based query paraphrasing module uses the sensitive synthetic attributes word generation technique to generate the synthetic attributes initially, and the query response assessment module uses a mapping table to convert the one or more sensitive synthetic attributes to the one or more sensitive attributes and vice versa. Typically, the mapping table is a data structure used to associate one set of values with another. The mapping table includes a set of rows and columns. A single record is stored as a row. It includes two columns consisting of input values and output values respectively. Specifically, the two columns include the one or more sensitive attributes and the corresponding one or more sensitive synthetic attributes. It must be noted that the synthetic entities are remapped to the original entities when a response is received from the AI platform.
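By way of a non-limiting illustration, a two-column mapping table that supports conversion in both directions may be sketched as follows. The class name `MappingTable`, its method names, and the choice to reuse an existing row when the same original value is added again are assumptions introduced for illustration.

```python
from typing import Dict, List, Optional, Tuple


class MappingTable:
    """Two-column table pairing each original value with a synthetic one."""

    def __init__(self) -> None:
        self._to_synthetic: Dict[str, str] = {}
        self._to_original: Dict[str, str] = {}

    def add(self, original: str, synthetic: str) -> str:
        """Store a row and return the synthetic value in effect.

        If the original already has a row, that row is kept so the same
        entity maps to the same synthetic value throughout a conversation.
        """
        existing = self._to_synthetic.get(original)
        if existing is not None:
            return existing
        if synthetic in self._to_original:
            raise ValueError("synthetic value is already mapped")
        self._to_synthetic[original] = synthetic
        self._to_original[synthetic] = original
        return synthetic

    def to_synthetic(self, original: str) -> Optional[str]:
        return self._to_synthetic.get(original)

    def to_original(self, synthetic: str) -> Optional[str]:
        return self._to_original.get(synthetic)

    def rows(self) -> List[Tuple[str, str]]:
        """Return the records as (original, synthetic) rows."""
        return list(self._to_synthetic.items())
```

Keeping a reverse index alongside the forward one makes the remapping of a response a direct lookup, and rejecting a synthetic value that is already in use keeps the reversal unambiguous.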
[0039] The processing subsystem (50) includes a response privacy and context preservation module (140) operatively coupled to the query response assessment module (130). The response privacy and context preservation module (140) is configured to modify the output by replacing the corresponding sensitive synthetic attributes with the corresponding one or more sensitive attributes to generate the response to the user.
[0040] The processing subsystem (50) includes an output module (150) operatively coupled to the response privacy and context preservation module (140). The output module is configured to render the output in a user interface associated with the user.
[0041] FIG. 2 is a block diagram of a computer or a server in accordance with an embodiment of the present disclosure. The server (300) includes processor(s) (330), and memory (310) operatively coupled to the bus (320). The processor(s) (330), as used herein, means any type of computational circuit, such as, but not limited to, a microprocessor, a microcontroller, a complex instruction set computing microprocessor, a reduced instruction set computing microprocessor, a very long instruction word microprocessor, an explicitly parallel instruction computing microprocessor, a digital signal processor, or any other type of processing circuit, or a combination thereof.
[0042] The memory (310) includes several subsystems stored in the form of an executable program that instructs the processor (330) to perform the method steps illustrated in FIG. 1. The memory (310) includes a processing subsystem (50) of FIG. 1. The processing subsystem (50) has the following modules: a querying input module (60), a query assessment module (70), a query annotation module (80), a context and sensitivity based query paraphrasing module (90), a privacy budget module (100), a query output module (110), a query response origin module (120), a query response assessment module (130), a response privacy and context preservation module (140) and an output module (150).
[0043] The querying input module (60) is configured to receive one or more prompts from a user. The query assessment module (70) is operatively coupled to the querying input module (60), wherein the query assessment module (70) is configured to analyze the one or more prompts using a plurality of natural language processing techniques to identify one or more sensitive attributes present in the one or more prompts. The query annotation module (80) is operatively coupled to the querying input module (60), wherein the query annotation module (80) is configured to annotate one or more sensitive attributes identified by the query assessment module for privacy preservation with synthetic prompt. The context and sensitivity based query paraphrasing module (90) is operatively coupled to the query annotation module (80). The context and sensitivity based query paraphrasing module (90) is configured to generate one or more synthetic attributes based on one or more sensitive attributes identified to prevent leakage of sensitive personal and organizational information. Further, the privacy budget module (100) is operatively coupled to the context and sensitivity based query paraphrasing module (90), wherein the privacy budget module (100) is configured to alter identified numerical values in sensitive data by performing a mathematical operation. Further, the context and sensitivity based query paraphrasing module (90) is configured to paraphrase the input prompt by converting the input prompt in case of first person to third person when the subject is exposing sensitive information, by making necessary grammatical and synthetic changes. The query output module (110) is operatively coupled to the context and sensitivity based query paraphrasing module (90), wherein the query output module (110) is configured to receive the synthetic privacy preserved query from upstream and send the query to downstream neural network with attention based language model. 
The query response origin module (120) is operatively coupled to the query output module (110), wherein the query response origin module (120) is configured to receive an output generated by the artificial intelligence platform upon receiving the query. The query response assessment module (130) is operatively coupled to the query response origin module (120), wherein the query response assessment module (130) is configured to identify the corresponding sensitive synthetic attributes from the output. The response privacy and context preservation module (140) is operatively coupled to the query response assessment module (130), wherein the response privacy and context preservation module (140) is configured to modify the output by replacing the corresponding sensitive synthetic attributes with the corresponding one or more sensitive attributes to generate the response to the user. Furthermore, the output module (150) is operatively coupled to the response privacy and context preservation module (140), wherein the output module (150) is configured to render the output in a user interface associated with the user.
[0044] The bus (320), as used herein, refers to internal memory channels or a computer network used to connect computer components and transfer data between them. The bus (320) includes a serial bus or a parallel bus, wherein the serial bus transmits data in bit-serial format and the parallel bus transmits data across multiple wires. The bus (320), as used herein, may include, but is not limited to, a system bus, an internal bus, an external bus, an expansion bus, a frontside bus, a backside bus and the like.
[0045] Consider a scenario where a registered user X submits a query prompt to the system (10). The query prompt is received by the querying input module (60). Let the query prompt be ‘My name is John Abhram. I live in Hyderabad. I want to move from Hyderabad to Mumbai. Give me the best way.’ Subsequently, the query prompt is analyzed by the query assessment module (70) using an NLP technique to identify sensitive attributes or potential risks such as location risk and personal risk. The user X is alerted to the identified potential risks, the tense and the first person reference. The query annotation module (80) annotates the sensitive attributes. In other words, a synthetic prompt is then generated which replaces the first person reference, the user location and the user’s name (as provided in the prompt). Specifically, John Abhram, Hyderabad and Mumbai are replaced with Mark King, Saint Helena and Port, respectively. The context and sensitivity based query paraphrasing module (90) generates synthetic attributes based on the sensitive attributes to prevent leakage of sensitive personal and organizational information. The privacy budget module (100) identifies and alters the numerical values in sensitive data by performing a mathematical operation. Subsequently, the query prompt is paraphrased by converting the first person to third person and by making necessary grammatical and synthetic changes. The query prompt is now a synthetic privacy preserved query. The query output module (110) sends this query prompt to a downstream neural network with attention based language model. The response to the query prompt is received by the query response origin module (120). Subsequently, sensitive synthetic attributes are identified from the response by the query response assessment module (130). At this point, the response is altered by replacing the sensitive synthetic attributes with the corresponding sensitive attributes by using a mapping table.
However, it must be noted that the context of the query prompt is retained. Specifically, Mark King, Saint Helena and Port are replaced with John Abhram, Hyderabad and Mumbai, respectively. Finally, the altered response is rendered as the output to user X. In this way, privacy is preserved in the user prompts and in the response from the AI platform.
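The round trip in the above scenario may be sketched, in a non-limiting manner, as follows. The function `substitute`, the stub `stub_platform` and its canned reply are hypothetical stand-ins; in particular, the stub merely takes the place of the downstream language model and does not reflect any real platform output.

```python
import re


def substitute(text, table):
    """Replace every key of `table` that occurs in `text` as a whole word."""
    if not table:
        return text
    # Longest keys first, so a longer entity wins over a shorter one it contains.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(r"\b" + re.escape(k) + r"\b" for k in keys))
    # A single pass avoids re-replacing text that was just substituted.
    return pattern.sub(lambda m: table[m.group(0)], text)


FORWARD = {"John Abhram": "Mark King", "Hyderabad": "Saint Helena", "Mumbai": "Port"}
REVERSE = {synthetic: original for original, synthetic in FORWARD.items()}


def stub_platform(query):
    """Hypothetical stand-in for the downstream language model."""
    return "Mark King can compare travel options from Saint Helena to Port."


prompt = (
    "My name is John Abhram. I live in Hyderabad. "
    "I want to move from Hyderabad to Mumbai. Give me the best way."
)
synthetic_prompt = substitute(prompt, FORWARD)   # sent downstream
response = stub_platform(synthetic_prompt)       # contains synthetic entities
restored = substitute(response, REVERSE)         # rendered to the user
```

Whole-word matching keeps a short synthetic value such as Port from being replaced inside a longer word, although a response that uses the same word in an unrelated sense would still be altered; this sketch does not attempt to resolve that ambiguity.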
[0046] FIG. 3(a) illustrates a flow chart representing the steps involved in a method to alter queries provided to an artificial intelligence platform for preserving privacy in accordance with an embodiment of the present disclosure. FIG. 3(b) illustrates continued steps of the method to alter queries provided to an artificial intelligence platform for preserving privacy of FIG. 3(a) in accordance with an embodiment of the present disclosure. The method (400) includes receiving, by a querying input module, one or more prompts from a user in step (410). In one embodiment, the one or more prompts comprise at least one of a normal prompt and chain of thought prompts, where the synthetic prompt is created using the same synthetic attributes as were used in the earliest prompt to replace the identified sensitive information, and the synthetic attributes are stored so that the response can be reverted to the original name across the context for user consumption within the LLM operations.
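One possible way to reuse the synthetic attributes of the earliest prompt across a chain of prompts is sketched below, in a non-limiting manner. The class name `SessionSynthesizer`, the candidate pool and the second example name are hypothetical and serve only to illustrate the idea.

```python
class SessionSynthesizer:
    """Keeps one synthetic value per original attribute for a whole conversation."""

    def __init__(self, candidates):
        self._unused = iter(candidates)  # hypothetical pool of synthetic values
        self._assigned = {}              # original -> synthetic, in first-seen order

    def synthetic_for(self, original):
        """Return the value chosen in the earliest prompt that mentioned `original`."""
        if original not in self._assigned:
            try:
                self._assigned[original] = next(self._unused)
            except StopIteration:
                raise RuntimeError("synthetic candidate pool exhausted") from None
        return self._assigned[original]

    def reverse_table(self):
        """Return synthetic -> original pairs, used to restore later responses."""
        return {synthetic: original for original, synthetic in self._assigned.items()}


session = SessionSynthesizer(["Mark King", "Ana Silva"])
first_turn = session.synthetic_for("John Abhram")   # assigned in the first prompt
other = session.synthetic_for("Example Person")     # a different entity
later_turn = session.synthetic_for("John Abhram")   # reused in a later prompt
```

Reusing the first assignment keeps the conversation coherent from the point of view of the language model, and the stored reverse table allows every response in the chain to be restored consistently.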
[0047] In one embodiment, the system discussed earlier is integrated with any downstream large language model, which can be both private and public, to provide privacy preserved prompt engineering services for users while interacting with LLMs.
[0048] The method (400) includes analyzing, by a query assessment module, the one or more prompts using a plurality of natural language processing techniques to identify one or more sensitive attributes present in the one or more prompts in step (420). In one embodiment, the one or more sensitive attributes are identified using the artificial intelligence platform comprising named entity recognition and part of speech tagging.
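The following non-limiting sketch shows the general shape of the identification step. It deliberately substitutes simple regular expressions for named entity recognition and part of speech tagging, so it can only find pattern-like attributes such as email addresses and long numbers; the patterns, the `Finding` record and the sample prompt are assumptions made for illustration.

```python
import re
from typing import NamedTuple


class Finding(NamedTuple):
    kind: str   # category of the sensitive attribute
    start: int  # offset of the first character in the prompt
    end: int    # offset just past the last character
    text: str   # the matched text


# Hypothetical patterns; a trained entity recognizer would replace these.
PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "number": re.compile(r"\b\d[\d\s-]{5,}\d\b"),
}


def find_sensitive(prompt):
    """Return the findings in `prompt`, ordered by position."""
    findings = [
        Finding(kind, m.start(), m.end(), m.group(0))
        for kind, pattern in PATTERNS.items()
        for m in pattern.finditer(prompt)
    ]
    return sorted(findings, key=lambda f: f.start)


sample = "Mail john@example.com or call 040 5550 1234."
found = find_sensitive(sample)
```

The character offsets recorded with each finding are what a later annotation or replacement step would need in order to locate the attribute in the prompt.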
[0049] The method (400) includes annotating, by a query annotation module, one or more sensitive attributes identified by the query assessment module for privacy preservation with synthetic prompt in step (430). In one embodiment, the one or more prompts and the synthetic prompt with privacy preservation are stored for future references and audit purposes.
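As a non-limiting illustration of the annotating step, identified spans may be marked inline so that the risky parts of a prompt are visible. The bracket notation, the labels and the function name `annotate` are hypothetical, and the sketch assumes that the spans do not overlap.

```python
def annotate(prompt, spans):
    """Wrap each (start, end, label) span of `prompt` as [label: text]."""
    annotated = prompt
    # Work from the last span to the first so earlier offsets stay valid.
    for start, end, label in sorted(spans, reverse=True):
        marked = f"[{label}: {annotated[start:end]}]"
        annotated = annotated[:start] + marked + annotated[end:]
    return annotated


text = "My name is John Abhram. I live in Hyderabad."
name_at = text.index("John Abhram")
city_at = text.index("Hyderabad")
spans = [
    (name_at, name_at + len("John Abhram"), "personal"),
    (city_at, city_at + len("Hyderabad"), "location"),
]
annotated_text = annotate(text, spans)
```

The annotated form could be shown to the user alongside the risk alert, or stored with the original prompt for audit purposes.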
[0050] The method (400) includes generating, by a context and sensitivity based query paraphrasing module, the one or more synthetic attributes based on one or more sensitive attributes identified to prevent leakage of sensitive personal and organizational information in step (440).
[0051] The method (400) includes altering, by a privacy budget module, identified numerical values in sensitive data by performing a mathematical operation in step (450). In one embodiment, the sensitive numerical attributes are changed based on the required configuration for privacy preservation of numerical values, such as the date, currency, GPS coordinates and other numerical values in sensitive data, using Differential Privacy.
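The disclosure names Differential Privacy without fixing a particular mechanism. The following non-limiting sketch assumes the Laplace mechanism, in which noise with scale equal to sensitivity divided by epsilon is added to a value; the parameter names `sensitivity` and `epsilon`, the function names and the example figures are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one zero-mean Laplace sample with the given scale."""
    # The difference of two independent exponential samples with rate 1/scale
    # follows a Laplace distribution with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(value, sensitivity, epsilon, rng=None):
    """Return `value` plus Laplace noise of scale sensitivity / epsilon."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    if rng is None:
        # Operating-system randomness, so the noise cannot be replayed from a seed.
        rng = random.SystemRandom()
    return value + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical example: a currency amount with an assumed sensitivity of 1000.
noisy_amount = perturb(50000.0, sensitivity=1000.0, epsilon=1.0)
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of accuracy. Dates and GPS coordinates would first have to be converted to numbers (for example, day offsets or decimal degrees) before such a function could be applied.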
[0052] The method (400) includes paraphrasing, by the context and sensitivity based query paraphrasing module, the input prompt by converting the input prompt in case of first person to third person when the subject is exposing sensitive information, by making necessary grammatical and synthetic changes in step (460). In one embodiment, the one or more sensitive attributes and the corresponding one or more sensitive synthetic attributes are maintained in the platform. In one embodiment, the context and sensitivity based query paraphrasing module uses the earlier mentioned sensitive synthetic attributes word generation technique to generate synthetic attributes initially.
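A deliberately naive, non-limiting sketch of the first person to third person conversion is given below. It assumes a rule-based rewrite to the singular "they", which avoids re-conjugating most verbs; the rule list and the function names are hypothetical, and a practical paraphraser would need proper grammatical analysis.

```python
import re

# First-person forms and their singular-"they" counterparts. Longer patterns
# come first so that "I am" is handled before the bare pronoun "I".
REPLACEMENTS = [
    (r"\bI am\b", "they are"),
    (r"\bI'm\b", "they are"),
    (r"\bI was\b", "they were"),
    (r"\bI\b", "they"),
    (r"\b[Mm]y\b", "their"),
    (r"\b[Mm]e\b", "them"),
]


def _at_sentence_start(text, pos):
    """True if `pos` is the start of the text or follows ., ! or ?"""
    before = text[:pos].rstrip()
    return not before or before[-1] in ".!?"


def to_third_person(text):
    """Rewrite first-person pronouns in `text` as third-person forms."""
    for pattern, replacement in REPLACEMENTS:
        def repl(match, replacement=replacement):
            if _at_sentence_start(match.string, match.start()):
                return replacement[0].upper() + replacement[1:]
            return replacement
        text = re.sub(pattern, repl, text)
    return text


paraphrased = to_third_person(
    "My name is Mark King. I live in Saint Helena. Give me the best way."
)
```

Verbs other than the listed forms of "to be" are left untouched, which is grammatical with "they" but would not be with "he" or "she"; that trade-off is the reason for the assumption made here.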
[0053] The method (400) includes receiving, by a query output module, the synthetic privacy preserved query from upstream and sending the query to a downstream neural network with attention based language model in step (470). In one embodiment, the artificial intelligence platform is queried by altering the one or more prompts by replacing the one or more sensitive attributes information with the corresponding sensitive synthetic attributes generated, while retaining the corresponding context of the one or more prompts.
[0054] The method (400) includes receiving, by a query response origin module, an output generated by the artificial intelligence platform upon receiving the query in step (480).
[0055] The method (400) includes identifying, by a query response assessment module, the corresponding sensitive synthetic attributes from the output in step (490). The sensitive synthetic attributes are generated by a sensitive synthetic attributes word generation technique. Examples of the sensitive synthetic attributes word generation technique include, but are not limited to, a pseudo-random number generation technique, a list-based generation technique, a Markov chain based generation technique, neural network based generation techniques, combination techniques, a cryptographically secure randomness technique, and word embedding techniques.
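As a non-limiting illustration of a Markov chain based generation technique, a first-order character-level chain can be built from a list of seed words and then walked to produce a new word. The seed list, the markers and the function names are hypothetical, and a seeded `random.Random` is used here only so that the sketch is reproducible.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # markers for word start and word end


def build_chain(words):
    """Map each character to the characters observed directly after it."""
    chain = defaultdict(list)
    for word in words:
        if not word:
            continue  # an empty seed would allow an empty output
        symbols = [START] + list(word.lower()) + [END]
        for current, following in zip(symbols, symbols[1:]):
            chain[current].append(following)
    return chain


def generate_word(chain, rng, max_len=10):
    """Walk the chain from START until END or until `max_len` characters."""
    letters = []
    state = START
    while len(letters) < max_len:
        state = rng.choice(chain[state])
        if state == END:
            break
        letters.append(state)
    return "".join(letters).capitalize()


# Hypothetical seed words used only to learn plausible letter transitions.
SEEDS = ["marlow", "kingston", "helena", "portland", "lakeside"]
chain = build_chain(SEEDS)
word = generate_word(chain, random.Random(7))
```

The output can coincide with a seed word or with the original attribute, so a caller would typically regenerate in that case. Where unpredictability matters, `random.SystemRandom()` can be passed in place of the seeded generator.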
[0056] In one embodiment, the method (400) uses a mapping table to convert the one or more sensitive attributes to one or more sensitive attributes information and vice versa.
[0057] The method (400) includes modifying, by a response privacy and context preservation module, the output by replacing the corresponding sensitive synthetic attributes with the corresponding one or more sensitive attributes to generate the response to the user in step (500). In one embodiment, the one or more sensitive attributes includes, but is not limited to, name, number, address, email address, password and location.
[0058] The method (400) includes rendering, by an output module, the output in a user interface associated with the user in step (510).
[0059] Various embodiments of the present disclosure provide a system to alter queries provided to an artificial intelligence platform for preserving privacy. The query assessment module (70) meticulously analyzes the user prompts to identify potential risks and promptly alerts the user to these risks. Further, the query assessment module (70) comprehensively understands the context of the user prompts, including the tense. The query annotation module (80) helps to visualize the parts of the user prompts that have risk entities. The context and sensitivity based query paraphrasing module (90) and the response privacy and context preservation module (140) ensure that the context of the user prompt is retained while replacing the sensitive attributes with the synthetic attributes and vice versa.
[0060] Further, the system discloses an effective privacy preserving statistical mechanism to accurately analyze and understand the nature, scope, context, purpose, sensitivity and privacy budget of queries and corresponding responses.
[0061] It will be understood by those skilled in the art that the foregoing general description and the following detailed description are exemplary and explanatory of the disclosure and are not intended to be restrictive thereof.
[0062] While specific language has been used to describe the disclosure, any limitations arising on account of the same are not intended. As would be apparent to a person skilled in the art, various working modifications may be made to the method in order to implement the inventive concept as taught herein.
[0063] The figures and the foregoing description give examples of embodiments. Those skilled in the art will appreciate that one or more of the described elements may well be combined into a single functional element. Alternatively, certain elements may be split into multiple functional elements. Elements from one embodiment may be added to another embodiment. For example, the order of processes described herein may be changed and is not limited to the manner described herein. Moreover, the actions of any flow diagram need not be implemented in the order shown; nor do all of the acts necessarily need to be performed. Also, those acts that are not dependent on other acts may be performed in parallel with the other acts. The scope of embodiments is by no means limited by these specific examples.
CLAIMS:
1. A system (10) to alter one or more queries provided to an artificial intelligence platform for preserving privacy comprising:
characterized in that:
at least one processor (20) in communication with a client processor (30); and
at least one memory (40) comprises a set of program instructions in the form of a processing subsystem (50), configured to be executed by the at least one processor (20), wherein the processing subsystem (50) is hosted on a server (55) and configured to execute on a network to control bidirectional communications among a plurality of modules comprising:
a querying input module (60) configured to receive one or more prompts from a user;
a query assessment module (70) operatively coupled to the querying input module (60), wherein the query assessment module (70) is configured to analyze the one or more prompts using a plurality of natural language processing techniques to identify one or more sensitive attributes present in the one or more prompts;
a query annotation module (80) operatively coupled to the querying input module (60), wherein the query annotation module (80) is configured to annotate one or more sensitive attributes identified by the query assessment module for privacy preservation with synthetic prompt;
a context and sensitivity based query paraphrasing module (90) operatively coupled to the query annotation module (80), wherein the context and sensitivity based query paraphrasing module (90) is configured to:
generate one or more synthetic attributes based on one or more sensitive attributes identified to prevent leakage of sensitive personal and organizational information; and
paraphrase the input prompt by converting the input prompt in case of first person to third person when the subject is exposing sensitive information, by making necessary grammatical and synthetic changes;
a privacy budget module (100) within the context and sensitivity based query paraphrasing module (90), wherein the privacy budget module (100) is configured to alter numerical values in sensitive data by performing a mathematical operation;
a query output module (110) operatively coupled to the context and sensitivity based query paraphrasing module (90), wherein the query output module (110) is configured to receive the synthetic privacy preserved query from upstream and send the query to downstream neural network with attention based language model;
a query response origin module (120) operatively coupled to the query output module (110), wherein the query response origin module (120) is configured to receive an output generated by the artificial intelligence platform upon receiving the query;
a query response assessment module (130) operatively coupled to the query response origin module (120), wherein the query response assessment module (130) is configured to identify the corresponding sensitive synthetic attributes from the output;
a response privacy and context preservation module (140) operatively coupled to the query response assessment module (130), wherein the response privacy and context preservation module (140) is configured to modify the output by replacing the corresponding sensitive synthetic attributes with the corresponding one or more sensitive attributes to generate the response to the user; and
an output module (150) operatively coupled to the response privacy and context preservation module (140), wherein the output module (150) is configured to render the output in a user interface associated with the user.
2. The system (10) as claimed in claim 1, wherein the sensitive synthetic attributes word generation technique comprises pseudo-random number generation technique, list-based generation technique, Markov chain based generation technique, neural network based generation techniques, combination techniques, cryptographically secure randomness technique, and word embedding techniques.
3. The system (10) as claimed in claim 1, wherein the one or more sensitive attributes information comprises name, number, address, email address, password, location, date and time.
4. The system (10) as claimed in claim 1, wherein the query output module is configured to query the artificial intelligence platform by using the altered versions of one or more prompts in which the one or more sensitive attributes information are replaced with the corresponding sensitive synthetic attributes generated by the context and sensitivity based query paraphrasing module and thereby retaining the corresponding context of the one or more prompts.
5. The system (10) as claimed in claim 1, wherein the context and sensitivity based query paraphrasing module is configured to maintain the one or more sensitive attributes and corresponding synthetic one or more sensitive synthetic attributes into the platform.
6. The system (10) as claimed in claim 1, wherein the one or more prompts and the synthetic prompt with privacy preservation are stored for future references and audit purposes.
7. The system (10) as claimed in claim 1, wherein the one or more prompts comprises at least one or more normal prompt and chain of thought prompts, where the synthetic prompt is created based on synthetic attributes used as in earliest prompt for replacing identified sensitive information, stored for reversal of the response to the original name across the context for user consumption within the user chain of thoughts.
8. The system (10) as claimed in claim 1, wherein the query assessment module identifies the one or more sensitive attributes using the artificial intelligence platform comprising named entity recognition and part of speech tagging.
9. The system (10) as claimed in claim 1, wherein the context and sensitivity based query paraphrasing module uses the sensitive synthetic attributes word generation technique to generate synthetic attributes initially and the query response assessment module uses a mapping table to convert the one or more sensitive attributes to one or more synthetic sensitive attributes information and vice versa.
10. The system (10) as claimed in claim 1, wherein the privacy budget module changes sensitive numerical attributes based on required configuration for privacy preservation of numerical values, like the date, currency, GPS coordinates and other numerical values in sensitive data, using Differential Privacy.
11. The system (10) as claimed in claim 1, wherein the artificial intelligence platform comprises integrating the above system with any downstream large language model, which can be both private and public, to provide privacy preserved prompt engineering services for users while interacting with LLMs.
12. A method (400) to alter queries provided to an artificial intelligence platform for preserving privacy comprising:
characterized in that:
receiving, by a querying input module, one or more prompts from a user; (410)
analyzing, by a query assessment module, the one or more prompts using a plurality of natural language processing techniques to identify one or more sensitive attributes present in the one or more prompts; (420)
annotating, by a query annotation module, one or more sensitive attributes identified by the query assessment module for privacy preservation with synthetic prompt; (430)
generating, by a context and sensitivity based query paraphrasing module, the one or more synthetic attributes based on one or more sensitive attributes identified to prevent leakage of sensitive personal and organizational information; (440)
altering, by a privacy budget module, identified numerical values in sensitive data by performing a mathematical operation; (450)
paraphrasing, by the context and sensitivity based query paraphrasing module, the input prompt by converting the input prompt in case of first person to third person when the subject is exposing sensitive information, by making necessary grammatical and synthetic changes; (460)
receiving, by a query output module, the synthetic privacy preserved query from upstream and sending the query to downstream neural network with attention based language model; (470)
receiving, by a query response origin module, an output generated by the artificial intelligence platform upon receiving the query; (480)
identifying, by a query response assessment module, the corresponding sensitive synthetic attributes from the output; (490)
modifying, by a response privacy and context preservation module, the output by replacing the corresponding sensitive synthetic attributes with the corresponding one or more original sensitive attributes to generate the response to the user; (500) and
rendering, by an output module, the output in a user interface associated with the user. (510)
Dated this 08th day of April, 2024
Signature
Jinsu Abraham
Patent Agent (IN/PA3267)
Agent for the Applicant
| # | Name | Date |
|---|---|---|
| 1 | 202341026661-STATEMENT OF UNDERTAKING (FORM 3) [11-04-2023(online)].pdf | 2023-04-11 |
| 2 | 202341026661-PROVISIONAL SPECIFICATION [11-04-2023(online)].pdf | 2023-04-11 |
| 3 | 202341026661-PROOF OF RIGHT [11-04-2023(online)].pdf | 2023-04-11 |
| 4 | 202341026661-POWER OF AUTHORITY [11-04-2023(online)].pdf | 2023-04-11 |
| 5 | 202341026661-FORM FOR STARTUP [11-04-2023(online)].pdf | 2023-04-11 |
| 6 | 202341026661-FORM FOR SMALL ENTITY(FORM-28) [11-04-2023(online)].pdf | 2023-04-11 |
| 7 | 202341026661-FORM 1 [11-04-2023(online)].pdf | 2023-04-11 |
| 8 | 202341026661-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [11-04-2023(online)].pdf | 2023-04-11 |
| 9 | 202341026661-EVIDENCE FOR REGISTRATION UNDER SSI [11-04-2023(online)].pdf | 2023-04-11 |
| 10 | 202341026661-FORM-26 [24-08-2023(online)].pdf | 2023-08-24 |
| 11 | 202341026661-DRAWING [08-04-2024(online)].pdf | 2024-04-08 |
| 12 | 202341026661-CORRESPONDENCE-OTHERS [08-04-2024(online)].pdf | 2024-04-08 |
| 13 | 202341026661-COMPLETE SPECIFICATION [08-04-2024(online)].pdf | 2024-04-08 |
| 14 | 202341026661-Power of Attorney [15-04-2024(online)].pdf | 2024-04-15 |
| 15 | 202341026661-FORM28 [15-04-2024(online)].pdf | 2024-04-15 |
| 16 | 202341026661-FORM-9 [15-04-2024(online)].pdf | 2024-04-15 |
| 17 | 202341026661-Covering Letter [15-04-2024(online)].pdf | 2024-04-15 |
| 18 | 202341026661-STARTUP [18-04-2024(online)].pdf | 2024-04-18 |
| 19 | 202341026661-FORM28 [18-04-2024(online)].pdf | 2024-04-18 |
| 20 | 202341026661-FORM 18A [18-04-2024(online)].pdf | 2024-04-18 |
| 21 | 202341026661-FER.pdf | 2024-07-11 |
| 22 | 202341026661-FORM 3 [26-07-2024(online)].pdf | 2024-07-26 |
| 23 | 202341026661-FER_SER_REPLY [08-01-2025(online)].pdf | 2025-01-08 |
| 24 | 202341026661-COMPLETE SPECIFICATION [08-01-2025(online)].pdf | 2025-01-08 |
| 25 | 202341026661-US(14)-HearingNotice-(HearingDate-04-09-2025).pdf | 2025-08-08 |
| 26 | 202341026661-FORM-26 [28-08-2025(online)].pdf | 2025-08-28 |
| 27 | 202341026661-Correspondence to notify the Controller [28-08-2025(online)].pdf | 2025-08-28 |
| 28 | 202341026661-US(14)-ExtendedHearingNotice-(HearingDate-10-12-2025)-1100.pdf | 2025-11-13 |

| # | Name |
|---|---|
| 1 | Search026661E_10-06-2024.pdf |