
System And Method For Demarcating Sensitive Regions In The Query Output

Abstract: Disclosed is a system and method for identifying sensitive data regions in response obtained based on execution of a query against a database. The system and method may enable defining rules capable of being applied on a response received from the database. The rules defined by the user may help in identifying vertical regions or portions in the response which might be sensitive. The system and method may enable analyzing the query and determining structure defined within the query. The system and method may further enable defining the sensitivity of the column and the query. The system and method may further enable identifying the sensitive region in the response returned after the execution of query. The identification of the sensitive region may depend on the type of query and the contextual information.


Patent Information

Application #:
Filing Date: 27 November 2013
Publication Number: 31/2015
Publication Type: INA
Invention Field: COMPUTER SCIENCE
Status:
Email: ip@legasis.in
Parent Application:
Patent Number:
Legal Status:
Grant Date: 2022-09-30
Renewal Date:

Applicants

Tata Consultancy Services Limited
Nirmal Building, 9th Floor, Nariman Point, Mumbai 400021, Maharashtra, India

Inventors

1. VIDHANI, Kumar Mansukhlal
Tata Consultancy Services Limited, Tata Research Development & Design Centre, 54 B, Hadapsar Industrial Estate, Hadapsar, Pune - 411013, Maharashtra, India
2. SIRIGIREDDY, Gangadhara Reddy
Tata Consultancy Services Limited, Tata Research Development & Design Centre, 54 B, Hadapsar Industrial Estate, Hadapsar, Pune - 411013, Maharashtra, India
3. NAIR, Vikram
Tata Consultancy Services Limited, Tata Research Development & Design Centre, 54 B, Hadapsar Industrial Estate, Hadapsar, Pune - 411013, Maharashtra, India
4. BANAHATTI, Vijayanand Mahadeo
Tata Consultancy Services Limited, Tata Research Development & Design Centre, 54 B, Hadapsar Industrial Estate, Hadapsar, Pune - 411013, Maharashtra, India
5. LODHA, Sachin P.
Tata Consultancy Services Limited, Tata Research Development & Design Centre, 54 B, Hadapsar Industrial Estate, Hadapsar, Pune - 411013, Maharashtra, India

Specification

FORM 2

THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003

COMPLETE SPECIFICATION

(See Section 10 and Rule 13)

Title of invention:

SYSTEM AND METHOD FOR DEMARCATING SENSITIVE REGIONS IN THE QUERY OUTPUT

APPLICANT:
Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th Floor,
Nariman Point, Mumbai 400021,
Maharashtra, India

The following specification describes the invention and the manner in which it is to be performed.

TECHNICAL FIELD
[001] The present disclosure described herein, in general, relates to a system and method for data privacy and more particularly to the system and method for identifying sensitive data in a response obtained based on execution of a query against a database.
BACKGROUND
[002] In an Information Technology (IT)-enabled platform, database systems are implemented in order to manage data associated with various computing applications. The execution of the computing applications is facilitated through persistence of the data residing in the database systems. In the present scenario, the execution of the computing applications may be implemented through several processing stages, wherein at each stage, a distinct user may be involved. The user involved may access the database systems in order to facilitate the execution of the computing applications using the data residing in the databases. Therefore, the multi-stage processing of the data may involve accessing of the data by a plurality of users. This in turn may affect the security or privacy of the data being managed through the database systems.
[003] In the present scenario, while using the database management systems, the users execute Structured Query Language (SQL) queries of varying complexities against the databases and access the data. As a result of firing these SQL queries against the databases, a response may be received. In certain cases, portion(s) of the data may need to be protected from being accessed by the users, and hence the privacy associated with the portion(s) of the data is required to be preserved. The portion(s) of the data may refer to sensitive information present in the response. Therefore, the portion(s) of the data have to be secured, such that the users are prevented from accessing the said portion(s) of the data received in response to the execution of the SQL queries.
[004] However, considering the amount of data, it is a technical challenge to identify the portion(s) of the data which need to be protected and/or secured from access by the users. Specifically, there is a technical problem of identifying region(s) of the response which may contain the portion(s) of the data whose privacy is to be preserved. The response of SQL queries consists of database rows divided into vertical regions called columns. The cell is the smallest identifiable region in the response and is the intersection of a specific row and a column. This cell is the smallest container for containing any information in the database specification. Considering the cellular structure associated with the databases, there is a further technical challenge in identifying the portion(s) of the data to be protected from unauthorized access in the vertical and the horizontal spaces of the response. Furthermore, there is a technical challenge to identify the portion(s) in specific cells or in any subset of the response.
SUMMARY
[005] Before the present systems and methods are described, it is to be understood that this disclosure is not limited to the particular systems and methodologies described, as there can be multiple possible embodiments which are not expressly illustrated in the present disclosure. It is also to be understood that the terminology used in the description is for the purpose of describing the particular versions or embodiments only, and is not intended to limit the scope of the present application. This summary is provided to introduce concepts related to systems and methods for identifying sensitive data in a response obtained based upon execution of a query on a database, and the concepts are further described below in the detailed description. This summary is not intended to identify essential features of the disclosure nor is it intended for use in determining or limiting the scope of the disclosure.
[006] In one implementation, a system for identifying sensitive data in a response obtained based upon execution of a query on a database is disclosed. In one aspect, the system may comprise a processor and a memory coupled to the processor. The processor may execute a plurality of modules present in the memory. The plurality of modules may further comprise a rule configuration module, a query analyzer module and a demarcation module. The rule configuration module may configure a plurality of rules on data stored in a database. In one embodiment, a rule of the plurality of rules may be configured to indicate presence of sensitive information in one or more datasets of the data. The query analyzer module may receive a query on the database. The query may be received in order to retrieve a response from the database. In one embodiment, the response may comprise a dataset, of the one or more datasets, relevant to the query. The query analyzer module may further process the query based upon at least one rule of the plurality of rules in order to identify a sensitive column in the response. The sensitive column may be identified from columns associated with the dataset. In one aspect, the sensitive column comprises a subset of sensitive information. Further, the query analyzer module may analyze the query based upon the identification of the sensitive column in order to determine contextual information associated with the query. The demarcation module may identify a scenario of a plurality of scenarios based upon the contextual information. Further, the demarcation module may execute an algorithm, of a plurality of algorithms, on the sensitive column based upon the scenario identified in order to identify a sensitive cell in the sensitive column. The sensitive cell comprises sensitive data.
[007] In another implementation, a method for identifying sensitive data in a response obtained based upon execution of a query on a database is disclosed. The method may comprise configuring a plurality of rules on data stored in a database. In one embodiment, a rule of the plurality of rules may be configured to indicate presence of sensitive information in one or more datasets of the data. The method may comprise receiving a query on the database. The query may be received in order to retrieve a response from the database. In one embodiment, the response may comprise a dataset, of the one or more datasets, relevant to the query. The method may comprise processing the query based upon at least one rule of the plurality of rules in order to identify a sensitive column in the response. The sensitive column may be identified from columns associated with the dataset. In one aspect, the sensitive column comprises a subset of sensitive information. Further, the method may comprise analyzing the query based upon the identification of the sensitive column in order to determine contextual information associated with the query. The method may comprise identifying a scenario of a plurality of scenarios based upon the contextual information. Further, the method may comprise executing an algorithm, of a plurality of algorithms, on the sensitive column based upon the scenario identified in order to identify a sensitive cell in the sensitive column. The sensitive cell comprises sensitive data. In this implementation, the aforementioned method may be performed by a processor using a set of instructions stored in a memory.
[008] In yet another implementation, a non-transitory computer readable medium embodying a program executable in a computing device for identifying sensitive data in a response obtained based upon execution of a query on a database is disclosed. The program may comprise a program code for configuring a plurality of rules on data stored in a database. In one embodiment, a rule of the plurality of rules may be configured to indicate presence of sensitive information in one or more datasets of the data. The program may comprise a program code for receiving a query on the database. The query may be received in order to retrieve a response from the database. In one embodiment, the response may comprise a dataset, of the one or more datasets, relevant to the query. The program may comprise a program code for processing the query based upon at least one rule of the plurality of rules in order to identify a sensitive column in the response. The sensitive column may be identified from columns associated with the dataset. In one aspect, the sensitive column comprises a subset of sensitive information. Further, the program may comprise a program code for analyzing the query based upon the identification of the sensitive column in order to determine contextual information associated with the query. The program may comprise a program code for identifying a scenario of a plurality of scenarios based upon the contextual information. Further, the program may comprise a program code for executing an algorithm, of a plurality of algorithms, on the sensitive column based upon the scenario identified in order to identify a sensitive cell in the sensitive column. The sensitive cell comprises sensitive data.
BRIEF DESCRIPTION OF THE DRAWINGS
[009] The foregoing detailed description of embodiments is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the disclosure, there is shown in the present document example constructions of the disclosure; however, the disclosure is not limited to the specific methods and systems disclosed in the document and the drawings.
[0010] The detailed description is described with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the drawings to refer to like features and components.
[0011] Figure 1 illustrates a network implementation of a system for identifying sensitive data in a response obtained based upon execution of a query on a database, in accordance with an embodiment of the present disclosure.
[0012] Figure 2 illustrates the system, in accordance with an embodiment of the present disclosure.
[0013] Figures 3-12 illustrate a method for identifying sensitive data in a response obtained based upon execution of a query on a database, in accordance with an embodiment of the present disclosure.
DETAILED DESCRIPTION
[0014] Some embodiments of this disclosure, illustrating all its features, will now be discussed in detail. The words "comprising," "having," "containing," and "including," and other forms thereof, are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that the singular forms "a," "an," and "the" include plural references unless the context clearly dictates otherwise. Although any systems and methods similar or equivalent to those described herein can be used in the practice or testing of embodiments of the present disclosure, the exemplary systems and methods are now described. The disclosed embodiments are merely exemplary of the disclosure, which may be embodied in various forms.
[0015] Various modifications to the embodiment will be readily apparent to those skilled in the art and the generic principles herein may be applied to other embodiments. However, one of ordinary skill in the art will readily recognize that the present disclosure is not intended to be limited to the embodiments illustrated, but is to be accorded the widest scope consistent with the principles and features described herein.
[0016] The present disclosure describes method(s) and system(s) for identifying sensitive data in a response obtained based upon execution of a query on a database. The method(s) and the system(s) may facilitate defining rules capable of being applied on a response received from the database. The rules defined may help in identifying vertical regions or portions in the response which might be sensitive. The rules are static information applicable to all responses, if the query executed by the user is found to be sensitive. The method(s) and the system(s) may further facilitate analyzing the query and determining structure defined within the query. The method(s) and the system(s) may further facilitate defining sensitivity of the column and the query. The method(s) and the system(s) may facilitate identifying sensitive region in the response returned after the execution of query. The identification of the sensitive region may depend on type of query and the contextual information associated with the query.
[0017] While aspects of described system and method for identifying sensitive data in a response obtained based upon execution of a query on a database may be implemented in any number of different computing systems, environments, and/or configurations, the embodiments are described in the context of the following exemplary system.
[0018] Although the present subject matter is explained considering that the system 102 is implemented on a server, it may be understood that the system 102 may also be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a notebook, a workstation, a mainframe computer, a cloud-based computing environment and the like. It will be understood that the system 102 may be accessed by multiple users through one or more user devices 104-1, 104-2…104-N, collectively referred to as user devices 104 hereinafter, or applications residing on the user devices 104. In one implementation, the system 102 may comprise the cloud-based computing environment in which a user may operate individual computing systems configured to execute remotely located applications. Examples of the user devices 104 may include, but are not limited to, a portable computer, a personal digital assistant, a handheld device, a workstation and the like. The user devices 104 are communicatively coupled to the system 102 through a network 106.
[0019] In one implementation, the network 106 may be a wireless network, a wired network or a combination thereof. The network 106 can be implemented as one of the different types of networks, such as intranet, local area network (LAN), wide area network (WAN), the internet, and the like. The network 106 may either be a dedicated network or a shared network. The shared network represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), and the like, to communicate with one another. Further the network 106 may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, and the like.
[0020] Referring now to Figure 2, the system 102 is illustrated in accordance with an embodiment of the present disclosure. In one embodiment, the system 102 may include a processor 202, an input/output (I/O) interface 204, and a memory 206. The processor 202 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor 202 is configured to fetch and execute computer-readable instructions stored in the memory 206.
[0021] The I/O interface 204 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like. The I/O interface 204 may allow the system 102 to interact with the user directly or through the user devices 104. Further, the I/O interface 204 may enable the system 102 to communicate with other computing devices, such as web servers and external data servers (not shown). The I/O interface 204 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. The I/O interface 204 may include one or more ports for connecting a number of devices to one another or to another server.
[0022] The memory 206 may include any computer-readable medium and computer program product known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. The memory 206 may include modules 208, and data 210.
[0023] The modules 208 include routines, programs, objects, components, data structures, etc., which perform particular tasks or implement particular abstract data types. The modules 208 may comprise a rule configuring module 212, a query analyzer module 214, a demarcation module 216 and other module 218. The other module 218 may include programs or coded instructions that supplement applications and functions of the system 102. The modules 208 described herein may be implemented as software modules that may be executed in the cloud-based computing environment of the system 102.
[0024] The data 210, amongst other things, serves as a repository for storing data processed, received, and generated by one or more of the modules 208. The data 210 may also include a database 220, a rule repository 221 and other data 222. The other data 222 may include data generated as a result of the execution of one or more modules in the other module 218. The detailed working of the system 102 along with the modules and/or components of the system 102 is further described referring to figures 2-12 as below.
RULE CONFIGURING MODULE 212
[0025] In an embodiment, the rule configuring module 212 may configure a plurality of rules on data stored in the database 220. In one aspect, a rule of the plurality of rules is configured to indicate presence of sensitive information in one or more datasets of the data. The one or more datasets are arranged in form of a plurality of tables. Further, each table comprises a row and a column. The intersection of the row and the column forms a cell. The plurality of rules is configured in a configuration file (not shown in figure) stored in the rule repository 221 coupled with the database 220. The rules configured are to be applicable on responses received from the database 220 as a result of execution of queries on the database 220. In one example, the database may be a Structured Query Language (SQL) database. In one embodiment, the rules may be defined by the user using the rule configuring module 212. The rules defined may enable identification of vertical regions in the responses which might be sensitive. The rule configuring module 212 may help the user to configure the rules on sensitive information to capture exception scenarios in regular column masking. Specifically, the rule configuring module 212 may be configured to handle the scenario of non-sensitive information existing at cell level within a sensitive column.
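By way of a non-limiting illustration, a rule may be pictured as a record naming a schema, a table, a column and a value pattern, which is how the rule file is described as an input to the scenarios later in this specification. The Python sketch below is an assumption made for explanation only: the names Rule, RULES, find_rule and is_sensitive_cell, the example schema, table and column names, and the patterns themselves are hypothetical and are not taken from the configuration file of the disclosure.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    """One hypothetical rule: a column that may hold sensitive values."""
    schema: str
    table: str
    column: str
    val_regx: str  # value pattern a sensitive cell is assumed to match in full


# Hypothetical rule set standing in for the configuration file.
RULES: List[Rule] = [
    Rule("sales", "customers", "card_no", r"\d{16}"),
    Rule("hr", "employees", "tax_id", r"[A-Z]{5}\d{4}[A-Z]"),
]


def find_rule(rules: List[Rule], schema: str, table: str, column: str) -> Optional[Rule]:
    """Return the rule configured for the given column, or None if there is none."""
    for rule in rules:
        if (rule.schema, rule.table, rule.column) == (schema, table, column):
            return rule
    return None


def is_sensitive_cell(rule: Rule, value: str) -> bool:
    """A cell is treated as sensitive only if its whole value matches the pattern."""
    return re.fullmatch(rule.val_regx, value) is not None
```

Under these assumptions, a placeholder value such as "N/A" stored in the card_no column does not match the pattern and would be treated as a non-sensitive cell within a sensitive column.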
QUERY ANALYZER MODULE 214
[0026] In one embodiment, the query analyzer module 214 may receive a query on the database 220. In one example, the query received is a structured query language (SQL) query. The query may be received in order to retrieve a response from the database 220. The response may comprise a dataset, of the one or more datasets, relevant to the query. The query analyzer module 214 may process the query based upon at least one rule of the plurality of rules stored in the configuration file in order to identify a sensitive column in the response. The sensitive column is identified from columns associated with the dataset. The query may be processed by creating mapping between each column, table and schema using a column list, a table list and a schema list associated with the query. Further, each column may be defined either as a sensitive column or a non-sensitive column based upon the at least one rule by accessing the at least one rule from the plurality of rules. Furthermore, the query may be designated as a sensitive query based upon the sensitive column. The sensitive column comprises a subset of sensitive information.
[0027] In one embodiment, after the identification of the sensitive column the query analyzer module 214 may further analyze the query in order to determine contextual information associated with the query. The contextual information may comprise type of the query, order of the query, regular expression, primary key range and a SQL clause of the query. Thus the analysis of the query facilitates in determining structure defined within the query and thereby defining the sensitivity of the column and the query. In some embodiments, the query analyzer module 214 may perform the analysis of the query with respect to extracting parts of the query based on the rules defined in order to identify the vertical regions in the response which are sensitive. The analysis of the query may further facilitate linking of various structural information of the database 220 including tables, columns, schemas, and the like based on the rules defined by the rule configuring module 212. The tasks/functionalities performed by the query analyzer module 214 are described referring to figure 3 by way of an example.
[0028] Referring now to Figure 3, a method 300 for the analysis of the query by the query analyzer module 214 is shown, in accordance with an embodiment of the present disclosure.
[0029] As illustrated, at block 302, the query submitted by a user is read by the query analyzer module 214.
[0030] At block 304, SELECTION list may be extracted from the query.
[0031] At block 306, FROM list may be extracted from the query.
[0032] At block 308, COLUMNS list from the selection may be compiled.
[0033] At block 310, TABLES list from the FROM list and the SELECTION list may be compiled.
[0034] At block 312, it is verified whether the user has selected a schema.
[0035] At block 314, SCHEMAS list from the database and from the query may be compiled, when it is determined that the user has not selected the schema based on the verification step at the block 312.
[0036] At block 316, the expressions in the COLUMNS list may be parsed in order to identify columns, when it is determined that the user has selected the schema based on the verification step at the block 312.
[0037] At block 318, mapping between each column, table and schema may be created using COLUMNS list, the TABLES list and the SCHEMAS list.
[0038] At block 320, rules from the configuration file may be read and/or accessed.
[0039] At block 322, for each column in the COLUMNS list the sensitivity may be defined.
[0040] At block 324, the sensitivity of the query may be defined.
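The blocks of method 300 may be pictured with the following minimal Python sketch. It is an assumption-laden illustration rather than the parser of the disclosure: it handles only simple "SELECT ... FROM ..." queries with a regular expression, omits schema resolution (blocks 312-316), and the names SENSITIVE_COLUMNS and analyze_query as well as the example tables are hypothetical.

```python
import re
from typing import Dict, Set, Tuple

# Hypothetical rules: (table, column) pairs configured as sensitive.
SENSITIVE_COLUMNS: Set[Tuple[str, str]] = {("customers", "ssn")}

# Naive pattern for "SELECT <selection> FROM <tables> [WHERE|GROUP BY|ORDER BY ...]".
_QUERY = re.compile(
    r"\s*select\s+(.*?)\s+from\s+(.*?)"
    r"(?:\s+(?:where|group\s+by|order\s+by)\b.*)?\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def analyze_query(sql: str, sensitive: Set[Tuple[str, str]] = SENSITIVE_COLUMNS
                  ) -> Tuple[Dict[str, bool], bool]:
    """Return (sensitivity per selected column, sensitivity of the query)."""
    match = _QUERY.match(sql)
    if match is None:
        raise ValueError("only simple SELECT ... FROM ... queries are handled")
    # Blocks 304-310: SELECTION list, FROM list, COLUMNS list and TABLES list.
    columns = [c.strip() for c in match.group(1).split(",")]
    tables = [t.split()[0].split(".")[-1].lower() for t in match.group(2).split(",")]
    # Blocks 318-322: map each column to its candidate table(s) and apply the rules.
    sensitivity: Dict[str, bool] = {}
    for item in columns:
        parts = item.lower().split(".")
        column = parts[-1]
        candidates = [parts[-2]] if len(parts) > 1 else tables
        sensitivity[item] = any((t, column) in sensitive for t in candidates)
    # Block 324: the query is sensitive if any selected column is sensitive.
    return sensitivity, any(sensitivity.values())
```

For instance, under the hypothetical rule above, a query selecting customers.name and customers.ssn would be designated sensitive because of the ssn column, whereas a query selecting only total from an orders table would not.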
[0041] Further, the analysis of the query, via the query analyzer module 214, may help the demarcation module 216 to select one of multiple approaches/algorithms in order to identify the sensitive regions in the response, as explained in detail below.
DEMARCATION MODULE 216
[0042] In one embodiment, the demarcation module 216 may be configured to identify the sensitive region(s) or sensitive portion(s) in the data based on the analysis of the query and the rules defined and/or configured by the user. The identification of the sensitive region(s) or the sensitive portion(s) may be based on the selection of one of the multiple approaches, hereinafter referred to as algorithms interchangeably, corresponding to different scenarios. In an embodiment, the demarcation module 216 may identify a scenario of a plurality of scenarios based upon the contextual information. The contextual information consists of the order of the queries (in case of the complex queries), regular expression (for sensitive columns), primary key range, SQL clauses (like order by). Further, the contextual information may also comprise the type of the query. Further, the demarcation module 216 may execute an algorithm, of a plurality of algorithms, on the sensitive column based upon the scenario identified in order to identify a sensitive cell(s) in the sensitive column. The sensitive cell comprises sensitive data.
[0043] Thus, the demarcation module 216 may utilize the static information captured during the configuration of the rules/policies by the rule configuring module 212 and dynamic information (sensitivity of a column, sensitivity of the query) which is generated as a result of the query analysis by the query analyzer module 214 in order to identify the sensitive regions. Based on this information (static and dynamic), the demarcation module may execute one or more approaches/algorithms to identify the sensitive regions. The one or more algorithms executed by the demarcation module 216 are further explained, by way of examples, referring to figures 4, 5, 6, 7, and 8.
[0044] In one aspect, it is to be understood that steps (302-322 as shown in Figure 3 and explained above) executed by the query analyzer module 214 for defining the sensitivity of the column constitute demarcation in the vertical space. Further, the demarcation in the horizontal space is identified based on various scenarios as explained below:
[0045] It must be understood that for each of the scenarios explained below, a sensitive query is a query containing sensitive information according to rule file (i.e. Query analysis reveals the query contains sensitive columns). Furthermore, a non-sensitive query is a query which does not contain sensitive information.
Scenario 1.1: Columns having the same name / different name in different tables configured as sensitive (in the configuration file, hereinafter referred to as a rule file, storing the plurality of rules) appearing as part of the query do not share value patterns (RegX) and the query result is not sorted.
In order to explain the above scenario, let us assume/consider:
Sql – the SQL query.
Ti – table i.
Cm – the m-th column in the query.
Ti.C – column C from the table Ti.
Ti.C.ValRegX – value pattern for column C from the table Ti.
Ti.C.ValRegX.Values – all the values that ValRegX can possibly generate.

For any pair (i, j)
Ti.C, Tj.C ∈ Sql.
Ti.C.ValRegX.Values ∩ Tj.C.ValRegX.Values = ∅.

Input: Rule file containing information about schema, table, and column and value pattern for column.
Output:
Start row index
End row index
[0046] Figure 4 illustrates a flow diagram 400 depicting steps followed by the demarcation module 216 for implementing algorithm 1.1 corresponding to the scenario 1.1 as disclosed above.
[0047] As illustrated, at block 402, index of the sensitive column may be retrieved from the analysis of the query analyzer module 214 with respect to defining of the sensitive column.
[0048] At block 404, the valRegX may be retrieved from the rule file.
[0049] At block 406, the valRegX may be matched with a first row.
[0050] At block 408, verification is done whether a match is found as a result of the matching of the valRegX with the first row at the block 406.
[0051] At block 410, the record position in the database is incremented until at least one record is matched with the valRegX.
[0052] At block 412, a row number is marked as a start index based on the matching of the at least one record with the valRegX at the block 406.
[0053] At block 414, the records in the database may be traversed until a non-matching record is found.
[0054] At block 416, a record is marked as an end row index based on the traversing of the records at the block 414.
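The flow of Figure 4 may be sketched in Python as a linear scan over the sensitive column. This is a minimal sketch under stated assumptions: the response is held in memory as a list of rows, a cell is taken to match only when its whole value matches the valRegX, the end row index is taken to be the last matching row (inclusive), and the name demarcate_linear and the sample data are hypothetical.

```python
import re
from typing import Optional, Sequence, Tuple


def demarcate_linear(rows: Sequence[Sequence[object]], col_index: int,
                     val_regx: str) -> Optional[Tuple[int, int]]:
    """Return (start row index, end row index) of the first run of matching rows.

    Returns None when no row in the sensitive column matches the pattern.
    """
    pattern = re.compile(val_regx)

    def matches(i: int) -> bool:
        return pattern.fullmatch(str(rows[i][col_index])) is not None

    n = len(rows)
    i = 0
    # Blocks 406-410: advance until a record matches the valRegX.
    while i < n and not matches(i):
        i += 1
    if i == n:
        return None
    start = i  # Block 412: mark the start index.
    # Block 414: traverse until a non-matching record is found.
    while i + 1 < n and matches(i + 1):
        i += 1
    return start, i  # Block 416: mark the end row index.
```

With rows such as ("a", "x1"), ("b", "1234"), ("c", "5678"), ("d", "zz") and the pattern \d{4} applied to the second column, the sketch would report rows 1 through 2 as the sensitive region.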
Scenario 1.2: Columns having same name / different name in different tables configured as sensitive (in the rule file) appearing as part of the query do not share value patterns (RegX) and query result is not sorted. The scenario 1.2 describes demarcation of specific cells in the database 220 in accordance with an embodiment of the present disclosure.
In order to explain the above scenario, let us assume/consider:

For any pair (i, j)
Ti.C, Tj.C ∈ Sql.
Where,
Ti ≠ Tj && Ti.C.ValRegX ≠ Tj.C.ValRegX
n = Total number of rows in the result set

In this embodiment assume that the contributing tables T1, T2, T3…Tj have their contributing rows in the same order as their definition.

Input: RegX for all Columns in Tj. Rule file containing information about schema, table, column, RegX to match and mask.

Output: Start row index, End row index
[0055] Figure 5 illustrates a flow diagram 500 depicting steps followed by the demarcation module 216 for implementing algorithm 1.1 corresponding to the scenario 1.2 as disclosed above.
[0056] As illustrated, at block 502, the column index may be retrieved from the analysis of the query analyzer module 214 with respect to defining of the sensitive column.
[0057] At block 504, the valRegX may be matched with the first record.
[0058] At block 506, it may be determined whether or not a match is found based on the matching step at the block 504.
[0059] At block 508, when the match is found at the block 506, the method 600 may be implemented as illustrated in figure 6.
[0060] At block 510, the valRegX may be matched with the last record, when it is determined that the match is not found at the block 506.
[0061] At block 512, verification is done based on the matching at the block 510, whether or not the match is found.
[0062] At block 514, when the match is found at the block 512, the method 700 may be implemented as illustrated in figure 7.
[0063] At block 516, the valRegX may be matched with the middle record, when it is determined that the match is not found at the block 512.
[0064] At block 518, verification is done based on the matching at the block 516, whether or not the match is found.
[0065] At block 520, when the match is found at the block 518, the method 800 may be implemented as illustrated in figure 8.
[0066] At block 522, when the match is not found at the block 518, the method 900 may be implemented as illustrated in figure 9.
[0067] Now referring to figure 6, the method 600 is described. As illustrated, at block 602, the ls may be marked as the start index.
[0068] At block 604, the leprev is assigned the value of the le, and the le is determined based on the ls and the le. Specifically, le = (ls + le) / 2. The ls, the le and the leprev indicate the start position, the end position and the previous value of the end position respectively.
[0069] At block 606, it is checked whether the difference of the le and the ls is either 1 or 0.
[0070] At block 608, the le-th record may be matched with the valRegX, when it is determined that the difference of the le and the ls is neither 1 nor 0.
[0071] At block 610, it is verified whether a match is found based on the matching at the block 608.
[0072] At block 612, the le is assigned the value of the leprev, and the ls is assigned the value of the le, when it is determined that a match is found at the block 610.
[0073] At block 614, it is verified whether the le-th record matches the valRegX, when it is determined at the block 606 that the difference of the le and the ls is 1 or 0.
[0074] At block 616, the le may be marked as the end index, when it is determined that the le-th record is matched at the block 614.
[0075] At block 618, the ls may be marked as the end index, when it is determined that the le-th record is not matched at the block 614.
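As a hedged illustration of the halving search of the method 600, the following sketch locates the end index when the first record is known to match. It assumes that the matching records form one contiguous run beginning at ls, and it uses a conventional binary search rather than reproducing the exact block-by-block flow of figure 6. The function name find_end_index is hypothetical.

```python
import re

def find_end_index(values, val_regx, ls=0, le=None):
    """Return the last index of the matching run that starts at ls.
    Assumption: values[ls] matches and matching rows are contiguous."""
    pattern = re.compile(val_regx)
    if le is None:
        le = len(values) - 1
    lo, hi = ls, le                  # invariant: values[lo] matches
    while lo < hi:
        mid = (lo + hi + 1) // 2     # upper midpoint so the range shrinks
        if pattern.fullmatch(str(values[mid])):
            lo = mid                 # run reaches at least mid
        else:
            hi = mid - 1             # run ends before mid
    return lo
```

Under the stated assumption, the search inspects on the order of log2(n) records rather than every record in the result set.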
[0076] Now referring to figure 7, the method 700 is described. As illustrated, at block 702, the le may be marked as the end index.
[0077] At block 704, the lsprev is assigned the value of the ls, and the ls may be determined based on the ls and the le. Specifically, ls = (ls + le) / 2. The ls, the le and the lsprev indicate the start position, the end position and the previous value of the start position respectively.
[0078] At block 706, it is checked whether the difference of the le and the ls is either 1 or 0.
[0079] At block 708, the ls-th record may be matched with the valRegX, when it is determined that the difference of the le and the ls is neither 1 nor 0.
[0080] At block 710, it is verified whether a match is found based on the matching at the block 708.
[0081] At block 712, the le is assigned the value of the ls, and the ls is assigned the value of the lsprev, when it is determined that a match is found at the block 710.
[0082] At block 714, it is verified whether the ls-th record matches the valRegX, when it is determined at the block 706 that the difference of the le and the ls is 1 or 0.
[0083] At block 716, the ls may be marked as the start index, when it is determined that the ls-th record is matched with the valRegX at the block 714.
[0084] At block 718, the le may be marked as the start index, when it is determined that the ls-th record is not matched with the valRegX at the block 714.
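The mirror case of the method 700, in which the last record is known to match and the start index is sought, may be sketched in the same way. The sketch assumes one contiguous matching run ending at le and uses a conventional binary search, not the exact flow of figure 7. The function name find_start_index is hypothetical.

```python
import re

def find_start_index(values, val_regx, ls=0, le=None):
    """Return the first index of the matching run that ends at le.
    Assumption: values[le] matches and matching rows are contiguous."""
    pattern = re.compile(val_regx)
    if le is None:
        le = len(values) - 1
    lo, hi = ls, le                  # invariant: values[hi] matches
    while lo < hi:
        mid = (lo + hi) // 2         # lower midpoint so the range shrinks
        if pattern.fullmatch(str(values[mid])):
            hi = mid                 # run starts at or before mid
        else:
            lo = mid + 1             # run starts after mid
    return hi
```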
[0085] Referring to figure 8, the method 800 is described. At block 802, the middle index may be marked as temporary end index, and middle+1 index may be marked as temporary start index.
[0086] At block 804, the method 600 may be implemented in order to identify the end index.
[0087] At block 806, the method 700 may be implemented in order to identify the start index.
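When the middle record matches, as in the method 800, both boundaries are unknown and the two halving searches may be run outward from the middle. The following is a minimal sketch under the assumption that the sensitive rows form a single contiguous run containing the middle row. The function name demarcate_from_middle is hypothetical.

```python
import re

def demarcate_from_middle(values, val_regx, middle=None):
    """Return (start, end) of the contiguous matching run containing middle."""
    pattern = re.compile(val_regx)

    def matches(i):
        return pattern.fullmatch(str(values[i])) is not None

    if middle is None:
        middle = (len(values) - 1) // 2
    if not matches(middle):
        raise ValueError("middle row does not match the pattern")

    # Search the left half for the first matching row (start index).
    lo, hi = 0, middle
    while lo < hi:
        mid = (lo + hi) // 2
        if matches(mid):
            hi = mid
        else:
            lo = mid + 1
    start = hi

    # Search the right half for the last matching row (end index).
    lo, hi = middle, len(values) - 1
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if matches(mid):
            lo = mid
        else:
            hi = mid - 1
    return start, lo
```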
[0088] Referring to figure 9, the method 900 is described. At block 902, the column having the valRegX matched with the middle row may be retrieved.
[0089] At block 904, the order of query matching the column may be retrieved.
[0090] At block 906, it may be verified whether the order of the query is less than the order of the sensitive query.
[0091] At block 908, the ls may be determined based on the le, when it is determined that the order of the query is less than the order of the sensitive query at the block 906. Specifically, herein, ls=le/2.
[0092] At block 910, the le may be determined based on the le, when it is determined that the order of the query is not less than the order of the sensitive query at the block 906. Specifically, herein, le=le/2.
[0093] At block 912, the middle row may be retrieved and matched with the valRegX of the sensitive column.
[0094] At block 914, it may be verified whether the middle row is matched with the valRegX of the sensitive column based on the matching step at the block 912.
[0095] At block 916, the sensitive row may be marked as the middle row, when the match is found at the block 914. Else, the method 900 may proceed with the step at the block 902 again. Further, the method may proceed with the implementation of the method 800 for the identification of the end index and the start index.
[0096] In an embodiment, the middle row is defined as (ls + le) / 2. Further, the order of the query is the sequence in which it appears. The given query sql may consist of the other queries q1, q2 ...qn.
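The idea behind the method 900, namely using the order of the sub-query whose pattern matches the middle row to decide which half to search next, may be illustrated as follows. This is a sketch under stated assumptions: the rows appear in the same order as the sub-queries, the patterns of the sub-queries do not overlap, and the names locate_sensitive_row, patterns and sensitive_order are hypothetical.

```python
import re

def locate_sensitive_row(values, patterns, sensitive_order):
    """Return the index of some row produced by the sensitive sub-query,
    or None if there is none. patterns[k] is the value pattern of the
    k-th sub-query in order of appearance (assumed non-overlapping)."""
    compiled = [re.compile(p) for p in patterns]

    def order_of(value):
        for k, pattern in enumerate(compiled):
            if pattern.fullmatch(str(value)):
                return k
        raise ValueError(f"no pattern matches {value!r}")

    ls, le = 0, len(values) - 1
    while ls <= le:
        middle = (ls + le) // 2
        order = order_of(values[middle])
        if order == sensitive_order:
            return middle            # sensitive row found
        if order < sensitive_order:
            ls = middle + 1          # sensitive rows lie further down
        else:
            le = middle - 1          # sensitive rows lie further up
    return None
```

Once a sensitive row is located, the start and end indices may be found by searching outward from that row, as described for the method 800.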
Scenario 2: Tables with their primary keys defined in different ranges but RegX for the sensitive columns may be equal or subset of each other and result is not sorted. The scenario 2 describes demarcation through primary key in accordance with an embodiment of the present disclosure. In order to explain the above scenario, let us assume/consider:
Sql: Sql query.
T: Table.
Ti: ith Table.
Ti.PriKey: A set of Primary Key values of Table Ti.
Ti.C: Column C of the Table Ti.
Ti.C.ValRegX.Values: All the values that ValRegX can possibly generate.

For any pair (i, j)
Ti, Tj ∈ Sql and
Ti.PriKey ∩ Tj.PriKey = Ø and

There exists (i, j) such that
Ti.C.ValRegX = Tj.C.ValRegX or Ti.C.ValRegX.Values ⊂ Tj.C.ValRegX.Values or Tj.C.ValRegX.Values ⊂ Ti.C.ValRegX.Values or Ti.C.ValRegX.Values ∩ Tj.C.ValRegX.Values ≠ Ø

[0097] Figure 10 illustrates a flow diagram 1000 depicting steps followed by the demarcation module 216 for implementing algorithm 2 corresponding to the scenario 2 as disclosed above.
[0098] As illustrated, at block 1002, the index of the sensitive column may be retrieved from the analysis of the query analyzer module 214 with respect to defining of the sensitive column.
[0099] At block 1004, the range of primary key values for the table containing the sensitive column may be obtained from the rule file.
[00100] At block 1006, the first primary key value may be checked against the range.
[00101] At block 1008, it may be verified, whether the first primary key is within the range of the primary key values.
[00102] At block 1010, the next record is checked for the matching, when it is determined, at the block 1008 that the first primary key is not in the range.
[00103] At block 1012, the row may be marked as the start index, when it is determined, at the block 1008 that the first primary key is in the range.
[00104] At block 1014, the next record is checked for the matching after the marking of the row as the start index at the block 1012.
[00105] At block 1016, it may be verified, whether the next primary key is within the range of the primary key values.
[00106] At block 1018, the next record is checked for the matching, when it is determined, at the block 1016 that the next primary key is in the range. Specifically, if the primary key is in the range, the step corresponding to the block 1016 is re-performed by moving to the next record.
[00107] At block 1020, one record is moved up and marked as the end index, when it is determined, at the block 1016 that the next primary key is not in the range.
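The primary key based demarcation of blocks 1002 to 1020 may be sketched as a scan that tests each record's primary key against the configured range. The sketch assumes that each result row carries its primary key at a known position and that the range is inclusive. The names demarcate_by_primary_key, pk_index, pk_min and pk_max are hypothetical.

```python
def demarcate_by_primary_key(rows, pk_index, pk_min, pk_max):
    """Return (start, end) of the first contiguous run of rows whose
    primary key lies in [pk_min, pk_max], or None if no row qualifies."""
    start = None
    for i, row in enumerate(rows):
        in_range = pk_min <= row[pk_index] <= pk_max
        if in_range and start is None:
            start = i              # first key in range: start index
        elif not in_range and start is not None:
            return start, i - 1    # move one record up: end index
    if start is None:
        return None
    return start, len(rows) - 1    # run extends to the last record
```

For example, if the rule file assigns the range 100 to 199 to the sensitive table, rows with the keys 5001, 5002, 101, 102 and 9001 yield a start index of 2 and an end index of 3.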
Scenario 3: Range of the values of Primary keys of the tables may intersect and RegX for the sensitive columns may be equal or subset of each other and result is not sorted. In order to explain the above scenario, let us assume/consider:
Sql: Sql query.
T: Table.
Ti: ith Table.
Ti.PriKey: A set of Primary Key values of Table Ti.
Ti.C: Column C of the Table Ti.
Ti.C.ValRegX.Values: All the values that ValRegX can possibly generate.

There exists (i, j) such that
Ti, Tj ∈ Sql and
Ti.PriKey ∩ Tj.PriKey ≠ Ø and

Ti.C.ValRegX = Tj.C.ValRegX or Ti.C.ValRegX.Values ⊂ Tj.C.ValRegX.Values or Tj.C.ValRegX.Values ⊂ Ti.C.ValRegX.Values or Ti.C.ValRegX.Values ∩ Tj.C.ValRegX.Values ≠ Ø

[00108] Figure 11 illustrates a flow diagram 1100 depicting steps followed by the demarcation module 216 for implementing algorithm 3 corresponding to the scenario 3 as disclosed above.
[00109] As illustrated, at block 1102, the index of the sensitive column may be retrieved from the analysis of the query analyzer module 214 with respect to defining of the sensitive column.
[00110] At block 1104, the tables and linked conditions of the tables may be obtained from the query analyzer module 214.
[00111] At block 1106, the select query may be executed to fetch records counts for the query selected.
[00112] At block 1108, the start index may be calculated based on the records count for the non-sensitive query written in order.
[00113] At block 1110, the end index may be calculated based on the records count of the sensitive query.
[00114] In the above scenario described, Start row index = Σ count_records_nonsensitive_query (summed over each non-sensitive query appearing before the sensitive query), and End row index = Start row index + count_records_sensitive_query.
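The arithmetic of paragraph [00114] may be illustrated with a short sketch. It assumes that the record count of each sub-query has already been fetched, for example by the select query of block 1106, and that the counts are listed in the order in which the sub-queries appear. The names demarcate_by_counts, record_counts and sensitive_position are hypothetical.

```python
def demarcate_by_counts(record_counts, sensitive_position):
    """record_counts[k] is the number of rows returned by the k-th sub-query.
    Returns (start, end) per paragraph [00114]; with zero-based row
    numbering, end is one past the last sensitive row."""
    # Sum the counts of every sub-query appearing before the sensitive one.
    start = sum(record_counts[:sensitive_position])
    end = start + record_counts[sensitive_position]
    return start, end
```

For example, with record counts of 3, 4 and 2 and the sensitive query in the second position, the start row index is 3 and the end row index is 3 + 4 = 7.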
Scenario 4: Result of the query is sorted/ ordered. RegX of two columns do not differ and Primary Keys may intersect.
[00115] When the result of the query is sorted or ordered, the sensitive information gets scattered across the result. In such cases, identifying the sensitive information requires additional query processing. The approach/algorithm 1 described for the Scenario 1 may be used if the RegX values differ. However, if the RegX of the two columns do not differ and the primary key spaces also intersect, none of the approaches/algorithms mentioned above will be useful. In such a scenario, a combination of response and query modification may be required. This approach is used only if the result of the query is sorted/ordered; otherwise, one of the above-mentioned approaches can demarcate the sensitive region. In order to explain the above scenario, let us assume/consider:
Sql: Sql query.
T: Table.
Ti: ith Table.
Ti.PriKey: A set of Primary Key values of Table Ti.
Ti.C: Column C of the Table Ti.
Ti.C.ValRegX.Values: All the values that ValRegX can possibly generate.

Q1, Q2…Qn ∈ Sql, where Q1, Q2… Qn are the queries appearing as part of the complete Sql query.
There exists (i, j) such that
Ti, Tj ∈ Sql and
Ti.PriKey ∩ Tj.PriKey ≠ Ø and
Ti.C.ValRegX = Tj.C.ValRegX or Ti.C.ValRegX.Values ⊂ Tj.C.ValRegX.Values or Tj.C.ValRegX.Values ⊂ Ti.C.ValRegX.Values
[00116] In one embodiment, the query modification approach as per the present disclosure may modify the query in order to identify the sensitive region in the result, as opposed to masking the sensitive information itself. In addition, the query modification approach does not alter the existing list of columns to be retrieved. Instead, the present disclosure enables adding a new column to the result while the remainder of the query stays the same, so as to preserve the integrity of the overall approach. More particularly, the present disclosure enables analyzing only the response.
[00117] Figure 12 illustrates a flow diagram 1200 depicting steps followed by the demarcation module 216 for implementing algorithm 4 corresponding to the scenario 4 as disclosed above.
[00118] At block 1202, the sensitive queries from the complete SQL query may be retrieved.
[00119] At block 1204, for each sensitive query, a column holding a unique string constant may be added to the list of columns.
[00120] At block 1206, the query may be executed in order to retrieve the result.
[00121] At block 1208, the index of the sensitive column may be retrieved from the analysis of the query analyzer module 214 with respect to defining of the sensitive column.
[00122] At block 1210, for each record, the unique string constant may be utilized in order to identify the sensitive record.
[00123] At block 1212, the duplicate records from the result may be removed.
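The query modification of blocks 1202 to 1212 may be illustrated with an in-memory SQLite database. The tables, the column names, the marker value and the function name below are hypothetical, and treating a duplicated value as sensitive when any of its copies carries the marker is an assumption of this sketch rather than a rule stated in the disclosure.

```python
import sqlite3

MARKER = "__SENSITIVE_7f3a__"  # hypothetical unique string constant

def demarcate_sorted_union():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
        INSERT INTO customers VALUES (1, 'Dana'), (2, 'Alex');
        INSERT INTO employees VALUES (1, 'Chris'), (2, 'Alex');
    """)
    # Original query (customers.name is the sensitive column):
    #   SELECT name FROM customers UNION SELECT name FROM employees
    #   ORDER BY name
    # Modified query: only a marker column is appended to each sub-query.
    modified_sql = """
        SELECT name, ? AS marker FROM customers
        UNION
        SELECT name, '' AS marker FROM employees
        ORDER BY name
    """
    rows = conn.execute(modified_sql, (MARKER,)).fetchall()
    conn.close()

    # The marker makes otherwise identical rows distinct, so duplicates
    # that the original UNION would have merged are removed here.
    sensitive_by_name = {}
    for name, marker in rows:
        is_sensitive = marker == MARKER
        sensitive_by_name[name] = sensitive_by_name.get(name, False) or is_sensitive

    result = list(sensitive_by_name)  # dicts keep first-seen (sorted) order
    sensitive_rows = [i for i, name in enumerate(result)
                      if sensitive_by_name[name]]
    return result, sensitive_rows
```

In this sketch the sorted result is Alex, Chris, Dana, and the rows at the indices 0 and 2 are identified as sensitive even though sorting has scattered them.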
[00124] Exemplary embodiments discussed above may provide certain advantages. Though not required to practice aspects of the disclosure, these advantages may include those provided by the following features.
[00125] Some embodiments enable a system and a method for identifying sensitive information in a vertical region of the response.
[00126] Some embodiments enable a system and a method for identifying sensitive information in a horizontal region of the response.
[00127] Some embodiments enable a system and a method for identifying sensitive information in a specific cell of the response.
[00128] Some embodiments enable a system and a method for identifying sensitive information in any sub-set of the response.
[00129] Although implementations for methods and systems for identifying sensitive data in a response obtained based upon execution of a query on a database have been described in language specific to structural features and/or methods, it is to be understood that the implementations and/or embodiments are not necessarily limited to the specific features or methods described. Rather, the specific features and methods are disclosed as examples of implementations for identifying sensitive data in a response obtained based upon execution of a query on a database.

CLAIMS
WE CLAIM:

1. A method for identifying sensitive data in a response obtained based upon execution of a query on a database, the method comprising:
configuring a plurality of rules on data stored in a database, wherein a rule of the plurality of rules is configured to indicate presence of sensitive information in one or more datasets of the data;
receiving a query on the database, wherein the query is received in order to retrieve a response from the database, wherein the response comprises a dataset, of the one or more datasets, relevant to the query;
processing the query based upon at least one rule of the plurality of rules in order to identify a sensitive column in the response, wherein the sensitive column is identified from columns associated with the dataset, and wherein the sensitive column comprises a subset of sensitive information;
analyzing the query based upon the identification of the sensitive column in order to determine contextual information associated with the query;
identifying a scenario of a plurality of scenarios based upon the contextual information; and
executing an algorithm, of a plurality of algorithms, on the sensitive column based upon the scenario identified in order to identify a sensitive cell in the sensitive column, wherein the sensitive cell comprises sensitive data, and wherein the configuring, the receiving, the processing, the analyzing, the identifying and the executing are performed by a processor using a set of instructions stored in the memory.
2. The method of claim 1, wherein the one or more datasets are arranged in form of a plurality of tables, wherein each table comprises a row and a column, and wherein intersection of the row and the column forms a cell.
3. The method of claim 1, wherein the plurality of rules are configured in a configuration file stored in a rule repository coupled with the database.
4. The method of claim 1, wherein the query is a structured query language (SQL) query, and wherein the database is a SQL database.
5. The method of claim 1, wherein the query is processed by
creating mapping between each column, table and schema using a column list, a table list and a schema list associated with the query,
accessing the at least one rule from the plurality of rules,
defining each column either as a sensitive column or a non-sensitive column based upon the at least one rule, and
designating the query as a sensitive query based upon the defining of the sensitive column.
6. The method of claim 1, wherein the contextual information comprises, type of the query, order of the query, regular expression, primary key range and a SQL clause.
7. A system for identifying sensitive data in a response obtained based upon execution of a query on a database, the system comprising:
a processor; and
a memory coupled to the processor, wherein the processor executes a plurality of modules stored in the memory, and wherein the plurality of modules comprising:
a rule configuration module to configure a plurality of rules on data stored in a database, wherein a rule of the plurality of rules is configured to indicate presence of sensitive information in one or more datasets of the data;
a query analyzer module to
receive a query on the database, wherein the query is received in order to retrieve a response from the database, wherein the response comprises a dataset, of the one or more datasets, relevant to the query;
process the query based upon at least one rule of the plurality of rules in order to identify a sensitive column in the response, wherein the sensitive column is identified from columns associated with the dataset, and wherein the sensitive column comprises a subset of sensitive information; and
analyze the query based upon the identification of the sensitive column in order to determine contextual information associated with the query; and
a demarcation module to
identify a scenario of a plurality of scenarios based upon the contextual information; and
execute an algorithm, of a plurality of algorithms, on the sensitive column based upon the scenario identified in order to identify a sensitive cell in the sensitive column, wherein the sensitive cell comprises sensitive data.
8. The system of claim 7, wherein the plurality of rules are configured in a configuration file stored in a rule repository coupled with the database.

9. The system of claim 7, wherein the query analyzer module processes the query by
creating mapping between each column, table and schema using a column list, a table list and a schema list associated with the query,
accessing the at least one rule from the plurality of rules,
defining each column either as a sensitive column or a non-sensitive column based upon the at least one rule, and
designating the query as a sensitive query based upon the defining of the sensitive column.
10. A non-transitory computer readable medium embodying a program executable in a computing device for identifying sensitive data in a response obtained based upon execution of a query on a database, the program comprising:
a program code for configuring a plurality of rules on data stored in a database, wherein a rule of the plurality of rules is configured to indicate presence of sensitive information in one or more datasets of the data;
a program code for receiving a query on the database, wherein the query is received in order to retrieve a response from the database, wherein the response comprises a dataset, of the one or more datasets, relevant to the query;
a program code for processing the query based upon at least one rule of the plurality of rules in order to identify a sensitive column in the response, wherein the sensitive column is identified from columns associated with the dataset, and wherein the sensitive column comprises a subset of sensitive information;
a program code for analyzing the query based upon the identification of the sensitive column in order to determine contextual information associated with the query;
a program code for identifying a scenario of a plurality of scenarios based upon the contextual information; and
a program code for executing an algorithm, of a plurality of algorithms, on the sensitive column based upon the scenario identified in order to identify a sensitive cell in the sensitive column, wherein the sensitive cell comprises sensitive data.

Documents

Application Documents

# Name Date
1 3722-MUM-2013-FORM 1(16-12-2013).pdf 2013-12-16
2 3722-MUM-2013-CORRESPONDENCE(16-12-2013).pdf 2013-12-16
3 Form-2(Online).pdf 2018-08-11
4 Form 2.pdf 2018-08-11
5 Figure of Abstract.jpg 2018-08-11
6 3722-MUM-2013-FORM 26(6-3-2014).pdf 2018-08-11
7 3722-MUM-2013-CORRESPONDENCE(6-3-2014).pdf 2018-08-11
8 3722-MUM-2013-FER.pdf 2019-11-18
9 3722-MUM-2013-OTHERS [18-05-2020(online)].pdf 2020-05-18
10 3722-MUM-2013-CLAIMS [18-05-2020(online)].pdf 2020-05-18
11 3722-MUM-2013-COMPLETE SPECIFICATION [18-05-2020(online)].pdf 2020-05-18
12 3722-MUM-2013-FER_SER_REPLY [18-05-2020(online)].pdf 2020-05-18
13 3722-MUM-2013-US(14)-HearingNotice-(HearingDate-09-02-2022).pdf 2022-01-17
14 3722-MUM-2013-FORM-26 [28-01-2022(online)].pdf 2022-01-28
15 3722-MUM-2013-Correspondence to notify the Controller [28-01-2022(online)].pdf 2022-01-28
16 3722-MUM-2013-Written submissions and relevant documents [18-02-2022(online)].pdf 2022-02-18
17 3722-MUM-2013-RELEVANT DOCUMENTS [18-02-2022(online)].pdf 2022-02-18
18 3722-MUM-2013-PETITION UNDER RULE 137 [18-02-2022(online)].pdf 2022-02-18
19 3722-MUM-2013-Annexure [18-02-2022(online)].pdf 2022-02-18
20 3722-MUM-2013-PatentCertificate30-09-2022.pdf 2022-09-30
21 3722-MUM-2013-IntimationOfGrant30-09-2022.pdf 2022-09-30

Search Strategy

1 search_13-11-2019.pdf

ERegister / Renewals

3rd: 25 Nov 2022

From 27/11/2015 - To 27/11/2016

4th: 25 Nov 2022

From 27/11/2016 - To 27/11/2017

5th: 25 Nov 2022

From 27/11/2017 - To 27/11/2018

6th: 25 Nov 2022

From 27/11/2018 - To 27/11/2019

7th: 25 Nov 2022

From 27/11/2019 - To 27/11/2020

8th: 25 Nov 2022

From 27/11/2020 - To 27/11/2021

9th: 25 Nov 2022

From 27/11/2021 - To 27/11/2022

10th: 25 Nov 2022

From 27/11/2022 - To 27/11/2023

11th: 27 Nov 2023

From 27/11/2023 - To 27/11/2024

12th: 27 Nov 2024

From 27/11/2024 - To 27/11/2025