Abstract: The present disclosure relates to a method and system for generating and transforming test data. In one embodiment, a user query is received in natural language and parsed, using lemmatization, to generate keywords. Based on the generated keywords and the filter conditions in the user query, a data source specific executable query is generated for each data source and executed against that data source to generate test data. The method determines whether any test data is missing from the generated test data and creates the missing test data based on the data type and the number of records required. The method also automatically transforms the generated test data into test data that meets the requirements of a target system. Thus, the system generates test data specific to different data sources from a query provided in natural language and transforms the generated test data to comply with the requirements of the target system. Figure 3
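The flow summarized in the abstract (natural language query, keywords, metadata mapping, executable query, test data) can be illustrated with a minimal Python sketch. This is not the disclosed implementation. It assumes a single SQLite table, a hand-written metadata dictionary (`METADATA`), a crude plural-stripping function (`naive_lemma`) standing in for real lemmatization, and a tiny filter grammar (`above`, `below`, `equals`). All of these names and the sample data are hypothetical.

```python
import re
import sqlite3

# Hypothetical domain metadata: context name -> (table, column or None for a table name).
METADATA = {
    "customer": ("customers", None),
    "name": ("customers", "name"),
    "city": ("customers", "city"),
    "age": ("customers", "age"),
}

# Hypothetical mapping from filter words to SQL comparison operators.
OPERATORS = {"above": ">", "below": "<", "equals": "="}


def naive_lemma(word):
    """Crude stand-in for lemmatization: lowercase and strip simple plurals."""
    word = word.lower()
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def build_query(text):
    """Map a natural language request to a parameterized SQL string and its parameters."""
    tokens = [naive_lemma(t) for t in re.findall(r"[A-Za-z]+", text)]

    # Detect one filter condition of the form "<column> above|below|equals <number>".
    where, params, filter_key = "", [], None
    match = re.search(r"(\w+)\s+(above|below|equals)\s+(\d+)", text.lower())
    if match:
        key = naive_lemma(match.group(1))
        if key in METADATA and METADATA[key][1]:
            filter_key = key
            where = f" WHERE {METADATA[key][1]} {OPERATORS[match.group(2)]} ?"
            params.append(int(match.group(3)))

    # Map the remaining keywords to tables and columns through the metadata.
    columns, tables = [], []
    for tok in tokens:
        if tok not in METADATA:
            continue
        table, column = METADATA[tok]
        if table not in tables:
            tables.append(table)
        if column and tok != filter_key and column not in columns:
            columns.append(column)

    if not tables:
        raise ValueError("no known table or column names found in the query")
    # This sketch handles a single table only; joins are out of scope.
    sql = f"SELECT {', '.join(columns) or '*'} FROM {tables[0]}{where}"
    return sql, params


def run_demo():
    """Execute the generated query against a small in-memory data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, city TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [("Asha", "Pune", 34), ("Ravi", "Chennai", 27), ("Meera", "Delhi", 45)],
    )
    sql, params = build_query("show names and cities of customers with age above 30")
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return sql, rows
```

Column and table names are taken only from the metadata dictionary, while the filter value is passed as a bound parameter, so user text is never interpolated into the SQL string. A real system would presumably replace `naive_lemma` with a proper lemmatizer and emit a different query dialect per data source.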
Claims: We Claim:
1. A method for generating test data, said method comprising:
receiving, by a processor (110) of a test data generation and transformation system (102), a test data query (208), in natural language, comprising at least a selection parameter for generation of test data (214) from one or more data sources (106) coupled to the test data generation and transformation system (102);
parsing, by the processor (110), the received test data query (208) to extract a plurality of domain specific context names (210);
mapping, by the processor, the plurality of domain specific context names (210) with predetermined domain specific meta-data (122) to determine one or more columns, tables, and filter conditions associated with the received test data query (208), the domain specific meta-data (122) is associated with each column and table of one or more data sources (106);
determining, by the processor (110), a data source specific executable query (212) comprising one or more columns, tables, and filter conditions associated with the received test data query (208), based on at least the selection parameter and the mapped plurality of domain specific context names (210); and
executing, by the processor, the data source specific executable query (212) in the one or more data sources (106) to generate the test data (214).
2. The method as claimed in claim 1, wherein the domain specific meta-data (122) is determined by one or more steps of:
extracting one or more tables, one or more columns of each table, and data type of each column of each table;
filtering one or more columns, tables from the one or more extracted tables, columns, and data types;
lemmatizing each of the filtered relevant column and table to determine one or more alternative names corresponding to each column and table; and
generating domain context specific names for each column and table based on one or more alternative names determined.
3. The method as claimed in claim 1, wherein the one or more data sources (106) comprise relational databases, NoSQL databases, and file sources.
4. The method as claimed in claim 3, wherein each of the relational and NoSQL databases is configured based on one or more parameters including data source server IP/Port address, data source authentication information, and data source schema information, further wherein the file sources are configured based on parameters including data source file path name, and data source authentication information.
5. The method as claimed in claim 1, wherein parsing of the test data query (208) comprises:
identifying one or more historical queries related to the received test data query (208), based on one or more cognitive learning techniques; and
determining domain specific meta-data (122) corresponding to the one or more identified historical queries for generating the generic data structure.
6. The method as claimed in claim 1, further comprising:
identifying missing test data in the generated test data, based on deviation in the count of records, parameters, filter conditions, group values, accuracy of data values associated with the test data (214) compared with the count of records, parameters, filter conditions, group values associated with the test data query (208); and
generating the one or more identified missing test data based on the number of columns to create, and number of rows to create received as input.
7. The method as claimed in claim 1, further comprising transforming the generated test data (214) into a pre-defined format for consumption of the generated data by a target system (108).
8. The method as claimed in claim 1, further comprising presenting the determined data source specific executable query (212) to a user for validation.
9. The method as claimed in claim 8, further comprising updating the domain specific meta-data (122) based on the validation of the data source specific executable query (212).
10. A test data generation and transformation system, said system comprising:
a processor (110); and
a memory (112), communicatively coupled to the processor (110), wherein the memory (112) stores processor-executable instructions, which, on execution, cause the processor (110) to:
receive a test data query (208), in natural language, comprising at least a selection parameter for generation of test data (214) from one or more data sources (106) coupled to the test data generation and transformation system (102);
parse the received test data query (208) to extract a plurality of domain specific context names (210);
map the plurality of domain specific context names (210) with predetermined domain specific meta-data (122) to determine one or more columns, tables, and filter conditions associated with the received test data query (208), the domain specific meta-data (122) is associated with each column and table of one or more data sources (106);
determine a data source specific executable query (212) comprising one or more columns, tables, and filter conditions associated with the received test data query (208), based on at least the selection parameter and the mapped plurality of domain specific context names (210); and
execute the data source specific executable query (212) in the one or more data sources (106) to generate the test data (214).
11. The system as claimed in claim 10, wherein the processor (110) is configured to determine the domain specific meta-data (122) by:
extracting one or more tables, one or more columns of each table, and data type of each column of each table;
filtering one or more columns, tables from the one or more extracted tables, columns, and data types using one or more condition separators;
lemmatizing each of the filtered relevant column and table to determine one or more alternative names corresponding to each column and table; and
generating domain context specific names for each column and table based on one or more alternative names determined.
12. The system as claimed in claim 10, wherein the one or more data sources comprise relational databases, NoSQL databases, and file sources.
13. The system as claimed in claim 12, wherein each of the relational and NoSQL databases is configured based on one or more parameters including data source server IP/Port address, data source authentication information, and data source schema information, further wherein the file sources are configured based on parameters including data source file path name, and data source authentication information.
14. The system as claimed in claim 10, wherein the processor (110) is configured to parse the test data query (208) by:
identifying one or more historical queries related to the received test data query (208), based on one or more cognitive learning techniques; and
determining domain specific meta-data (122) corresponding to the one or more identified historical queries for generating the generic data structure.
15. The system as claimed in claim 10, wherein the processor (110) is further configured to:
identify missing test data in the generated test data (214), based on deviation in the count of records, parameters, filter conditions, group values, accuracy of data values associated with the test data (214) compared with the count of records, parameters, filter conditions, group values associated with the test data query (208); and
generate the one or more identified missing test data based on the number of columns to create, and number of rows to create received as input.
16. The system as claimed in claim 10, wherein the processor (110) is further configured to transform the generated test data (214) into a pre-defined format for consumption of the generated data by a target system (108).
17. The system as claimed in claim 10, wherein the processor (110) is further configured to present the determined data source specific executable query (212) to a user for validation.
18. The system as claimed in claim 17, wherein the processor (110) is configured to update the domain specific meta-data (122) based on the validation of the data source specific executable query (212).
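Claims 6 and 15 recite identifying missing test data from a deviation in record count and generating the missing records. A minimal Python sketch of that idea follows. It is illustrative only: the schema format, the `GENERATORS` table of per-type value generators, and the function names are assumptions, and the sketch checks the record count only, not filter conditions, group values, or accuracy of data values.

```python
import random

# Hypothetical value generators keyed by column data type.
GENERATORS = {
    "int": lambda rng: rng.randint(18, 90),
    "str": lambda rng: "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(6)),
}


def count_missing(rows, required_count):
    """Return how many records are missing relative to the requested count."""
    return max(0, required_count - len(rows))


def fill_missing(rows, schema, required_count, seed=0):
    """Append synthetic records, typed per the schema, until the requested count is met."""
    rng = random.Random(seed)  # seeded so that repeated runs produce the same data
    filled = list(rows)
    for _ in range(count_missing(rows, required_count)):
        filled.append({col: GENERATORS[dtype](rng) for col, dtype in schema.items()})
    return filled
```

For example, calling `fill_missing` with one existing record, the schema `{"name": "str", "age": "int"}`, and a required count of 3 keeps the existing record and appends two synthetic ones. Synthetic rows in this sketch are not checked against the filter conditions of the original query; a fuller version would need to constrain the generators accordingly.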
Dated this 20th day of January 2017
M.S. Devi
Of K&S Partners
Agent for the Applicant
Description: FIELD OF THE DISCLOSURE
The present subject matter relates, in general, to testing database applications and, more particularly but not exclusively, to a test data generation and transformation system and a method thereof.
| Section | Controller | Decision Date |
|---|---|---|
| Section 15 | Rajeev Kumar | 2024-01-31 |
| Section 15 | Rajeev Kumar | 2024-02-02 |

| # | Name | Date |
|---|---|---|
| 1 | Power of Attorney [20-01-2017(online)].pdf | 2017-01-20 |
| 2 | Form 5 [20-01-2017(online)].pdf | 2017-01-20 |
| 3 | Form 3 [20-01-2017(online)].pdf | 2017-01-20 |
| 4 | Form 18 [20-01-2017(online)].pdf_144.pdf | 2017-01-20 |
| 5 | Form 18 [20-01-2017(online)].pdf | 2017-01-20 |
| 6 | Drawing [20-01-2017(online)].pdf | 2017-01-20 |
| 7 | Description(Complete) [20-01-2017(online)].pdf_143.pdf | 2017-01-20 |
| 8 | Description(Complete) [20-01-2017(online)].pdf | 2017-01-20 |
| 9 | REQUEST FOR CERTIFIED COPY [24-01-2017(online)].pdf | 2017-01-24 |
| 10 | REQUEST FOR CERTIFIED COPY [21-03-2017(online)].pdf | 2017-03-21 |
| 11 | Other Patent Document [28-04-2017(online)].pdf | 2017-04-28 |
| 12 | Correspondence by Agent_Form 1_02-05-2017.pdf | 2017-05-02 |
| 13 | abstract 201741002359 .jpg | 2017-05-05 |
| 14 | 201741002359-FER.pdf | 2020-06-02 |
| 15 | 201741002359-PETITION UNDER RULE 137 [01-12-2020(online)].pdf | 2020-12-01 |
| 16 | 201741002359-Information under section 8(2) [01-12-2020(online)].pdf | 2020-12-01 |
| 17 | 201741002359-FORM 3 [01-12-2020(online)].pdf | 2020-12-01 |
| 18 | 201741002359-FER_SER_REPLY [02-12-2020(online)].pdf | 2020-12-02 |
| 19 | 201741002359-US(14)-HearingNotice-(HearingDate-01-01-2024).pdf | 2023-12-13 |
| 20 | 201741002359-US(14)-ExtendedHearingNotice-(HearingDate-08-01-2024).pdf | 2023-12-15 |
| 21 | 201741002359-POA [22-12-2023(online)].pdf | 2023-12-22 |
| 22 | 201741002359-FORM 13 [22-12-2023(online)].pdf | 2023-12-22 |
| 23 | 201741002359-Correspondence to notify the Controller [22-12-2023(online)].pdf | 2023-12-22 |
| 24 | 201741002359-AMENDED DOCUMENTS [22-12-2023(online)].pdf | 2023-12-22 |
| 25 | 201741002359-Written submissions and relevant documents [23-01-2024(online)].pdf | 2024-01-23 |
| 26 | 201741002359-PatentCertificate02-02-2024.pdf | 2024-02-02 |
| 27 | 201741002359-IntimationOfGrant02-02-2024.pdf | 2024-02-02 |
| 1 | search002359E_01-06-2020.pdf |