Abstract: This disclosure relates to a method and a system for data sampling using an artificial neural network (ANN) model. In an embodiment, the method includes identifying a plurality of numerical data columns and a plurality of categorical data columns in a population dataset, determining a set of predictor variables and a set of predictand variables by applying a linear regression on the plurality of numerical data columns, generating a sequential prediction model based on the set of predictor variables and the set of predictand variables, and performing stratified sampling on the plurality of categorical data columns to generate a set of stratified samples. The method further includes generating a sample key based on the set of stratified samples and the sequential prediction model, and generating a sample dataset representative of the population dataset based on the sample key. Figure 2
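As an illustration only, the first and fourth steps of the abstract (splitting columns by type and stratified sampling on the categorical columns) might be sketched as follows. The helper names `split_columns` and `stratified_sample`, the type-based classification rule, the fixed sampling fraction, and the toy `population` are all assumptions made for this sketch; the disclosure does not specify them. The regression and prediction-model steps are not shown.

```python
import random
from collections import defaultdict


def split_columns(rows):
    """Classify each column as numerical or categorical from its values (assumed rule)."""
    numerical, categorical = [], []
    for name in rows[0]:
        values = [row[name] for row in rows]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
            numerical.append(name)
        else:
            categorical.append(name)
    return numerical, categorical


def stratified_sample(rows, strata_columns, fraction, seed=0):
    """Return sorted row indices, drawing about `fraction` of each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for index, row in enumerate(rows):
        # A stratum is one combination of values across the categorical columns.
        key = tuple(row[column] for column in strata_columns)
        strata[key].append(index)
    sample = []
    for indices in strata.values():
        k = max(1, round(len(indices) * fraction))  # keep every stratum represented
        sample.extend(rng.sample(indices, k))
    return sorted(sample)


# Hypothetical toy population: 30 "north" rows and 10 "south" rows.
population = [
    {"age": 20 + i % 30, "income": 1000.0 + 50 * i, "region": "north" if i % 4 else "south"}
    for i in range(40)
]
numerical, categorical = split_columns(population)
sample_indices = stratified_sample(population, categorical, fraction=0.2)
```

Sampling within each stratum, rather than over the whole table, is what keeps rare categories from disappearing in a small sample.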
Claims: WE CLAIM
1. A method for sampling a population dataset using an artificial neural network (ANN) model, the method comprising:
identifying, by a data sampling device, a plurality of numerical data columns and a plurality of categorical data columns in the population dataset;
determining, by the data sampling device, a set of predictor variables and a set of predictand variables by applying a linear regression on the plurality of numerical data columns;
generating, by the data sampling device, a sequential prediction model based on the set of predictor variables and the set of predictand variables;
performing stratified sampling, by the data sampling device, on the plurality of categorical data columns to generate a set of stratified samples;
generating, by the data sampling device, a sample key based on the set of stratified samples and the sequential prediction model; and
generating, by the data sampling device, a sample dataset representative of the population dataset based on the sample key.
2. The method of claim 1, wherein determining the set of predictor variables comprises:
performing correlation between each of at least two predictor columns; and
removing at least one predictor column from each of the at least two predictor columns when the correlation is above a predetermined threshold.
3. The method of claim 1, wherein performing stratified sampling further comprises generating a sample size for the population dataset, wherein generating the sample size comprises generating the sample size based on a population size at a predetermined margin of error or at a predetermined confidence level, and wherein performing stratified sampling further comprises filtering the plurality of categorical data columns based on the sample size.
4. The method of claim 1, wherein generating the sample key further comprises selecting a set of sample indices by iteratively evaluating each sample from the set of stratified samples, wherein evaluating each sample from the set of stratified samples comprises:
determining a mean absolute error by comparing actual predictand variables with predicted predictand variables generated by the sequential prediction model; and
selecting a sample with the mean absolute error being the least.
5. A system for sampling a population dataset using an artificial neural network (ANN) model, the system comprising:
a data sampling device comprising at least one processor and a computer-readable medium storing instructions that, when executed by the at least one processor, cause the at least one processor to perform operations comprising:
identifying a plurality of numerical data columns and a plurality of categorical data columns in the population dataset;
determining a set of predictor variables and a set of predictand variables by applying a linear regression on the plurality of numerical data columns;
generating a sequential prediction model based on the set of predictor variables and the set of predictand variables;
performing stratified sampling on the plurality of categorical data columns to generate a set of stratified samples;
generating a sample key based on the set of stratified samples and the sequential prediction model; and
generating a sample dataset representative of the population dataset based on the sample key.
6. The system of claim 5, wherein determining the set of predictor variables comprises:
performing correlation between each of at least two predictor columns; and
removing at least one predictor column from each of the at least two predictor columns when the correlation is above a predetermined threshold.
7. The system of claim 5, wherein performing stratified sampling further comprises generating a sample size for the population dataset, wherein generating the sample size comprises generating the sample size based on a population size at a predetermined margin of error or at a predetermined confidence level, and wherein performing stratified sampling further comprises filtering the plurality of categorical data columns based on the sample size.
8. The system of claim 5, wherein generating the sample key further comprises selecting a set of sample indices by iteratively evaluating each sample from the set of stratified samples, wherein evaluating each sample from the set of stratified samples comprises:
determining a mean absolute error by comparing actual predictand variables with predicted predictand variables generated by the sequential prediction model; and
selecting a sample with the mean absolute error being the least.
Dated this 29th day of June, 2019
R Ramya Rao
Of K&S Partners
Agent for the Applicant
IN/PA-1607
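For orientation, the correlation-based pruning of claims 2 and 6 and the sample-size computation of claims 3 and 7 might look like the sketch below. The claims do not name a correlation measure or a sample-size formula, so Pearson correlation and Cochran's formula with a finite-population correction are assumptions here, as are the function names, the 0.9 threshold, the default margin of error and confidence level, and the toy `predictors` data.

```python
import math
from statistics import NormalDist


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither column is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def prune_correlated(columns, threshold=0.9):
    """Keep a column only if its |r| with every already-kept column is <= threshold."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[other])) <= threshold for other in kept):
            kept.append(name)
    return kept


def sample_size(population_size, margin_of_error=0.05, confidence=0.95, p=0.5):
    """Cochran's formula with finite-population correction (assumed, not from the claims)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided z-score
    n0 = z * z * p * (1 - p) / margin_of_error ** 2      # infinite-population size
    return math.ceil(n0 / (1 + (n0 - 1) / population_size))


# Hypothetical predictor columns: the two height columns are nearly collinear.
predictors = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [40, 22, 35, 58, 27],
}
kept = prune_correlated(predictors)
n = sample_size(10_000)
```

With these inputs, `height_in` is dropped because it duplicates `height_cm`, while `age` is retained. The choice of which column of a correlated pair to drop (here, the later one) is arbitrary in this sketch.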
Description: TECHNICAL FIELD
[001] This disclosure relates generally to data sampling, and more particularly to a method and a system for data sampling using an artificial neural network (ANN) model.
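To make the role of the prediction model in the sampling concrete, the error-based selection recited in claims 4 and 8 could be sketched roughly as follows. The `model` function is a hypothetical stand-in for the trained sequential prediction model, and `select_sample_key`, the candidate index lists, and the toy data are assumptions for illustration rather than the applicant's implementation.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def select_sample_key(candidates, predictors, predictands, model):
    """Evaluate each candidate sample and return the one with the lowest MAE."""
    best_indices, best_error = None, float("inf")
    for indices in candidates:
        predicted = [model(predictors[i]) for i in indices]
        actual = [predictands[i] for i in indices]
        error = mean_absolute_error(actual, predicted)
        if error < best_error:
            best_indices, best_error = indices, error
    return best_indices, best_error


def model(x):
    """Hypothetical stand-in for the sequential prediction model."""
    return 2.0 * x


# Toy population: one predictor column and one predictand column.
predictors = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
predictands = [2.0, 4.5, 6.0, 9.0, 10.0, 12.0]

# Each candidate is a list of row indices, e.g. one stratified sample.
candidates = [[0, 1, 3], [0, 2, 4], [1, 3, 5]]
sample_key, error = select_sample_key(candidates, predictors, predictands, model)
```

Here the second candidate is selected because the model reproduces its predictand values exactly, while the other two each have a mean absolute error of 0.5.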
| # | Name | Date |
|---|---|---|
| 1 | 201941026058-STATEMENT OF UNDERTAKING (FORM 3) [29-06-2019(online)].pdf | 2019-06-29 |
| 2 | 201941026058-REQUEST FOR EXAMINATION (FORM-18) [29-06-2019(online)].pdf | 2019-06-29 |
| 3 | 201941026058-POWER OF AUTHORITY [29-06-2019(online)].pdf | 2019-06-29 |
| 4 | 201941026058-FORM 18 [29-06-2019(online)].pdf | 2019-06-29 |
| 5 | 201941026058-FORM 1 [29-06-2019(online)].pdf | 2019-06-29 |
| 6 | 201941026058-DRAWINGS [29-06-2019(online)].pdf | 2019-06-29 |
| 7 | 201941026058-DECLARATION OF INVENTORSHIP (FORM 5) [29-06-2019(online)].pdf | 2019-06-29 |
| 8 | 201941026058-COMPLETE SPECIFICATION [29-06-2019(online)].pdf | 2019-06-29 |
| 9 | 201941026058-Request Letter-Correspondence [02-07-2019(online)].pdf | 2019-07-02 |
| 10 | 201941026058-Form 1 (Submitted on date of filing) [02-07-2019(online)].pdf | 2019-07-02 |
| 11 | 201941026058-Power of Attorney [02-07-2019(online)].pdf | 2019-07-02 |
| 12 | 201941026058-Proof of Right (MANDATORY) [29-11-2019(online)].pdf | 2019-11-29 |
| 13 | 201941026058-FORM 3 [28-04-2020(online)].pdf | 2020-04-28 |
| 14 | 201941026058-FER.pdf | 2021-10-17 |
| 15 | 201941026058-POA [28-02-2022(online)].pdf | 2022-02-28 |
| 16 | 201941026058-OTHERS [28-02-2022(online)].pdf | 2022-02-28 |
| 17 | 201941026058-FORM 13 [28-02-2022(online)].pdf | 2022-02-28 |
| 18 | 201941026058-FER_SER_REPLY [28-02-2022(online)].pdf | 2022-02-28 |
| 19 | 201941026058-CLAIMS [28-02-2022(online)].pdf | 2022-02-28 |
| 20 | 201941026058-AMENDED DOCUMENTS [28-02-2022(online)].pdf | 2022-02-28 |

| # | Name | Date |
|---|---|---|
| 1 | SearchStrategyE_24-09-2021.pdf | |