
Method And System For Data Sampling Using Artificial Neural Network (ANN) Model

Abstract: This disclosure relates to a method and system for data sampling using an artificial neural network (ANN) model. In an embodiment, the method includes identifying a plurality of numerical data columns and a plurality of categorical data columns in a population dataset, determining a set of predictor variables and a set of predictand variables by applying a linear regression on the plurality of numerical data columns, generating a sequential prediction model based on the set of predictor variables and the set of predictand variables, and performing stratified sampling on the plurality of categorical data columns to generate a set of stratified samples. The method further includes generating a sample key based on the set of stratified samples and the sequential prediction model, and generating a sample dataset representative of the population dataset based on the sample key. Figure 2

Patent Information

Application #
Filing Date
29 June 2019
Publication Number
01/2021
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
ipr@akshipassociates.com
Parent Application

Applicants

WIPRO LIMITED
Doddakannelli, Sarjapur Road, Bangalore

Inventors

1. Abbas Singapurwala
301 Raj Gold Residency, 783 Khatiwala Tank, Indore 452104

Specification

Claims: WE CLAIM
1. A method for sampling a population dataset using an artificial neural network (ANN) model, the method comprising:
identifying, by a data sampling device, a plurality of numerical data columns and a plurality of categorical data columns in the population dataset;
determining, by the data sampling device, a set of predictor variables and a set of predictand variables by applying a linear regression on the plurality of numerical data columns;
generating, by the data sampling device, a sequential prediction model based on the set of predictor variables and the set of predictand variables;
performing stratified sampling, by the data sampling device, on the plurality of categorical data columns to generate a set of stratified samples;
generating, by the data sampling device, a sample key based on the set of stratified samples and the sequential prediction model; and
generating, by the data sampling device, a sample dataset representative of the population dataset based on the sample key.
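
The column-identification and stratified-sampling steps of claim 1 can be illustrated with a minimal, standard-library-only sketch. It is not the applicant's implementation: the helper names `split_columns` and `stratified_sample`, the type-based test for numerical columns, the proportional allocation per stratum, and the toy `population` are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def split_columns(rows):
    """Partition column names into numerical and categorical by value type."""
    numerical, categorical = [], []
    for name in rows[0]:
        values = [row[name] for row in rows]
        # bool is a subclass of int, so exclude it explicitly.
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
            numerical.append(name)
        else:
            categorical.append(name)
    return numerical, categorical

def stratified_sample(rows, categorical, fraction, seed=0):
    """Draw row indices from every stratum in proportion to its size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for index, row in enumerate(rows):
        # A stratum is one distinct combination of categorical values.
        strata[tuple(row[c] for c in categorical)].append(index)
    chosen = []
    for indices in strata.values():
        k = max(1, round(len(indices) * fraction))  # at least one row per stratum
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)

# Hypothetical population: 25 "south" rows and 75 "north" rows.
population = [
    {"region": "north" if i % 4 else "south", "age": 20 + i % 30, "income": 1000.0 + 10 * i}
    for i in range(100)
]
numerical, categorical = split_columns(population)
sample_indices = stratified_sample(population, categorical, fraction=0.2)
```

With a sampling fraction of 0.2, the sketch keeps the 1:3 ratio between the two regions, which is the property that makes the stratified sample representative of the categorical columns.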

2. The method of claim 1, wherein determining the set of predictor variables comprises:
performing correlation between each of at least two predictor columns; and
removing at least one predictor column from each of the at least two predictor columns when the correlation is above a predetermined threshold.
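
One possible reading of the correlation-based pruning in claim 2 is sketched below. The claim does not name the correlation measure or the threshold, so the Pearson coefficient, the use of its absolute value, the 0.9 threshold, and the function names `pearson` and `prune_correlated` are assumptions for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation undefined, treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)

def prune_correlated(columns, threshold=0.9):
    """Keep a column only if |r| with every already-kept column is <= threshold."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical predictor columns; height_in is a near-duplicate of height_cm.
predictors = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "weight_kg": [72.0, 55.0, 90.0, 61.0, 80.0],
}
kept = prune_correlated(predictors)
```

The greedy pass keeps the first column of each highly correlated pair and drops the later one; which member of a pair is removed is a design choice the claim leaves open.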

3. The method of claim 1, wherein performing stratified sampling further comprises generating a sample size for the population dataset, wherein generating the sample size comprises generating the sample size based on a population size at a predetermined margin of error or at a predetermined confidence level, and wherein performing stratified sampling further comprises filtering the plurality of categorical data columns based on the sample size.
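
Claim 3 does not state which formula derives the sample size from the population size, margin of error, and confidence level. A common choice, assumed here purely for illustration, is Cochran's formula with the finite population correction; the function name `sample_size` and its default parameters are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size(population_size, margin_of_error=0.05, confidence=0.95, proportion=0.5):
    """Cochran's sample size with finite population correction (an assumed choice)."""
    # Two-sided z-score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    # Shrink it for a finite population of the given size.
    n = n0 / (1 + (n0 - 1) / population_size)
    return min(population_size, math.ceil(n))

size_large = sample_size(10_000)  # 95% confidence, 5% margin of error
size_small = sample_size(100)
```

Under these assumptions a population of 10,000 rows needs 370 sampled rows, while a population of 100 rows needs 80, which shows how the correction keeps the sample from exceeding what a small population can supply.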

4. The method of claim 1, wherein generating the sample key further comprises selecting a set of sample indices by iteratively evaluating each sample from the set of stratified samples, wherein evaluating each sample from the set of stratified samples comprises:
determining a mean absolute error by comparing actual predictand variables with predicted predictand variables generated by the sequential prediction model; and
selecting a sample with the mean absolute error being the least.
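
The selection step of claim 4 can be sketched as a loop over candidate samples that keeps the one with the lowest mean absolute error. In this sketch the trained sequential prediction model is replaced by a hypothetical stand-in callable `predict`, and each candidate is identified by an illustrative key such as `"seed_1"`; neither the key format nor the data comes from the source.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def select_sample_key(candidates, predictors, predictands, predict):
    """Return (key, mae) of the candidate sample with the least mean absolute error."""
    best_key, best_mae = None, float("inf")
    for key, indices in candidates.items():
        actual = [predictands[i] for i in indices]
        predicted = [predict(predictors[i]) for i in indices]
        mae = mean_absolute_error(actual, predicted)
        if mae < best_mae:
            best_key, best_mae = key, mae
    return best_key, best_mae

# Hypothetical data: one predictor value and one predictand value per row.
predictors = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
predictands = [2.1, 4.0, 6.3, 8.0, 9.0, 12.0]

def predict(x):
    # Stand-in for the trained sequential prediction model (assumption).
    return 2.0 * x

# Each candidate maps a sample key to the row indices of one stratified sample.
candidates = {"seed_0": [0, 2, 4], "seed_1": [1, 3, 5], "seed_2": [0, 1, 2]}
best_key, best_mae = select_sample_key(candidates, predictors, predictands, predict)
```

Here `"seed_1"` wins because the stand-in model reproduces its three rows exactly; the row indices stored under the winning key would then be used to build the sample dataset.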

5. A system for sampling a population dataset using an artificial neural network (ANN) model, the system comprising:
a data sampling device comprising at least one processor and a computer-readable medium storing instructions that, when executed by the at least one processor, cause the at least one processor to perform operations comprising:
identifying a plurality of numerical data columns and a plurality of categorical data columns in the population dataset;
determining a set of predictor variables and a set of predictand variables by applying a linear regression on the plurality of numerical data columns;
generating a sequential prediction model based on the set of predictor variables and the set of predictand variables;
performing stratified sampling on the plurality of categorical data columns to generate a set of stratified samples;
generating a sample key based on the set of stratified samples and the sequential prediction model; and
generating a sample dataset representative of the population dataset based on the sample key.

6. The system of claim 5, wherein determining the set of predictor variables comprises:
performing correlation between each of at least two predictor columns; and
removing at least one predictor column from each of the at least two predictor columns when the correlation is above a predetermined threshold.

7. The system of claim 5, wherein performing stratified sampling further comprises generating a sample size for the population dataset, wherein generating the sample size comprises generating the sample size based on a population size at a predetermined margin of error or at a predetermined confidence level, and wherein performing stratified sampling further comprises filtering the plurality of categorical data columns based on the sample size.

8. The system of claim 5, wherein generating the sample key further comprises selecting a set of sample indices by iteratively evaluating each sample from the set of stratified samples, wherein evaluating each sample from the set of stratified samples comprises:
determining a mean absolute error by comparing actual predictand variables with predicted predictand variables generated by the sequential prediction model; and
selecting a sample with the mean absolute error being the least.

Dated this 29th day of June, 2019

R Ramya Rao
Of K&S Partners
Agent for the Applicant
IN/PA-1607
Description: TECHNICAL FIELD
[001] This disclosure relates generally to data sampling, and more particularly to a method and a system for data sampling using an artificial neural network (ANN) model.

Documents

Application Documents

# Name Date
1 201941026058-STATEMENT OF UNDERTAKING (FORM 3) [29-06-2019(online)].pdf 2019-06-29
2 201941026058-REQUEST FOR EXAMINATION (FORM-18) [29-06-2019(online)].pdf 2019-06-29
3 201941026058-POWER OF AUTHORITY [29-06-2019(online)].pdf 2019-06-29
4 201941026058-FORM 18 [29-06-2019(online)].pdf 2019-06-29
5 201941026058-FORM 1 [29-06-2019(online)].pdf 2019-06-29
6 201941026058-DRAWINGS [29-06-2019(online)].pdf 2019-06-29
7 201941026058-DECLARATION OF INVENTORSHIP (FORM 5) [29-06-2019(online)].pdf 2019-06-29
8 201941026058-COMPLETE SPECIFICATION [29-06-2019(online)].pdf 2019-06-29
9 201941026058-Request Letter-Correspondence [02-07-2019(online)].pdf 2019-07-02
10 201941026058-Power of Attorney [02-07-2019(online)].pdf 2019-07-02
11 201941026058-Form 1 (Submitted on date of filing) [02-07-2019(online)].pdf 2019-07-02
12 201941026058-Proof of Right (MANDATORY) [29-11-2019(online)].pdf 2019-11-29
13 201941026058-FORM 3 [28-04-2020(online)].pdf 2020-04-28
14 201941026058-FER.pdf 2021-10-17
15 201941026058-POA [28-02-2022(online)].pdf 2022-02-28
16 201941026058-OTHERS [28-02-2022(online)].pdf 2022-02-28
17 201941026058-FORM 13 [28-02-2022(online)].pdf 2022-02-28
18 201941026058-FER_SER_REPLY [28-02-2022(online)].pdf 2022-02-28
19 201941026058-CLAIMS [28-02-2022(online)].pdf 2022-02-28
20 201941026058-AMENDED DOCUMENTS [28-02-2022(online)].pdf 2022-02-28

Search Strategy

1 SearchStrategyE_24-09-2021.pdf