
Method And System For Extracting Data Points From A Data File

Abstract: The present invention provides a method, system and computer program product for extracting data-points from a data file. A data-point is extracted into a database by pointing at a portion of computer recognizable text in the data file with a pointing device. The data-point associated with the pointed portion of the computer recognizable text is thereby selected and stored in the database.


Patent Information

Filing Date: 28 July 2008
Publication Number: 6/2010
Publication Type: INA
Invention Field: COMPUTER SCIENCE

Applicants

INFOSYS TECHNOLOGIES LIMITED
PLOT NO. 44 & 97A, ELECTRONICS CITY, HOSUR ROAD BANGALORE - 560 100, KARNATAKA

Inventors

1. V, RAJA
C3-80, A WHO, 28 ARCOT ROAD, SALIGRAMAM, CHENNAI 600093, TAMILNADU, INDIA
2. NARAYANAN, SANTHOSH
#205, BLOCK 1, B WING, SKYLINE CITY, CHANDRA LAYOUT, VIJAYANAGAR, BANGALORE 560072, KARNATAKA

Specification

BACKGROUND

The present invention relates to extracting data-points from a data file. More specifically, it relates to extracting data-points from a data file to be imported to enterprising applications.

With the advent of the Internet and outsourcing, various industries, such as banking, insurance, software, and so forth, maintain huge amounts of data associated with their customers. For example, a typical insurance company maintains relevant information, such as the name, the date of birth, the address, the policy number, and the like, associated with each customer. The customers typically fill out the information corresponding to each of these fields in a form on paper. Thereafter, these companies update their database by copying the information from the paper forms associated with each customer into the respective fields of their enterprising applications.

Currently, the information associated with the fields is filled manually into the corresponding enterprising applications by users such as insurance agents. These agents generally receive the scanned image files of forms filled on paper. Thereafter, the agents copy the information corresponding to various fields manually from each scanned image file into the enterprising applications, i.e., by manually typing the data-points corresponding to the fields. This may result in numerous errors, such as typographical errors, relevant information filled into wrong fields, and so forth. Moreover, the manual entry process is time consuming and hence has proved to be less productive.

Further, various software products have recently been made available in the market that provide the functionality of converting the scanned image file into a file that enables character recognition. The files are generated by using optical character recognition (OCR) algorithms. These files enable the agents to copy the information directly into the enterprising application, in contrast to conventional manual feeding.
The agents typically copy the information from the converted image file to the enterprising applications. Though this avoids manual typing of the information and is less time consuming, copying the information still leads to problems such as relevant information being entered into wrong fields, switching between various enterprising applications, and the like. Further, the agents copy the relevant information separately by using 'copy' and 'paste' while switching between various windows, thereby leading to a cumbersome process. These agents generally use a keyboard or a pointing device, such as a mouse, to 'copy' and 'paste' the relevant information, thus resulting in either multiple key strokes on the keyboard or multiple clicks of the pointing device. Thus, this also proves to be a time consuming process. Further, such software products do not provide cost-effective solutions.

In light of the above, there is a need for a system and method that enables the agents to copy the information from the scanned images in less time. Further, the system should simultaneously enable an error-free transfer of the information to the enterprising applications. Furthermore, the system should be cost-effective.

SUMMARY

An object of the invention is to effectively transfer data-points to a database. To achieve this objective, the invention provides a method and a computer program product for extracting data-points from a data file. The data file contains computer recognizable text. The data file containing the computer recognizable text is displayed to a user through a user interface. A data-point is selected by pointing, through a pointing device, at a portion of the computer recognizable text associated with the data-point. Thereafter, the selected data-point is stored corresponding to a pre-defined field in a database. Further, the invention provides a data capturing module for extracting data-points from a data file.
The data capturing module includes a display module for displaying the data file containing computer recognizable text to a user through a user-interface. Thereafter, the user points at a portion of the computer recognizable text with a pointing device. A selection module thereby selects a data-point associated with the pointed portion of the computer recognizable text. Subsequently, a storage module stores the data-point corresponding to a pre-defined field in a database.

The method, system and computer program product described above have a number of advantages. The method and system enable the user to extract the data-points from a data file in an automated manner. This improves the speed at which the data-points are extracted from a data file and exported to an enterprising application, thereby increasing the productivity of the user. Further, the system is cost-effective as it is developed using existing technologies/applications, such as Microsoft Office®, OCR algorithms, and so forth.

BRIEF DESCRIPTION OF THE DRAWINGS

The various embodiments of the invention will hereinafter be described in conjunction with the appended drawings, provided to illustrate and not to limit the invention, wherein like designations denote like elements, and in which:
FIG. 1 illustrates an environment in which various embodiments of the invention may be practiced;
FIG. 2 is a block diagram of a data capturing module for extracting data-points from a data file, in accordance with an embodiment of the invention;
FIG. 3 is a block diagram of a data capturing module for extracting the data-points from the data file, in accordance with another embodiment of the invention;
FIG. 4 is a flowchart depicting a method for extracting data-points from a data file, in accordance with an embodiment of the invention; and
FIG. 5 is a flowchart depicting a method for extracting the data-points from the data file, in accordance with another embodiment of the invention.
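As a rough illustration of the selection step described in the Summary (pointing at a portion of the computer recognizable text selects the associated data-point), the following Python sketch treats the pointer position as a character offset into the displayed text and selects the whitespace-delimited token under it. The function name `select_data_point`, the offset-based pointer model, the tokenization rule and the sample form text are all assumptions made for illustration; the specification does not state how the pointed portion is mapped to a data-point.

```python
import re


def select_data_point(text: str, pointer_offset: int) -> str:
    """Return the whitespace-delimited token under the pointer.

    Assumption: the pointer position has already been translated into a
    character offset within the displayed text. Returns an empty string
    when the pointer rests on whitespace.
    """
    if not 0 <= pointer_offset < len(text):
        raise IndexError("pointer offset is outside the displayed text")
    for match in re.finditer(r"\S+", text):
        if match.start() <= pointer_offset < match.end():
            return match.group()
    return ""


# Hypothetical text of a converted form, invented for this sketch.
displayed_text = "Name: Asha Rao\nPolicy No: PN-48213\nDate of birth: 12/03/1975"

# Simulate the user pointing somewhere inside the policy number.
offset = displayed_text.index("PN-48213") + 3

# The selected data-point is kept against a pre-defined field name.
record = {"policy_number": select_data_point(displayed_text, offset)}
print(record)
```

A single point-and-click replaces the separate select, copy and paste actions, which is the time saving the Summary refers to. A real implementation would need a richer rule than whitespace tokens for multi-word data-points such as a full name.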
DETAILED DESCRIPTION OF THE DRAWINGS

The invention describes a method, system and computer program product for extracting various data-points from a data file. The data-points are extracted in order to be exported to an enterprising application. The method and system enable the selection of the data-points. Further, the method enables the storage of the selected data-points in a structured format in a database. Thereafter, the selected data-points are exported from the database to the enterprising application through one or more scripting tools.

FIG. 1 illustrates an environment 100 in which various embodiments of the invention may be practiced. In various embodiments of the invention, environment 100 is a data processing unit, a network of data processing units, and the like. Various examples of the data processing units include personal computers, laptops, personal digital assistants (PDA), mobile devices, and the like. Further, environment 100 includes an enterprising application 102, a data capturing module 104, a data file 106, and a database 108.

Enterprising application 102 is a software application maintaining and processing various records pertaining to an enterprise. In an exemplary embodiment of the invention, these records can pertain to the customers of an organization. Various examples of enterprising application 102 include, but are not limited to, an application of an insurance service provider, an application managing the employees' information of a company, Enterprise Resource Planning (ERP) applications, and so forth. Further, enterprising application 102 includes various fields, such as 'name of the customer', 'insurance policy number of the customer', 'date of birth', 'mother's maiden name', etc., associated with each customer, for which the data is required from the users. Generally, the required data corresponding to the fields associated with each customer is present in an image file corresponding to each customer.
Various examples of the image file can be a scanned document of an offer letter, an account statement, an insurance policy receipt, a screen shot of an application, a screen shot of a portion of an application, and the like.

Data capturing module 104 first converts the image file to data file 106. In other words, data capturing module 104 converts the content available in the image file to data file 106 containing computer recognizable text. It may be apparent to any person skilled in the art that the computer recognizable text can be copied from data file 106. In various embodiments of the invention, the computer recognizable text includes characters, numbers and symbols. In another embodiment of the invention, data file 106 may be generated from a file whose contents are computer recognizable. An example of such a file may be a Microsoft® Word document.

Subsequently, data capturing module 104 displays data file 106 containing the computer recognizable text to a user through a user-interface. Further, data capturing module 104 selects, from data file 106, various data-points associated with corresponding fields of enterprising application 102. In an embodiment of the invention, the user points, using a pointing device, at a portion of the computer recognizable text of the corresponding data-points associated with each field of enterprising application 102. For example, the user points at a portion of the computer recognizable text associated with the name of the customer. Data capturing module 104 thereby stores the selected data-point corresponding to a pre-defined field in a structured format in database 108. Further, the pre-defined field is associated with the corresponding field of enterprising application 102. In another embodiment of the invention, the pre-defined field is defined by a user. Various examples of database 108 include, but are not limited to, a Microsoft® Excel database, a Microsoft® Access database, and the like.
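The storage of selected data-points against pre-defined fields in a structured format can be sketched as below. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for database 108 (the specification names Microsoft® Excel and Access databases as examples), and the table name, field names, record identifier and sample values are hypothetical. The `export_record` helper stands in for the scripting tool that hands the stored values to the enterprising application.

```python
import sqlite3

# Hypothetical pre-defined fields mirroring fields of the enterprising application.
PREDEFINED_FIELDS = ("customer_name", "policy_number", "date_of_birth")


def open_database() -> sqlite3.Connection:
    """Create an in-memory table with one column per pre-defined field."""
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f"{name} TEXT" for name in PREDEFINED_FIELDS)
    conn.execute(f"CREATE TABLE data_points (record_id TEXT PRIMARY KEY, {columns})")
    return conn


def store_data_point(conn: sqlite3.Connection, record_id: str, field: str, value: str) -> None:
    """Store one selected data-point under its pre-defined field."""
    if field not in PREDEFINED_FIELDS:
        # Column names cannot be bound as SQL parameters, so reject unknown ones.
        raise ValueError(f"unknown field: {field}")
    conn.execute("INSERT OR IGNORE INTO data_points (record_id) VALUES (?)", (record_id,))
    conn.execute(f"UPDATE data_points SET {field} = ? WHERE record_id = ?", (value, record_id))
    conn.commit()


def export_record(conn: sqlite3.Connection, record_id: str) -> dict:
    """Return the stored fields of one record, ready to pass to an export script."""
    query = f"SELECT {', '.join(PREDEFINED_FIELDS)} FROM data_points WHERE record_id = ?"
    row = conn.execute(query, (record_id,)).fetchone()
    return dict(zip(PREDEFINED_FIELDS, row)) if row else {}


conn = open_database()
store_data_point(conn, "form-001", "customer_name", "Asha Rao")
store_data_point(conn, "form-001", "policy_number", "PN-48213")
exported = export_record(conn, "form-001")
print(exported)
```

Because every data-point is written to a named field, a value cannot be stored under a field that was not defined in advance. Fields not yet selected remain empty until the user points at the corresponding text.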

Documents

Application Documents

# Name Date
1 1801-CHE-2008 FORM-18 06-10-2009.pdf 2009-10-06
2 1801-CHE-2008-AbandonedLetter.pdf 2017-07-20
3 1801-CHE-2008 FORM-13 28-10-2009.pdf 2009-10-28
4 1801-CHE-2008-Form-13-281009.pdf 2016-09-30
5 1801-CHE-2008-FER.pdf 2016-04-29
6 1801-che-2008 form-5.pdf 2011-09-03
7 1801-che-2008 form-3.pdf 2011-09-03
8 1801-CHE-2008 AMENDED PAGES OF SPECIFICATION 03-06-2015.pdf 2015-06-03
9 1801-che-2008 form-1.pdf 2011-09-03
10 1801-CHE-2008 CORRESPONDENCE OTHERS 03-06-2015..pdf 2015-06-03
11 1801-che-2008 drawings.pdf 2011-09-03
12 1801-CHE-2008 FORM-1 03-06-2015.pdf 2015-06-03
13 1801-che-2008 description(complete).pdf 2011-09-03
14 1801-CHE-2008 FORM-13 03-06-2015.pdf 2015-06-03
15 1801-che-2008 abstract.pdf 2011-09-03
16 1801-che-2008 correspondence-others.pdf 2011-09-03
17 1801-che-2008 claims.pdf 2011-09-03