System And Method For Pre-Processing An Image Before Extracting Text Data

Abstract: The present disclosure relates to system(s) and method(s) for pre-processing an image before extracting text data from the image. The system is configured to maintain a set of standard templates, wherein each standard template corresponds to a type of text data. Further, the system may receive a target image and divide the target image into a set of regions based on predefined criteria. Further, the system may identify one or more pre-processing techniques corresponding to each region from the set of regions. The system may generate a refined target image by processing each region of the target image using the one or more pre-processing techniques corresponding to each region. Finally, the system may extract text data from the refined target image by processing the refined target image using at least one OCR technique.


Patent Information

Application #:
Filing Date: 21 August 2018
Publication Number: 36/2018
Publication Type: INA
Invention Field: COMPUTER SCIENCE
Status:
Email: ip@legasis.in
Parent Application:
Patent Number:
Legal Status:
Grant Date: 2024-01-04
Renewal Date:

Applicants

HCL Technologies Limited
A-9, Sector - 3, Noida 201 301, Uttar Pradesh, India

Inventors

1. SOUNDARARAJAN, Ameli Merlin
HCL Technologies Limited, Chennai SEZ, SDB2, Chennai - 600119, Tamil Nadu, India
2. SHANMUGASUNDARAM, Yuvarajan
HCL Technologies Limited, Chennai SEZ, SDB2, Chennai - 600119, Tamil Nadu, India
3. SADASIVAM, Sivasakthivel
HCL Technologies Limited, AMB-3, 64 & 65, South Phase, 2nd main road, Ambattur Industrial Estate, Chennai - 600058, Tamil Nadu, India

Specification

The following specification describes the invention and the manner in which it is to be performed.
CROSS-REFERENCE TO RELATED APPLICATIONS AND PRIORITY
[001] The present application does not claim priority from any patent application.

TECHNICAL FIELD
[002] The present disclosure in general relates to the field of data processing. More particularly, the present invention relates to a system and method to pre-process image data before text extraction.

BACKGROUND

[003] In many office environments, a huge amount of time is spent on unnecessary tasks such as referring to hardcopies for inputting data, or searching through piles of documents, papers, and files to retrieve the information needed to complete a task. The growing trend of document digitization, in which documents/files are scanned and the text in them is extracted for further use, is followed not only in office environments but also in areas such as claim filing, insurance, physical-book-to-eBook conversion, automation testing based on on-screen text, and the like. The text must be extracted from the scanned/camera images and converted into digitized data. Currently, OCR technology plays a major role in this type of digitization. However, due to image quality, lighting, text background, tilted images, and the like, the extracted text is often not the optimum result, and in many cases it may not be accurate. Hence, OCR techniques have to be fine-tuned to provide the optimum result.
SUMMARY
[004] Before the present systems and methods for pre-processing an image before extracting text data from the image are illustrated, it is to be understood that this application is not limited to the particular systems and methodologies described, as there can be multiple possible embodiments that are not expressly illustrated in the present disclosure. It is also to be understood that the terminology used in the description is for the purpose of describing the particular versions or embodiments only, and is not intended to limit the scope of the present application. This summary is provided to introduce concepts related to systems and methods for pre-processing an image before extracting text data from the image. This summary is not intended to identify essential features of the claimed subject matter, nor is it intended for use in determining or limiting the scope of the claimed subject matter.
[005] In one implementation, a system for pre-processing an image before extracting text data from the image is illustrated. The system comprises a memory and a processor coupled to the memory, wherein the processor is configured to execute programmed instructions stored in the memory to maintain a set of standard templates, wherein each standard template corresponds to a type of text data, and wherein each standard template is associated with one or more pre-processing techniques selected from a set of pre-processing techniques. Further, the processor is configured to execute programmed instructions stored in the memory to receive a target image and divide the target image into a set of regions based on predefined criteria. Further, the processor is configured to execute programmed instructions stored in the memory to identify one or more pre-processing techniques corresponding to each region from the set of regions, wherein the one or more pre-processing techniques are identified based on the comparison of each region with the set of standard templates. Further, the processor is configured to execute programmed instructions stored in the memory to generate a refined target image by processing each region of the target image using the one or more pre-processing techniques corresponding to each region. Further, the processor is configured to execute programmed instructions stored in the memory to extract text data from the refined target image by processing the refined target image using at least one OCR technique.
[006] In one implementation, a method for pre-processing an image before extracting text data from the image is illustrated. The method may comprise steps for maintaining a set of standard templates, wherein each standard template corresponds to a type of text data, and wherein each standard template is associated with one or more pre-processing techniques selected from a set of pre-processing techniques. The method may further comprise steps for receiving a target image and dividing the target image into a set of regions based on predefined criteria. The method may further comprise steps for identifying one or more pre-processing techniques corresponding to each region from the set of regions, wherein the one or more pre-processing techniques are identified based on the comparison of each region with the set of standard templates. The method may further comprise steps for generating a refined target image by processing each region of the target image using the one or more pre-processing techniques corresponding to each region. The method may further comprise steps for extracting text data from the refined target image by processing the refined target image using at least one OCR technique.
[007] In yet another implementation, a computer program product having embodied thereon a computer program for pre-processing an image before extracting text data from the image is disclosed. The program may comprise a program code to maintain a set of standard templates, wherein each standard template corresponds to a type of text data, and wherein each standard template is associated with one or more pre-processing techniques selected from a set of pre-processing techniques. The program may comprise a program code to receive a target image and divide the target image into a set of regions based on predefined criteria. The program may comprise a program code to identify one or more pre-processing techniques corresponding to each region from the set of regions, wherein the one or more pre-processing techniques are identified based on the comparison of each region with the set of standard templates. The program may comprise a program code to generate a refined target image by processing each region of the target image using the one or more pre-processing techniques corresponding to each region. The program may comprise a program code to extract text data from the refined target image by processing the refined target image using at least one OCR technique.
BRIEF DESCRIPTION OF DRAWINGS
[008] The detailed description is described with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the drawings to refer to like features and components.
[009] Figure 1 illustrates a network implementation of a system configured for pre-processing an image before extracting text data from the image, in accordance with an embodiment of the present subject matter.
[0010] Figure 2 illustrates the system configured for pre-processing the image before extracting text data from the image, in accordance with an embodiment of the present subject matter.
[0011] Figure 3 illustrates a method for pre-processing the image before extracting text data from the image, in accordance with an embodiment of the present subject matter.

DETAILED DESCRIPTION
[0012] Some embodiments of the present disclosure, illustrating all its features, will now be discussed in detail. The words “maintaining”, “generating”, “forecasting”, “displaying”, and other forms thereof, are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms "a", "an" and "the" include plural references unless the context clearly dictates otherwise. Although any systems and methods similar or equivalent to those described herein can be used in pre-processing an image before extracting text data from the image, the exemplary systems and method for pre-processing of the image are now described.
[0013] Various modifications to the embodiment will be readily apparent to those skilled in the art and the generic principles herein may be applied to other embodiments. However, one of ordinary skill in the art will readily recognize that the present disclosure for pre-processing an image before extracting text data from the image is not intended to be limited to the embodiments illustrated, but is to be accorded the widest scope consistent with the principles and features described herein.
[0014] The system enables identification of an appropriate pre-processing technique to extract the text from an image. In one embodiment, the text extracted from the image may be used for automation testing of a graphical user interface. Nowadays, device GUIs have many screens and support multiple languages, and testing a huge number of screens in multiple languages is very tedious for a user, who has to sit through and understand all of the languages. Using a one-time learning approach, the present system can provide all the images to the pre-processor, which finds the appropriate pre-processing for each image, either region-specific or for the full image, and accordingly enables automated testing of the images/user-interface screens.
[0015] In one embodiment, the system may comprise two components. The first component is a set of configurable input items, such as an image, an image list, or cropped images, together with the corresponding expected text and the different types of techniques to be applied. The second component comprises a processor, an OCR tool, a pre-processor, a pre-process learner, and a list of applied pre-processes. The processor may be configured to execute/control the other modules in the system. The OCR tool may be any OCR tool readily available in the open-source domain. The pre-processor takes care of image processing and applies the predefined image-processing steps to given images. The image-processing steps correspond to Grayscale, Invert, Threshold, Skew, Bilinear, Bipolar, Interpolation, and the like; further pre-processing techniques can be added to or removed from this list depending upon the results and the configurable items. The system further comprises a learner, a module which decides which pre-processing technique has to be applied. The learner validates the extracted text against the configurable items: if it matches, the learner stores the corresponding applied pre-processes in the resultant file; if it does not match, the analyzer moves to the next pre-processing technique depending upon the configurable items. This continues until the list of pre-processing techniques is found, and the resultant file becomes the input to the pre-processor in the automation flow.
[0016] In one embodiment, the second component is configured to read the configurable items, analyze the pre-processing technique for the cropped image, and provide the resultant file. Initially, the second component accepts the configurable items shared by the user or retrieved from a database. In the next step, the pre-process learner in the second component first searches for the text and marks the regions of text on the images. Further, the pre-processor identifies the regions of text and applies the pre-processing techniques to the configured image. Furthermore, the learner obtains the pre-processed image from the pre-processor and extracts the text using the OCR engine. The learner then validates the extracted text against the expected text; if it matches, the applied pre-processing techniques are collected and the final resultant pre-process set file is generated. This resultant file is an input to the pre-processor for the next image iteration. Further, the network implementation of the system configured for pre-processing an image before extracting text data from the image is illustrated in Figure 1.
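The learner flow described in paragraphs [0015] and [0016] can be illustrated with a small trial loop. The following is a minimal, hedged sketch only, assuming OpenCV (cv2) and pytesseract as stand-ins for the pre-processor and the OCR tool; the technique functions, the learn_preprocess name, and the expected_text input are illustrative placeholders for the configurable items, not the actual implementation.

    import cv2
    import pytesseract

    # Candidate pre-processing techniques (an illustrative subset of those listed above).
    def grayscale(img):
        return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    def invert(img):
        return cv2.bitwise_not(img)

    def threshold(img):
        gray = img if img.ndim == 2 else cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        return binary

    TECHNIQUES = {"grayscale": grayscale, "invert": invert, "threshold": threshold}

    def learn_preprocess(image, expected_text):
        """Try each candidate technique until the OCR output matches the expected text."""
        for name, apply_technique in TECHNIQUES.items():
            processed = apply_technique(image)
            extracted = pytesseract.image_to_string(processed).strip()
            if extracted == expected_text.strip():
                return name  # record this technique in the pre-process set file
        return None  # no single technique matched; move on to combinations or manual review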
[0017] Referring now to Figure 1, a network implementation 100 of a system 102 for pre-processing an image before extracting text data from the image is disclosed. Although the present subject matter is explained considering that the system 102 is implemented on a server, it may be understood that the system 102 may also be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a notebook, a workstation, a mainframe computer, a server, a network server, and the like. Further, the system 102 may be implemented in a cloud network. In one embodiment, the system may be implemented as a Platform as a Service (PaaS). The system 102 may further be configured to communicate with an image capturing and maintenance platform 108.
[0018] Further, it will be understood that the system 102 may be accessed by multiple users through one or more user devices 104-1, 104-2…104-N, collectively referred to as user device 104 hereinafter, or applications residing on the user device 104. Examples of the user device 104 may include, but are not limited to, a portable computer, a personal digital assistant, a handheld device, and a workstation. The user device 104 may be communicatively coupled to the system 102 through a network 106.
[0019] In one implementation, the network 106 may be a wireless network, a wired network or a combination thereof. The network 106 may be implemented as one of the different types of networks, such as intranet, local area network (LAN), wide area network (WAN), the internet, and the like. The network 106 may either be a dedicated network or a shared network. The shared network represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Hypertext Transfer Protocol Secure (HTTPS), File Transfer Protocol (FTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), and the like, to communicate with one another. Further, the network 106 may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, and the like. In one embodiment, the system 102 may be configured to receive data from the image capturing and maintenance platform 108. The data may be received in the form of one or more images. Once the system 102 receives the data, the system 102 is configured to process the data as described with respect to Figure 2.
[0020] Referring now to figure 2, the system 102 is configured for pre-processing an image before extracting text data from the image in accordance with an embodiment of the present subject matter. In one embodiment, the system 102 may include at least one processor 202, an input/output (I/O) interface 204, and a memory 206. The at least one processor 202 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, at least one processor 202 may be configured to fetch and execute computer-readable instructions stored in the memory 206.
[0021] The I/O interface 204 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like. The I/O interface 204 may allow the system 102 to interact with the user directly or through the user device 104. Further, the I/O interface 204 may enable the system 102 to communicate with other computing devices, such as web servers and external data servers (not shown). The I/O interface 204 may facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. The I/O interface 204 may include one or more ports for connecting a number of devices to one another or to another server.
[0022] The memory 206 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. The memory 206 may include modules 208 and data 210.
[0023] The modules 208 may include routines, programs, objects, components, data structures, and the like, which perform particular tasks or functions or implement particular abstract data types. The modules 208 may include a data management module 212, an image capturing module 214, an image segmentation module 216, an image analysis module 218, an image pre-processing module 220, an OCR engine 222, and other modules 224. The other modules 224 may include programs or coded instructions that supplement applications and functions of the system 102.
[0024] The data 210, amongst other things, serves as a repository for storing data processed, received, and generated by one or more of the modules 208. The data 210 may also include a central data 228 and other data 230. In one embodiment, the other data 230 may include data generated as a result of the execution of one or more modules in the other modules 224. In one implementation, a user may access the system 102 via the I/O interface 204. The user may be registered using the I/O interface 204 in order to use the system 102. In one aspect, the user may access the I/O interface 204 of the system 102 for obtaining information, providing input information, or configuring the system 102. The functioning of all the modules in the system 102 is described below:
DATA MANAGEMENT MODULE 212
[0025] In one embodiment, the data management module 212 may be configured to maintain a set of standard templates. Each standard template corresponds to a type of text data. In one embodiment, each standard template is associated with one or more pre-processing techniques selected from a set of pre-processing techniques. In one embodiment, the one or more pre-processing techniques are selected from Grayscale, Invert, Threshold, Skew, Bilinear, Bipolar, or Interpolation. In one embodiment, the type of text data may include numbers, alphabets, or signs represented in different languages and fonts.
[0026] In one embodiment, each standard template is associated with a predefined text data. The one or more pre-processing techniques associated with each standard template are identified by processing the standard template using each pre-processing technique from the set of pre-processing techniques to generate a set of processed standard templates. Further, OCR processing is applied on the set of processed standard templates to fetch text data corresponding to each processed standard template. Further, the text data corresponding to each processed standard template is compared with the predefined text data to determine the accuracy of OCR conversion corresponding to each pre-processing technique from the set of pre-processing techniques. Furthermore, the one or more pre-processing techniques are assigned to the standard templates based on the accuracy of OCR conversion associated with each pre-processing technique.
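The calibration of standard templates described above can be sketched as a small scoring routine. This is a hedged illustration only: it reuses hypothetical technique callables like those in the earlier learner sketch, assumes pytesseract as the OCR engine, and uses a string-similarity ratio merely as one possible proxy for OCR conversion accuracy.

    import difflib
    import pytesseract

    def rank_techniques(template_image, predefined_text, techniques):
        """Score each pre-processing technique by OCR accuracy on one standard template.

        `techniques` maps a technique name to a callable returning a processed image.
        Returns (name, score) pairs sorted from most to least accurate, so that the
        top-scoring technique(s) can be assigned to the template.
        """
        scores = {}
        for name, apply_technique in techniques.items():
            processed = apply_technique(template_image)
            extracted = pytesseract.image_to_string(processed).strip()
            scores[name] = difflib.SequenceMatcher(None, extracted, predefined_text).ratio()
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)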
IMAGE CAPTURING MODULE 214
[0027] In one embodiment, the image capturing module 214 is configured to receive a target image. The target image may be captured by an image capturing device such as a camera, a scanner, a print-screen command over a user interface, and the like. In one embodiment, the target image may be received from the image capturing and maintenance platform 108. In one embodiment, the target image may correspond to at least one of a scanned text document, a photo, an animated image, or a graphical user interface screenshot of a software application. The graphical user interface screenshot of the software application is processed to extract text data from each region of the screenshot. In one embodiment, the text data is compared with a predefined text of the software application for graphical user interface software testing.
IMAGE SEGMENTATION MODULE 216
[0028] In one embodiment, the image segmentation module 216 is configured to divide the target image into a set of regions based on predefined criteria. In one embodiment, the predefined criteria are selected from at least one of font size, cluster of data, type of font, text boundary, text colour, and sharpness of image. In one embodiment, one or more predefined criteria may be used in order to enable image segmentation.
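One way to realize such segmentation is sketched below, assuming OpenCV. Binarization, dilation, and contour detection are used to cluster text, with the bounding-box height serving as a crude proxy for font size; the remaining predefined criteria (type of font, text colour, sharpness, and so on) would require additional heuristics and are not shown.

    import cv2

    def segment_regions(image, min_height=10):
        """Divide a target image into candidate text regions given as (x, y, w, h) boxes."""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Binarize and dilate so that characters belonging to the same text cluster merge.
        _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 3))
        dilated = cv2.dilate(binary, kernel, iterations=1)
        contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        regions = []
        for contour in contours:
            x, y, w, h = cv2.boundingRect(contour)
            if h >= min_height:  # filter out specks smaller than the minimum font size
                regions.append((x, y, w, h))
        return regions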
IMAGE ANALYSIS MODULE 218
[0029] In one embodiment, the image analysis module 218 is configured to identify one or more pre-processing techniques corresponding to each region from the set of regions. The one or more pre-processing techniques are identified based on the comparison of each region with the set of standard templates.
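A minimal sketch of this comparison step is given below, assuming the region and the stored standard templates are grayscale images and that each template carries its assigned technique names. Normalised cross-correlation is used here only as one example of a similarity measure; the function name and the threshold value are illustrative.

    import cv2

    def identify_techniques(region, standard_templates, match_threshold=0.6):
        """Return the technique names of the standard template that best matches a region.

        `standard_templates` maps a template name to (template_image, technique_names).
        Returns None when no template matches above the threshold, i.e. the case in
        which a new pre-processing technique has to be assigned.
        """
        best_score, best_techniques = 0.0, None
        for name, (template_image, technique_names) in standard_templates.items():
            # Resize the template to the region so the comparison yields a single score.
            resized = cv2.resize(template_image, (region.shape[1], region.shape[0]))
            score = cv2.matchTemplate(region, resized, cv2.TM_CCOEFF_NORMED)[0][0]
            if score > best_score:
                best_score, best_techniques = score, technique_names
        return best_techniques if best_score >= match_threshold else None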
IMAGE PRE-PROCESSING MODULE 220
[0030] In one embodiment, the image pre-processing module 220 is configured to generate a refined target image by processing each region of the target image using the one or more pre-processing techniques corresponding to each region. The refined target image can be easily read by the OCR engine 222. In one embodiment, when a region in the target image does not match with the set of standard templates, the image pre-processing module 220 is further configured to assign a new pre-processing technique to the set of target templates. In one embodiment, the new pre-processing technique may be manually identified.
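The region-wise refinement can be sketched as below, reusing the illustrative helpers from the earlier sketches. The sketch assumes a three-channel target image and shape-preserving techniques (such as grayscale, invert, or threshold); techniques that resize or warp a region would need extra handling.

    import numpy as np

    def refine_image(image, regions, region_techniques, techniques):
        """Build a refined target image by pre-processing each region independently.

        `regions` holds (x, y, w, h) boxes, `region_techniques` holds the technique
        names chosen for each region, and `techniques` maps names to callables.
        """
        refined = image.copy()
        for (x, y, w, h), names in zip(regions, region_techniques):
            patch = refined[y:y + h, x:x + w]
            for name in names:
                patch = techniques[name](patch)
            if patch.ndim == 2:
                # Replicate a single-channel result so it can be written back in place.
                patch = np.stack([patch] * 3, axis=-1)
            refined[y:y + h, x:x + w] = patch
        return refined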
OCR ENGINE MODULE 222
[0031] Finally, the OCR engine 222 is configured to extract text data from the refined target image by processing the refined target image using at least one OCR technique.
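The extraction step, together with the graphical-user-interface testing comparison mentioned in paragraph [0027], might look like the sketch below, again assuming pytesseract; the function name and the lang parameter (which selects a Tesseract language pack for multi-language screens) are illustrative.

    import pytesseract

    def extract_and_verify(refined_image, expected_text=None, lang="eng"):
        """Extract text from the refined image and optionally compare it with expected text."""
        extracted = pytesseract.image_to_string(refined_image, lang=lang).strip()
        if expected_text is None:
            return extracted, None
        # For GUI testing, a simple equality check flags screens whose text deviates.
        return extracted, extracted == expected_text.strip()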
[0032] Referring now to figure 3, a method 300 for pre-processing an image before extracting text data from the image, is disclosed in accordance with an embodiment of the present subject matter. The method 300 may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, and the like, that perform particular functions or implement particular abstract data types. The method 300 may also be practiced in a distributed computing environment where functions are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, computer executable instructions may be located in both local and remote computer storage media, including memory storage devices.
[0033] The order in which the method 300 is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method 300 or alternate methods. Additionally, individual blocks may be deleted from the method 300 without departing from the spirit and scope of the subject matter described herein. Furthermore, the method 300 can be implemented in any suitable hardware, software, firmware, or combination thereof. However, for ease of explanation, in the embodiments described below, the method 300 may be considered to be implemented in the above described system 102.
[0034] At block 302, the data management module 212 may be configured to maintain a set of standard templates. Each standard template corresponds to a type of text data. In one embodiment, each standard template is associated with one or more pre-processing techniques selected from a set of pre-processing techniques. In one embodiment, the one or more pre-processing techniques are selected from Grayscale, Invert, Threshold, Skew, Bilinear, Bipolar, or Interpolation. In one embodiment, the type of text data may include numbers, alphabets, or signs represented in different languages and fonts.
[0035] In one embodiment, each standard template is associated with a predefined text data. The one or more pre-processing techniques associated with each standard template are identified by processing the standard template using each pre-processing technique from the set of pre-processing techniques to generate a set of processed standard templates. Further, OCR processing is applied on the set of processed standard templates to fetch text data corresponding to each processed standard template. Further, the text data corresponding to each processed standard template is compared with the predefined text data to determine the accuracy of OCR conversion corresponding to each pre-processing technique from the set of pre-processing techniques. Furthermore, the one or more pre-processing techniques are assigned to the standard templates based on the accuracy of OCR conversion associated with each pre-processing technique.
[0036] At block 304, the image capturing module 214 is configured to receive a target image. The target image may be captured by an image capturing device such as a camera, a scanner, a print-screen command over a user interface, and the like. In one embodiment, the target image may be received from the image capturing and maintenance platform 108. In one embodiment, the target image may correspond to at least one of a scanned text document, a photo, an animated image, or a graphical user interface screenshot of a software application. The graphical user interface screenshot of the software application is processed to extract text data from each region of the screenshot. In one embodiment, the text data is compared with a predefined text of the software application for graphical user interface software testing.
[0037] At block 306, the image segmentation module 216 is configured to divide the target image into a set of regions based on predefined criteria. In one embodiment, the predefined criteria are selected from at least one of font size, cluster of data, type of font, text boundary, text colour, and sharpness of image. In one embodiment, one or more predefined criteria may be used in order to enable image segmentation.
[0038] At block 308, the image analysis module 218 is configured to identify one or more pre-processing techniques corresponding to each region from the set of regions. The one or more pre-processing techniques are identified based on the comparison of each region with the set of standard templates.
[0039] At block 310, the image pre-processing module 220 is configured to generate a refined target image by processing each region of the target image using the one or more pre-processing techniques corresponding to each region. The refined target image can be easily read by the OCR engine 222. In one embodiment, when a region in the target image does not match with the set of standard templates, the image pre-processing module 220 is further configured to assign a new pre-processing technique to the set of target templates. In one embodiment, the new pre-processing technique may be manually identified.
[0040] At block 312, the OCR engine 222 is configured to extract text data from the refined target image by processing the refined target image using at least one OCR technique.
[0041] Although implementations for systems and methods for pre-processing an image before extracting text data from the image have been described, it is to be understood that the appended claims are not necessarily limited to the specific features or methods described. Rather, the specific features and methods are disclosed as examples of implementations for pre-processing an image before extracting text data from the image.

Claims:
1. A system for pre-processing an image before extracting text data from the image, the system comprises:
a memory;
a processor coupled to the memory, wherein the processor is configured to execute programmed instructions stored in the memory for:
maintain a set of standard templates, wherein each standard template corresponds to type of text data, and wherein each standard template is associated with one or more pre-processing techniques selected from a set of pre-processing techniques;
receive a target image;
divide the target image into a set of regions based on a predefined criteria;
identify one or more pre-processing techniques corresponding to each region from the set of regions, wherein the one or more pre-processing techniques are identified based on the comparison of each region with the set of standard templates;
generate a refined target image by processing each region of the target image using the one or more pre-processing techniques corresponding to each region; and
extract text data from the refined target image by processing the refined target image using at least one OCR technique.

2. The system of claim 1, wherein each standard template is associated with a predefined text data, wherein the one or more pre-processing techniques associated with each standard template is identified by
processing the standard template using each pre-processing technique from the set of pre-processing techniques to generate a set of processed standard templates;
applying OCR processing on the set of processed standard templates to fetch text data corresponding to each processed standard template;
comparing the text data, corresponding to each processed standard template, with predefined text data to determine accuracy of OCR conversion corresponding to each pre-processing technique from the set of pre-processing techniques; and
assigning one or more pre-processing techniques to the standard templates based on accuracy of OCR conversion associated with each pre-processing technique.

3. The system of claim 1 further configured to assign a new pre-processing technique to the set of target templates, when the region in the target image does not match with the set of standard templates.

4. The system of claim 1, wherein the predefined criteria is selected from at least one of font size, cluster of data, type of font, text boundary, text colour, and sharpness of image.

5. The system of claim 1, wherein the one or more pre-processing techniques are selected from Grayscale, Invert, Threshold, Skew, Bilinear, Bipolar, or Interpolation.

6. The system of claim 1, wherein the target image corresponds to at least one of scanned text document, photo, animated image, or Graphical user interface screen shot of a software application.

7. The system of claim 6, wherein the Graphical user interface screen shot of the software application are processed to extract text data from each region of the Graphical user interface screen shot, and wherein the text data is compared with a predefined text of a software application for graphical user interface software testing.

8. A method for pre-processing an image before extracting text data from the image, the method comprises steps of:
maintaining, by a processor, a set of standard templates, wherein each standard template corresponds to type of text data, and wherein each standard template is associated with one or more pre-processing techniques selected from a set of pre-processing techniques;
receiving, by the processor, a target image;
dividing, by the processor, the target image into a set of regions based on a predefined criteria;
identifying, by the processor, one or more pre-processing techniques corresponding to each region from the set of regions, wherein the one or more pre-processing techniques are identified based on the comparison of each region with the set of standard templates;
generating, by the processor, a refined target image by processing each region of the target image using the one or more pre-processing techniques corresponding to each region; and
extracting, by the processor, text data from the refined target image by processing the refined target image using at least one OCR technique.

9. The method of claim 8, wherein each standard template is associated with a predefined text data, wherein the one or more pre-processing techniques associated with each standard template are identified by
processing the standard template using each pre-processing technique from the set of pre-processing techniques to generate a set of processed standard templates;
applying OCR processing on the set of processed standard templates to fetch text data corresponding to each processed standard template;
comparing the text data, corresponding to each processed standard template, with predefined text data to determine accuracy of OCR conversion corresponding to each pre-processing technique from the set of pre-processing techniques; and
assigning one or more pre-processing techniques to the standard templates based on accuracy of OCR conversion associated with each pre-processing technique.

10. The method of claim 8, further configured to assign a new pre-processing technique to the set of target templates, when the region in the target image does not match with the set of standard templates.

11. The method of claim 8, wherein the predefined criteria is selected from at least one of font size, cluster of data, type of font, text boundary, text colour, and sharpness of image.

12. The method of claim 8, wherein the one or more pre-processing techniques are selected from Grayscale, Invert, Threshold, Skew, Bilinear, Bipolar, or Interpolation.

13. The method of claim 8, wherein the target image corresponds to at least one of scanned text document, photo, animated image, or Graphical user interface screen shot of a software application.

14. The method of claim 13, wherein the Graphical user interface screen shot of the software application are processed to extract text data from each region of the Graphical user interface screen shot, and wherein the text data is compared with a predefined text of a software application for graphical user interface software testing.

15. A computer program product having embodied thereon a computer program for pre-processing an image before extracting text data from the image, the computer program product comprises:
a program code for maintaining a set of standard templates, wherein each standard template corresponds to type of text data, and wherein each standard template is associated with one or more pre-processing techniques selected from a set of pre-processing techniques;
a program code for receiving a target image;
a program code for dividing the target image into a set of regions based on a predefined criteria;
a program code for identifying one or more pre-processing techniques corresponding to each region from the set of regions, wherein the one or more pre-processing techniques are identified based on the comparison of each region with the set of standard templates;
a program code for generating a refined target image by processing each region of the target image using the one or more pre-processing techniques corresponding to each region; and
a program code for extracting text data from the refined target image by processing the refined target image using at least one OCR technique.

Documents

Orders

Section Controller Decision Date

Application Documents

# Name Date
1 201811031351-IntimationOfGrant04-01-2024.pdf 2024-01-04
2 201811031351-PatentCertificate04-01-2024.pdf 2024-01-04
3 201811031351-Written submissions and relevant documents [03-01-2024(online)].pdf 2024-01-03
4 201811031351-FORM-26 [07-12-2023(online)].pdf 2023-12-07
5 201811031351-Correspondence to notify the Controller [04-12-2023(online)].pdf 2023-12-04
6 201811031351-US(14)-HearingNotice-(HearingDate-19-12-2023).pdf 2023-11-17
7 201811031351-Proof of Right [20-10-2021(online)].pdf 2021-10-20
8 201811031351-FER.pdf 2021-10-18
9 201811031351-FORM 13 [09-07-2021(online)].pdf 2021-07-09
10 201811031351-POA [09-07-2021(online)].pdf 2021-07-09
11 201811031351-CLAIMS [08-07-2021(online)].pdf 2021-07-08
12 201811031351-COMPLETE SPECIFICATION [08-07-2021(online)].pdf 2021-07-08
13 201811031351-FER_SER_REPLY [08-07-2021(online)].pdf 2021-07-08
14 201811031351-OTHERS [08-07-2021(online)].pdf 2021-07-08
15 201811031351-Correspondence-280219.pdf 2019-03-01
16 201811031351-OTHERS-280219.pdf 2019-03-01
17 201811031351-Proof of Right (MANDATORY) [21-02-2019(online)].pdf 2019-02-21
18 abstract.jpg 2018-09-20
19 201811031351-COMPLETE SPECIFICATION [21-08-2018(online)].pdf 2018-08-21
20 201811031351-DRAWINGS [21-08-2018(online)].pdf 2018-08-21
21 201811031351-FIGURE OF ABSTRACT [21-08-2018(online)].jpg 2018-08-21
22 201811031351-FORM 1 [21-08-2018(online)].pdf 2018-08-21
23 201811031351-FORM 18 [21-08-2018(online)].pdf 2018-08-21
24 201811031351-FORM-9 [21-08-2018(online)].pdf 2018-08-21
25 201811031351-POWER OF AUTHORITY [21-08-2018(online)].pdf 2018-08-21
26 201811031351-REQUEST FOR EARLY PUBLICATION(FORM-9) [21-08-2018(online)].pdf 2018-08-21
27 201811031351-REQUEST FOR EXAMINATION (FORM-18) [21-08-2018(online)].pdf 2018-08-21
28 201811031351-STATEMENT OF UNDERTAKING (FORM 3) [21-08-2018(online)].pdf 2018-08-21

Search Strategy

1 searchstrategy_201811031351E_10-01-2021.pdf

ERegister / Renewals

3rd: 12 Mar 2024 (From 21/08/2020 To 21/08/2021)
4th: 12 Mar 2024 (From 21/08/2021 To 21/08/2022)
5th: 12 Mar 2024 (From 21/08/2022 To 21/08/2023)
6th: 12 Mar 2024 (From 21/08/2023 To 21/08/2024)
7th: 12 Mar 2024 (From 21/08/2024 To 21/08/2025)
8th: 04 Aug 2025 (From 21/08/2025 To 21/08/2026)