System And Method For Performing Optical Character Recognition

Abstract: This disclosure relates generally to optical character recognition (OCR), and more particularly to a system and method for positioning characters in output text after performing OCR. In one embodiment, a method is provided for performing OCR. The method includes receiving a textual image, and extracting one or more character images along with one or more corresponding position indices from the textual image. A position index of a character image may include a line number, a word number, and a character number of the character image within the textual image. The method further includes determining one or more characters corresponding to the one or more character images using an OCR algorithm, and generating a text by positioning the one or more characters based on the one or more position indices of the one or more corresponding character images. FIGURE 2
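
The position index described in the abstract can be illustrated with a minimal sketch, assuming Python. The names PositionIndex and recognize_all are hypothetical and do not come from the disclosure, and the recognize argument stands in for whichever OCR algorithm is actually used.

```python
from typing import Callable, Dict, NamedTuple


class PositionIndex(NamedTuple):
    """Hypothetical container for the position index of one character image."""
    line: int  # line number within the textual image
    word: int  # word number within the line
    char: int  # character number within the word


def recognize_all(char_images: Dict[PositionIndex, object],
                  recognize: Callable[[object], str]) -> Dict[PositionIndex, str]:
    """Map each position index to the character recognized for its image."""
    return {index: recognize(image) for index, image in char_images.items()}


# Toy stand-in for OCR: each "image" is already a one-letter string.
images = {
    PositionIndex(1, 2, 1): "b",
    PositionIndex(1, 1, 1): "a",
}
recognized = recognize_all(images, recognize=lambda image: image)

# Named tuples compare field by field, so sorting gives line, word, char order.
reading_order = sorted(recognized)
```

Because the index travels with each character image, the recognized characters can be processed in any order and still be placed correctly afterwards.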

Patent Information

Application #:
Filing Date: 04 January 2018
Publication Number: 27/2019
Publication Type: INA
Invention Field: COMPUTER SCIENCE
Status:
Email: bangalore@knspartners.com
Parent Application:
Patent Number:
Legal Status:
Grant Date: 2023-09-16
Renewal Date:

Applicants

WIPRO LIMITED

Doddakannelli, Sarjapur Road, Bangalore 560035, Karnataka, India.

Inventors

1. SIBI CHAKRAVARTHY SHANMUGAVEL

32, Kurunji Illam, Balakrishnapuram, Dindigul 624005, Tamil Nadu, India.

2. SANDHYA RAMACHANDRAN

GF002, Uday Residency, #255 7th Cross, 1st Main, Ankappa Layout, Manjunathanagar, Uttarahalli Main Road, Bangalore – 560061, Karnataka, India.

3. ABHINAY REDDY VONGUR

201, V.R Applause Towers, 16-2-738/12, Asmangadh, Malakpet, Hyderabad 500036, Telangana, India.

Specification

Claims: WE CLAIM
1. A method for performing optical character recognition (OCR), the method comprising:
receiving, by an OCR device, a textual image;
extracting, by the OCR device, one or more character images along with one or more corresponding position indices from the textual image, wherein a position index of a character image comprises a line number, a word number, and a character number of the character image within the textual image;
determining, by the OCR device, one or more characters corresponding to the one or more character images using an OCR algorithm; and
generating, by the OCR device, a text by positioning the one or more characters based on the one or more position indices of the one or more corresponding character images.
2. The method of claim 1, further comprising pre-processing the textual image by at least one of:
converting the textual image into a greyscale image,
binarizing the greyscale image,
removing noise from the greyscale image, or
correcting distortions in the greyscale image.
3. The method of claim 1, wherein extracting comprises:
extracting one or more textual line images along with one or more corresponding line numbers from the textual image;
extracting one or more word images along with one or more corresponding word numbers from each of the one or more textual line images; and
extracting the one or more character images along with one or more corresponding character numbers from each of the one or more word images.
4. The method of claim 3, wherein extracting the one or more textual line images along with the one or more corresponding line numbers comprises:
analyzing the textual image to determine one or more contiguous stretches of horizontal non-textual spaces, wherein each of the one or more contiguous stretches of horizontal non-textual spaces comprises a number of horizontal non-textual bins, and wherein each of the horizontal non-textual bins consists of non-text pixels;
determining a mean vertical coordinate of the non-text pixels in each of the one or more contiguous stretches of horizontal non-textual spaces;
segmenting the textual image into one or more textual line images based on the mean vertical coordinate of each of the one or more contiguous stretches of horizontal non-textual spaces; and
storing each of the one or more textual line images along with a corresponding line number.
5. The method of claim 4, wherein the number of horizontal non-textual bins is in excess of or equal to a predetermined horizontal threshold, and wherein the predetermined horizontal threshold is determined based on an analysis of the textual image.
6. The method of claim 3, wherein extracting the one or more word images along with the one or more corresponding word numbers comprises:
analyzing each of the one or more textual line images to determine one or more contiguous stretches of vertical non-textual spaces, wherein each of the one or more contiguous stretches of vertical non-textual spaces comprises a number of vertical non-textual bins in excess of or equal to a first predetermined vertical threshold, and wherein each of the vertical non-textual bins consists of non-text pixels;
determining a mean horizontal coordinate of the non-text pixels in each of the one or more contiguous stretches of vertical non-textual spaces;
segmenting the textual line image into one or more word images based on the mean horizontal coordinate of each of the one or more contiguous stretches of vertical non-textual spaces; and
storing each of the one or more word images along with a corresponding line number and a corresponding word number.
7. The method of claim 6, wherein the first predetermined vertical threshold is determined based on an analysis of the textual image or an analysis of the textual line image.
8. The method of claim 3, wherein extracting the one or more character images along with one or more corresponding character numbers comprises:
analyzing each of the one or more word images to determine one or more contiguous stretches of vertical non-textual spaces, wherein each of the one or more contiguous stretches of vertical non-textual spaces comprises a number of vertical non-textual bins, and wherein each of the vertical non-textual bins consists of non-text pixels;
determining a mean horizontal coordinate of the non-text pixels in each of the one or more contiguous stretches of vertical non-textual spaces;
segmenting the word image into one or more character images based on the mean horizontal coordinate of each of the one or more contiguous stretches of vertical non-textual spaces; and
storing each of the one or more character images along with a corresponding line number, a corresponding word number, and a corresponding character number.
9. The method of claim 8, wherein the number of vertical non-textual bins is less than a first predetermined vertical threshold, or in excess of or equal to a second predetermined vertical threshold but less than the first predetermined vertical threshold, wherein the first predetermined vertical threshold is determined based on an analysis of the textual image or an analysis of the textual line image, and wherein the second predetermined vertical threshold is determined based on an analysis of the textual image, an analysis of the textual line image, or an analysis of the word image.
10. The method of claim 1, wherein positioning the one or more characters comprises:
sorting the one or more characters in ascending order based on the one or more line numbers, followed by the one or more word numbers, and followed by the one or more character numbers;
comparing a line number of a character with a line number of a preceding character;
upon determining the line number of the character to be greater than the line number of the preceding character, placing the character in a next line;
upon determining the line number of the character to be the same as the line number of the preceding character, comparing a word number of the character with a word number of the preceding character;
upon determining the word number of the character to be greater than the word number of the preceding character, placing the character after a space; and
upon determining the word number of the character to be the same as the word number of the preceding character, placing the character next to the preceding character.
11. The method of claim 1, wherein the position index of the character image further comprises at least one of a page number, a row number, or a column number.
12. A system for performing optical character recognition (OCR), the system comprising:
at least one processor; and
a computer-readable medium storing instructions that, when executed by the at least one processor, cause the at least one processor to perform operations comprising:
receiving a textual image;
extracting one or more character images along with one or more corresponding position indices from the textual image, wherein a position index of a character image comprises a line number, a word number, and a character number of the character image within the textual image;
determining one or more characters corresponding to the one or more character images using an OCR algorithm; and
generating a text by positioning the one or more characters based on the one or more position indices of the one or more corresponding character images.
13. The system of claim 12, wherein extracting comprises extracting one or more textual line images along with one or more corresponding line numbers from the textual image by:
analyzing the textual image to determine one or more contiguous stretches of horizontal non-textual spaces, wherein each of the one or more contiguous stretches of horizontal non-textual spaces comprises a number of horizontal non-textual bins, and wherein each of the horizontal non-textual bins consists of non-text pixels;
determining a mean vertical coordinate of the non-text pixels in each of the one or more contiguous stretches of horizontal non-textual spaces;
segmenting the textual image into one or more textual line images based on the mean vertical coordinate of each of the one or more contiguous stretches of horizontal non-textual spaces; and
storing each of the one or more textual line images along with a corresponding line number.
14. The system of claim 13, wherein extracting further comprises extracting one or more word images along with one or more corresponding word numbers from each of the one or more textual line images by:
analyzing each of the one or more textual line images to determine one or more contiguous stretches of vertical non-textual spaces, wherein each of the one or more contiguous stretches of vertical non-textual spaces comprises a number of vertical non-textual bins in excess of or equal to a first predetermined vertical threshold, and wherein each of the vertical non-textual bins consists of non-text pixels;
determining a mean horizontal coordinate of the non-text pixels in each of the one or more contiguous stretches of vertical non-textual spaces;
segmenting the textual line image into one or more word images based on the mean horizontal coordinate of each of the one or more contiguous stretches of vertical non-textual spaces; and
storing each of the one or more word images along with a corresponding line number and a corresponding word number.
15. The system of claim 14, wherein extracting further comprises extracting the one or more character images along with one or more corresponding character numbers from each of the one or more word images by:
analyzing each of the one or more word images to determine one or more contiguous stretches of vertical non-textual spaces, wherein each of the one or more contiguous stretches of vertical non-textual spaces comprises a number of vertical non-textual bins, and wherein each of the vertical non-textual bins consists of non-text pixels;
determining a mean horizontal coordinate of the non-text pixels in each of the one or more contiguous stretches of vertical non-textual spaces;
segmenting the word image into one or more character images based on the mean horizontal coordinate of each of the one or more contiguous stretches of vertical non-textual spaces; and
storing each of the one or more character images along with a corresponding line number, a corresponding word number, and a corresponding character number.
16. The system of claim 12, wherein positioning the one or more characters comprises:
sorting the one or more characters in ascending order based on the one or more line numbers, followed by the one or more word numbers, and followed by the one or more character numbers;
comparing a line number of a character with a line number of a preceding character;
upon determining the line number of the character to be greater than the line number of the preceding character, placing the character in a next line;
upon determining the line number of the character to be the same as the line number of the preceding character, comparing a word number of the character with a word number of the preceding character;
upon determining the word number of the character to be greater than the word number of the preceding character, placing the character after a space; and
upon determining the word number of the character to be the same as the word number of the preceding character, placing the character next to the preceding character.
17. The system of claim 12, wherein the position index of the character image further comprises at least one of a page number, a row number, or a column number.
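
The segmentation recited in claims 4 to 9 can be sketched as follows, assuming Python and a binarized image held as a list of rows of 0 (non-text) and 1 (text) pixels. All function names, the word_gap and char_gap parameters, and the choice to round the mean coordinate down are assumptions made for illustration; the claims do not specify how the thresholds are derived.

```python
def blank_runs(is_blank, min_len):
    """Return (start, end) pairs, end exclusive, of blank runs >= min_len."""
    runs, start = [], None
    # A trailing False sentinel closes a run that reaches the last bin.
    for i, blank in enumerate(list(is_blank) + [False]):
        if blank and start is None:
            start = i
        elif not blank and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


def split_rows(image, min_gap):
    """Cut the image at the mean row of each blank stretch; number the pieces."""
    row_blank = [not any(row) for row in image]
    # Mean coordinate of rows start..end-1 is (start + end - 1) / 2, floored.
    cuts = [(s + e - 1) // 2 for s, e in blank_runs(row_blank, min_gap)]
    bounds = [0] + cuts + [len(image)]
    pieces = []
    for top, bottom in zip(bounds, bounds[1:]):
        segment = image[top:bottom]
        if any(any(row) for row in segment):  # drop pieces with no text pixels
            pieces.append(segment)
    return list(enumerate(pieces, start=1))


def split_cols(image, min_gap):
    """Same as split_rows, but along columns (via transposition)."""
    transposed = [list(col) for col in zip(*image)]
    return [(n, [list(row) for row in zip(*piece)])
            for n, piece in split_rows(transposed, min_gap)]


def extract_characters(image, word_gap, char_gap=1):
    """Return {(line, word, char): character image} for a binarized image."""
    result = {}
    for line_no, line in split_rows(image, 1):
        for word_no, word in split_cols(line, word_gap):
            for char_no, char in split_cols(word, char_gap):
                result[(line_no, word_no, char_no)] = char
    return result


page = [
    [1, 0, 1, 0, 0, 0, 1],  # line 1: two characters, a wide gap, one character
    [0, 0, 0, 0, 0, 0, 0],  # blank row separating the lines
    [1, 0, 0, 0, 1, 0, 0],  # line 2: two single-character words
]
characters = extract_characters(page, word_gap=3)
```

In this sketch a gap of at least word_gap blank columns separates words, while any narrower gap inside a word image separates characters, which mirrors the relation between the first and second vertical thresholds in claims 6 and 9.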

Dated this 4th day of January 2018

R Ramya Rao
Of K&S Partners
Agent for the Applicant

Description: TECHNICAL FIELD
This disclosure relates generally to optical character recognition (OCR), and more particularly to a system and method for positioning characters in output text after performing OCR.
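
The positioning of characters in the output text, as recited in claims 10 and 16, can be sketched as follows, assuming Python. The function name position_characters and the representation of each item as a ((line, word, char), character) pair are assumptions made for illustration.

```python
def position_characters(chars):
    """Build output text from ((line, word, char), character) pairs."""
    # Sort ascending by line number, then word number, then character number.
    ordered = sorted(chars, key=lambda item: item[0])
    pieces = []
    previous = None  # (line, word) of the preceding character
    for (line, word, _char), character in ordered:
        if previous is not None:
            if line > previous[0]:
                pieces.append("\n")  # greater line number: start a new line
            elif word > previous[1]:
                pieces.append(" ")   # same line, greater word number: space
            # same line and same word: place next to the preceding character
        pieces.append(character)
        previous = (line, word)
    return "".join(pieces)


recognized = [
    ((1, 2, 1), "b"),
    ((1, 1, 2), "i"),
    ((2, 1, 1), "c"),
    ((1, 1, 1), "h"),
]
text = position_characters(recognized)
```

The input order does not matter here, since the sort restores reading order before the line and word comparisons are made.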

Documents

Application Documents

#	Name	Date
1	201841000369-STATEMENT OF UNDERTAKING (FORM 3) [04-01-2018(online)].pdf	2018-01-04
2	201841000369-REQUEST FOR EXAMINATION (FORM-18) [04-01-2018(online)].pdf	2018-01-04
3	201841000369-REQUEST FOR CERTIFIED COPY [04-01-2018(online)].pdf	2018-01-04
4	201841000369-POWER OF AUTHORITY [04-01-2018(online)].pdf	2018-01-04
5	201841000369-FORM 18 [04-01-2018(online)].pdf	2018-01-04
6	201841000369-FORM 1 [04-01-2018(online)].pdf	2018-01-04
7	201841000369-DRAWINGS [04-01-2018(online)].pdf	2018-01-04
8	201841000369-DECLARATION OF INVENTORSHIP (FORM 5) [04-01-2018(online)].pdf	2018-01-04
9	201841000369-COMPLETE SPECIFICATION [04-01-2018(online)].pdf	2018-01-04
10	201841000369-Proof of Right (MANDATORY) [02-05-2018(online)].pdf	2018-05-02
11	Correspondence by Agent_Form1_07-05-2018.pdf	2018-05-07
12	201841000369-PETITION UNDER RULE 137 [04-03-2021(online)].pdf	2021-03-04
13	201841000369-FORM 3 [04-03-2021(online)].pdf	2021-03-04
14	201841000369-FER_SER_REPLY [04-03-2021(online)].pdf	2021-03-04
15	201841000369-FER.pdf	2021-10-17
16	201841000369-US(14)-HearingNotice-(HearingDate-29-05-2023).pdf	2023-04-21
17	201841000369-POA [28-04-2023(online)].pdf	2023-04-28
18	201841000369-FORM 13 [28-04-2023(online)].pdf	2023-04-28
19	201841000369-Correspondence to notify the Controller [28-04-2023(online)].pdf	2023-04-28
20	201841000369-AMENDED DOCUMENTS [28-04-2023(online)].pdf	2023-04-28
21	201841000369-Written submissions and relevant documents [13-06-2023(online)].pdf	2023-06-13
22	201841000369-FORM-26 [13-06-2023(online)].pdf	2023-06-13
23	201841000369-FORM 3 [13-06-2023(online)].pdf	2023-06-13
24	201841000369-PatentCertificate16-09-2023.pdf	2023-09-16
25	201841000369-IntimationOfGrant16-09-2023.pdf	2023-09-16

Search Strategy

1	SearchStrategyMatrix201841000369E_28-08-2020.pdf