Abstract: This disclosure relates generally to image processing, and more particularly to a system and method for detecting and annotating bold text in an image document. In one embodiment, a method is provided for annotating bold text in an image document. The method comprises receiving the image document, processing the image document to derive a digitized textual image, detecting one or more regions of bold text within the digitized textual image using an adaptive edge rounding filter, and annotating the one or more regions of bold text within the image document. FIG. 4
Claims: WE CLAIM:
1. A method for annotating bold text in an image document, the method comprising:
receiving, by a bold text detection and annotation engine, the image document;
processing, by the bold text detection and annotation engine, the image document to derive a digitized textual image;
detecting, by the bold text detection and annotation engine, one or more regions of bold text within the digitized textual image using an adaptive edge rounding filter; and
annotating, by the bold text detection and annotation engine, the one or more regions of bold text within the image document.
2. The method of claim 1, wherein processing the image document comprises:
generating a digitized image of the image document; and
removing at least one of a noise and a graphical region from the digitized image to derive the digitized textual image.
3. The method of claim 2, wherein the digitized image comprises a binary image.
4. The method of claim 1, wherein detecting the one or more regions of bold text comprises determining a plurality of regions of interest by applying the adaptive edge rounding filter to the digitized textual image.
5. The method of claim 4, wherein applying the adaptive edge rounding filter comprises:
determining a plurality of characters in the digitized textual image;
for each of the plurality of characters,
determining a height of a character;
determining a radius for the character based on a dots per inch (dpi) of the image document and the height of the character;
for each of a plurality of pixels in the character,
determining a plurality of subpixels within the radius of a pixel;
determining a dominant pixel type among the plurality of subpixels; and
setting the pixel to the dominant pixel type.
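The edge rounding steps recited in claim 5 amount to a majority-vote smoothing pass over each character. The sketch below illustrates this, assuming a hypothetical radius heuristic scaled by dpi and character height; the claims do not fix the exact radius expression, so that formula is an illustrative assumption, not the claimed one.

```python
import numpy as np

def edge_rounding_filter(char_img: np.ndarray, dpi: int) -> np.ndarray:
    """Majority-vote smoothing of one binary character image (1 = ink).

    A minimal sketch of the claimed adaptive edge rounding filter; the
    radius formula below is an assumption for illustration only.
    """
    height = char_img.shape[0]
    # Hypothetical radius heuristic: grows with scan dpi and glyph height.
    radius = max(1, round(dpi / 300 * height / 20))
    padded = np.pad(char_img, radius, mode="edge")
    out = np.empty_like(char_img)
    for y in range(char_img.shape[0]):
        for x in range(char_img.shape[1]):
            # Sub-pixels within the radius of the current pixel.
            window = padded[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            # Set the pixel to the dominant pixel type in the window.
            out[y, x] = 1 if window.mean() >= 0.5 else 0
    return out
```

Applied per connected component, this removes stray scanner noise and rounds ragged stroke edges, which makes the subsequent stroke-thickness comparison between bold and regular glyphs more reliable.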
6. The method of claim 4, wherein detecting the one or more regions of bold text further comprises applying a multi-dimensional K-nearest neighbor (KNN) algorithm on the plurality of regions of interest.
7. The method of claim 6, wherein applying the multi-dimensional KNN algorithm comprises:
determining a threshold size for the plurality of regions of interest using the multi-dimensional KNN algorithm;
determining a size of each of the plurality of regions of interest; and
selecting the one or more regions of bold text from the plurality of regions of interest based on the size and the threshold size.
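Claim 7 derives the bold/regular threshold from a multi-dimensional KNN over the regions of interest without fixing the exact rule. One plausible reading, sketched below, smooths each region's size with the average of its k nearest neighbours and splits the smoothed sizes at their midpoint; both the 1-D feature and the midpoint rule are illustrative assumptions.

```python
import numpy as np

def select_bold_regions(sizes: list[float], k: int = 3) -> list[int]:
    """Return indices of regions whose size exceeds a data-driven threshold.

    A sketch of the claimed KNN-based selection, assuming a single size
    feature per region; a multi-dimensional variant would use a feature
    vector (e.g. stroke width, area) per region instead.
    """
    arr = np.asarray(sizes, dtype=float)
    # Pairwise distances between region sizes.
    dists = np.abs(arr[:, None] - arr[None, :])
    # For each region, average the sizes of its k nearest neighbours
    # (column 0 of the argsort is the region itself, so skip it).
    neighbors = np.argsort(dists, axis=1)[:, 1:k + 1]
    knn_avg = arr[neighbors].mean(axis=1)
    # Illustrative threshold: midpoint of the smoothed size range.
    threshold = (knn_avg.min() + knn_avg.max()) / 2
    return [i for i, s in enumerate(arr) if s > threshold]
```

On a page where regular glyphs cluster around one stroke size and bold glyphs around a larger one, the KNN smoothing suppresses outliers so the midpoint falls cleanly between the two clusters.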
8. The method of claim 1, wherein annotating the one or more regions of bold text comprises:
dilating one or more regions, in the image document, corresponding to the one or more regions of bold text to a corresponding word boundary.
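The dilation of claim 8 grows each detected bold region out to the boundary of the word containing it, so the annotation covers whole words rather than fragments. The numpy-only sketch below extends a region sideways until a blank column (an inter-word gap) is reached; a morphological dilation with a wide structuring element would be an equally valid reading, and the gap-scanning rule here is an assumption.

```python
import numpy as np

def dilate_to_word(img: np.ndarray, x0: int, x1: int, y0: int, y1: int):
    """Grow a detected region [x0, x1) sideways to its word boundary.

    `img` is a binary page image (1 = ink); [y0, y1) is the text line
    containing the region. A minimal sketch, assuming words on the line
    are separated by at least one fully blank pixel column.
    """
    band = img[y0:y1]
    while x0 > 0 and band[:, x0 - 1].any():
        x0 -= 1  # extend left while the adjacent column still has ink
    while x1 < band.shape[1] and band[:, x1].any():
        x1 += 1  # extend right while the adjacent column still has ink
    return x0, x1
```

The returned span can then be drawn back onto the original image document as the bold-text annotation.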
9. A system for annotating bold text in an image document, the system comprising:
at least one processor; and
a computer-readable medium storing instructions that, when executed by the at least one processor, cause the at least one processor to perform operations comprising:
receiving the image document;
processing the image document to derive a digitized textual image;
detecting one or more regions of bold text within the digitized textual image using an adaptive edge rounding filter; and
annotating the one or more regions of bold text within the image document.
10. The system of claim 9, wherein detecting the one or more regions of bold text comprises determining a plurality of regions of interest by applying the adaptive edge rounding filter to the digitized textual image.
11. The system of claim 10, wherein applying the adaptive edge rounding filter comprises:
determining a plurality of characters in the digitized textual image;
for each of the plurality of characters,
determining a height of a character;
determining a radius for the character based on a dots per inch (dpi) of the image document and the height of the character;
for each of a plurality of pixels in the character,
determining a plurality of subpixels within the radius of a pixel;
determining a dominant pixel type among the plurality of subpixels; and
setting the pixel to the dominant pixel type.
12. The system of claim 10, wherein detecting the one or more regions of bold text further comprises applying a multi-dimensional K-nearest neighbor (KNN) algorithm on the plurality of regions of interest.
13. The system of claim 12, wherein applying the multi-dimensional KNN algorithm comprises:
determining a threshold size for the plurality of regions of interest using the multi-dimensional KNN algorithm;
determining a size of each of the plurality of regions of interest; and
selecting the one or more regions of bold text from the plurality of regions of interest based on the size and the threshold size.
14. The system of claim 9, wherein annotating the one or more regions of bold text comprises:
dilating one or more regions, in the image document, corresponding to the one or more regions of bold text to a corresponding word boundary.
15. A non-transitory computer-readable storage medium having stored thereon, a set of computer-executable instructions for causing a computer comprising one or more processors to perform steps comprising:
receiving an image document;
processing the image document to derive a digitized textual image;
detecting one or more regions of bold text within the digitized textual image using an adaptive edge rounding filter; and
annotating the one or more regions of bold text within the image document.
16. The non-transitory computer-readable storage medium of claim 15, wherein detecting the one or more regions of bold text comprises determining a plurality of regions of interest by applying the adaptive edge rounding filter to the digitized textual image.
17. The non-transitory computer-readable storage medium of claim 16, wherein applying the adaptive edge rounding filter comprises:
determining a plurality of characters in the digitized textual image;
for each of the plurality of characters,
determining a height of a character;
determining a radius for the character based on a dots per inch (dpi) of the image document and the height of the character;
for each of a plurality of pixels in the character,
determining a plurality of subpixels within the radius of a pixel;
determining a dominant pixel type among the plurality of subpixels; and
setting the pixel to the dominant pixel type.
18. The non-transitory computer-readable storage medium of claim 16, wherein detecting the one or more regions of bold text further comprises applying a multi-dimensional K-nearest neighbor (KNN) algorithm on the plurality of regions of interest.
19. The non-transitory computer-readable storage medium of claim 18, wherein applying the multi-dimensional KNN algorithm comprises:
determining a threshold size for the plurality of regions of interest using the multi-dimensional KNN algorithm;
determining a size of each of the plurality of regions of interest; and
selecting the one or more regions of bold text from the plurality of regions of interest based on the size and the threshold size.
20. The non-transitory computer-readable storage medium of claim 15, wherein annotating the one or more regions of bold text comprises:
dilating one or more regions, in the image document, corresponding to the one or more regions of bold text to a corresponding word boundary.
Dated this 30th day of March, 2017
Swetha SN
Of K&S Partners
Agent for the Applicant
Description: TECHNICAL FIELD
This disclosure relates generally to image processing, and more particularly to a system and method for detecting and annotating bold text in an image document.
| # | Name | Date |
|---|---|---|
| 1 | Power of Attorney [30-03-2017(online)].pdf | 2017-03-30 |
| 2 | Form 5 [30-03-2017(online)].pdf | 2017-03-30 |
| 3 | Form 3 [30-03-2017(online)].pdf | 2017-03-30 |
| 4 | Form 18 [30-03-2017(online)].pdf_121.pdf | 2017-03-30 |
| 5 | Form 18 [30-03-2017(online)].pdf | 2017-03-30 |
| 6 | Form 1 [30-03-2017(online)].pdf | 2017-03-30 |
| 7 | Drawing [30-03-2017(online)].pdf | 2017-03-30 |
| 8 | Description(Complete) [30-03-2017(online)].pdf_120.pdf | 2017-03-30 |
| 9 | Description(Complete) [30-03-2017(online)].pdf | 2017-03-30 |
| 10 | PROOF OF RIGHT [21-06-2017(online)].pdf | 2017-06-21 |
| 11 | Correspondence By Agent_Form1_23-06-2017.pdf | 2017-06-23 |
| 12 | 2020-06-1212-11-11E_16-06-2020.pdf | |
| 13 | 201741011335-FER.pdf | 2020-06-17 |
| 14 | 201741011335-Information under section 8(2) [13-12-2020(online)].pdf | 2020-12-13 |
| 15 | 201741011335-FORM 3 [13-12-2020(online)].pdf | 2020-12-13 |
| 16 | 201741011335-PETITION UNDER RULE 137 [14-12-2020(online)].pdf | 2020-12-14 |
| 17 | 201741011335-FER_SER_REPLY [14-12-2020(online)].pdf | 2020-12-14 |
| 18 | 201741011335-IntimationOfGrant19-12-2023.pdf | 2023-12-19 |
| 19 | 201741011335-PatentCertificate19-12-2023.pdf | 2023-12-19 |
| 20 | 201741011335-PROOF OF ALTERATION [18-03-2024(online)].pdf | 2024-03-18 |