
Localization And Detection Of Text Regions Of Video Sequences

Abstract: The invention disclosed relates to a method and system for localization and detection of text regions of video sequences, the method comprising the following steps: a) compressing the video sequence using an H.264 hybrid video codec; b) calculating the first order differences of DC coefficients of the integer transformed residual part of the luma components in the P frames of said compressed video sequence, comprising the steps of: i) computing said DC coefficient values for each of the 4x4 sub-blocks obtained from said codec; ii) obtaining said first order differences of said DC coefficient values of neighboring sub-blocks in x and y directions; iii) applying low pass filtering on said obtained first order differences to smoothen the data; iv) defining a threshold value based on which the candidate text regions can be detected; and v) detecting said candidate text regions by comparing the values of said obtained first order differences with said threshold value; c) calculating the vertical edge strengths of each macro block of said compressed video sequence using a deblocking filter; and d) detecting text regions with the help of the values of said first order differences and the values of said vertical edge strengths using a text region detector in a predetermined manner.


Patent Information

Application #: 2236/MUM/2008
Filing Date: 17 October 2008
Publication Number: 32/2010
Publication Type: INA
Invention Field: COMMUNICATION
Status:
Parent Application:
Patent Number:
Legal Status:
Grant Date: 2018-06-26
Renewal Date:

Applicants

TATA CONSULTANCY SERVICES LIMITED
NIRMAL BUILDING, 9TH FLOOR, NARIMAN POINT, MUMBAI-400021, MAHARASHTRA, INDIA

Inventors

1. CHATTOPADHYAY TANUSHYAM
TATA CONSULTANCY SERVICES BENGAL INTELLIGENT PARK, BUILDING-D, PLOT NO. A2,M2 & N2, BLOCK-EP, SALT LAKE ELECTRONIC COMPLEX, SECTOR-V, KOLKATA-70001, WEST BENGAL, INDIA
2. SINHA ANIRUDDHA
TATA CONSULTANCY SERVICES BENGAL INTELLIGENT PARK, BUILDING-D, PLOT NO. A2,M2 & N2, BLOCK-EP, SALT LAKE ELECTRONIC COMPLEX, SECTOR-V, KOLKATA-70001, WEST BENGAL, INDIA
3. CHAKI AYAN
TATA CONSULTANCY SERVICES BENGAL INTELLIGENT PARK, BUILDING-D, PLOT NO. A2,M2 & N2, BLOCK-EP, SALT LAKE ELECTRONIC COMPLEX, SECTOR-V, KOLKATA-70001, WEST BENGAL, INDIA

Specification

FORM 2
THE PATENTS ACT, 1970 (39 of 1970) & THE PATENTS RULES, 2003
PROVISIONAL SPECIFICATION
(See Section 10 and Rule 13)
LOCALIZATION AND DETECTION OF TEXT REGIONS OF VIDEO SEQUENCES


TATA CONSULTANCY SERVICES LTD.,
an Indian Company of Nirmal Building, 9th Floor, Nariman Point, Mumbai 400 021, Maharashtra, India
THE FOLLOWING SPECIFICATION DESCRIBES THE INVENTION

This invention relates to localization and detection of text regions of video sequences.
In particular, this invention envisages a novel way of detecting text images in a video using the compressed domain information of an H.264 codec.
In particular, this invention relates to the automatic detection of text regions from an input video sequence.
In particular, this invention relates to a non-threshold based technique for detection of text regions in an input sequence.
Background of the Invention:
Different commercial agencies spend millions of dollars on advertisement and sponsorship for sports events like tennis, soccer, cricket and many other popular sports. They usually place their logos and trademarks on billboards placed around the ground. In cricket it is also very popular to paint a trademark or logo on the bowler's run-up and behind the stumps to get better visibility. But it is difficult for the advertising agencies to track the benefit or business they are getting out of those sponsorships. Till now, they have usually verified the visibility of their brand by annotating the video manually. But these manual methods are tedious, labor intensive and thus costly. So if these trademarks and logos can be recognized automatically from the streamed/broadcast videos, it will be helpful for the marketing cells of those organizations to monitor their visibility against the advertisement cost. Moreover, if a less computationally expensive method can be proposed to recognize the trademarks, it can give rise to a new e-commerce business model called channel hyperlinking.

In this concept, at the client end, a hyperlink is placed on the portion of the video containing the trademarks and logos so that the viewer can directly browse to the website of the company and place an order.
The textual part of a video carries a good amount of semantic information about the video data. Different applications like keyword based video search, automatic video indexing, multilingual subtitle generation and channel hyperlinking can be developed by recognizing the text embedded in the video. In the current invention the use case of channel hyperlinking is considered. This invention describes a practical and reliable solution/approach to achieve automated tracking and localization of text regions from a streamed video. In the present invention, the features obtained from H.264 compressed domain data are used. The current invention focuses on the problem of identification and localization of the text regions from a video stream in real time only.
In accordance with this invention there is envisaged a novel way of detecting text images in a video using the compressed domain information of a block based hybrid video codec.
In particular, this invention envisages a novel way of detecting text images in a video using the compressed domain information of an H.264 codec. In accordance with this invention, a system is envisaged wherein the entire video is split into several smaller logical units, each having a text region, using compressed domain features of H.264.
This invention is therefore designed to detect the text regions and also locate the position of the text regions in real time.

This invention provides a method for text region detection from a compressed H.264 input video data using compressed domain information of H.264 hybrid video codec comprising the following steps:
(a) calculating the first order difference of the DC coefficients of the integer transformed residual part of the luma component in the P frames/fields;
(b) calculating the vertical edge strengths computed from the de-blocking filter; and
(c) detecting a text region by evaluating the parameters calculated in steps (a) and (b).
Typically, step (a) involves calculating the first order difference of the DC coefficients of the luma and chroma values for each 4x4 sub-block.
Particularly, step (a) includes the steps of: (i) obtaining a region at which a change is found in the DC values of the integer transformed coefficients of the luma and chroma components; (ii) applying a prescribed low pass filter to smoothen the data; and (iii) defining a threshold value based on which the candidate text region can be detected.
Typically, step (b) includes the step of: using the de-blocking filter based features in text region detection.

Particularly, step (b) includes the step of: using the vertical edge strength of the de-blocking filter.
The method of the invention further comprises the step of: estimating a candidate text region using the methods described hereinabove.
Particularly, the method of the invention includes the step of filtering all candidate text regions based on (a) temporal and (b) spatial information.
Particularly, step (a) checks the continuity of the text region in consecutive frames, typically by removing all spurious candidate regions using a predetermined logic.
Particularly, step (b) includes the step of using the coordinate information of the candidate text regions to identify the text regions.
More specifically, step (b) uses the median and mode of the coordinate information of the candidate text regions to identify the text regions.
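By way of illustration only, the following minimal Python sketch shows one way the two per-macroblock feature maps of steps (a) and (b) above might be combined in step (c). The array names, their layout (one value per macroblock) and the threshold parameters are assumptions made for this sketch and are not part of the disclosure.

    import numpy as np

    def candidate_text_mask(dc_diff, vertical_edge_strength,
                            dc_threshold, edge_threshold):
        """Return a boolean mask of candidate text macroblocks.

        dc_diff: 2-D array, per-macroblock magnitude of the first order
            differences of the luma DC coefficients (step (a)).
        vertical_edge_strength: 2-D array of the same shape, per-macroblock
            vertical edge strength taken from the de-blocking filter (step (b)).
        """
        high_contrast = np.abs(dc_diff) > dc_threshold
        strong_vertical_edge = vertical_edge_strength >= edge_threshold
        return high_contrast & strong_vertical_edge

A macroblock is retained only when both features indicate text, which is one simple way of evaluating the two parameters jointly; the actual combination logic used by the text region detector may differ.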
Brief Description of the Accompanying Drawings:
The invention will now be described with reference to the accompanying drawings, in which
Figure 1 gives an overview of the method in accordance with this invention.
Detailed Description of the Invention:

Different types of text can be found in sports videos, such as trademarks painted on billboards, trademarks on the jersey of a player, trademarks painted on the ground, score cards, and the texts introduced by video editing. In the current invention the text regions are identified from compressed domain features of H.264 and the trademarks are identified from scorecards. Our proposed method is based on two features which can be directly obtained from the compressed domain H.264 stream while decoding it. In this section we shall first discuss these features and why they are used in the proposed method, and finally the decision making process based on these features. The overview of the process is described in Figure 1 of the accompanying drawings.
The two features used for text region identification from compressed domain information are described below:
DC component of transformed luma coefficient based features: In H.264 a 4x4 integer transformation is used, which is different from the 8x8 DCT transformation of the MPEG series of video codecs.
The DC components of the integer transformed luma coefficients are a representative of the original video at a lower resolution. In H.264, unlike previous video codecs, a 4x4 block size is used, so it gives this iconic representation of the video more precisely. The pseudo code used in the proposed method is given below:
Get the luma DC value (dc1) for each 4x4 sub-block from the decoder.
Compute the first order differences (Δx(dc1) and Δy(dc1)) of dc1 with the neighboring sub-blocks in the x and y directions.
From observation it is found that the difference is very high for a high contrast region.
Obtain such differences for different video sequences (not including the test sequences).
Run the K-Means algorithm (with K = 2) on them and find the centroid of the high valued cluster.
If the difference is greater than the experimentally obtained threshold value, mark that MB as a candidate text MB in this frame and store this MB number in an array (a1).
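As one possible, non-authoritative rendering of the above pseudo code in Python, the sketch below computes the first order differences of the 4x4 luma DC values, smooths them with a simple box filter as the low pass step, estimates a threshold with a small two-cluster K-Means, and marks the macroblocks whose smoothed difference exceeds that threshold. The array layout and the use of the current frame as training data for the threshold are illustrative assumptions.

    import numpy as np

    def mark_candidate_mbs(dc):
        """dc: 2-D array of luma DC values, one value per 4x4 sub-block.

        Returns the set of (row, col) indices of candidate 16x16 macroblocks.
        """
        # First order differences with the neighbouring sub-blocks in x and y.
        dx = np.abs(np.diff(dc, axis=1, prepend=dc[:, :1]))
        dy = np.abs(np.diff(dc, axis=0, prepend=dc[:1, :]))
        diff = dx + dy

        # Low pass filtering: a 3x3 box filter to smoothen the data.
        padded = np.pad(diff, 1, mode='edge')
        smooth = sum(padded[i:i + diff.shape[0], j:j + diff.shape[1]]
                     for i in range(3) for j in range(3)) / 9.0

        # Tiny 1-D K-Means with K = 2; the centroid of the high valued
        # cluster is used as the threshold (training on the frame itself
        # here, purely for illustration).
        lo, hi = float(smooth.min()), float(smooth.max())
        for _ in range(10):
            split = (lo + hi) / 2.0
            low, high = smooth[smooth <= split], smooth[smooth > split]
            if low.size and high.size:
                lo, hi = float(low.mean()), float(high.mean())
        threshold = hi

        # A 16x16 macroblock covers a 4x4 grid of sub-blocks; mark the MB
        # as a candidate if any of its sub-blocks exceeds the threshold.
        candidates = set()
        rows, cols = smooth.shape
        for r in range(0, rows - rows % 4, 4):
            for c in range(0, cols - cols % 4, 4):
                if (smooth[r:r + 4, c:c + 4] > threshold).any():
                    candidates.add((r // 4, c // 4))
        return candidates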
De-blocking filter based feature:
The main purpose of placing trademarks on billboards and on the ground is to get visibility, and thus they are designed to be easily read. As a consequence, the text results in strong edges at the boundaries of text and background. Detecting such edges explicitly, however, would require additional time complexity. One of the new features of H.264, which was not there in any previous video codec, is the de-blocking (DB) filter. This invention uses the edge information extracted from the decoder during the process of decoding, without computing the edges explicitly.
A conditional filtering is applied to all 4x4 luma block edges of a picture, except for edges at the boundary of the picture and any edges for which the DB filter is disabled by disable_deblocking_filter_idc, as specified in the slice header. This filtering process is performed on a macroblock basis after the picture construction process, prior to the DB filter process for the entire decoded picture, with all macroblocks in a picture processed in order of increasing macroblock addresses. For each macroblock and each component, vertical edges are filtered first. The process starts with the edge on the left-hand side of the macroblock and proceeds through the edges towards the right-hand side of the macroblock in their geometrical order. Then horizontal edges are filtered, starting with the edge on the top of the macroblock and proceeding through the edges towards the bottom of the macroblock in their geometrical order.
The pseudo code for selecting the candidate frames using this feature is given below:
Get the strength of the DB filter for each MB.
If it is a strong vertical edge, mark that MB as a candidate one.
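A minimal sketch of this selection step, assuming the decoder exposes, for each macroblock, the list of boundary strength values assigned by the de-blocking filter to its vertical 4x4 edges (H.264 assigns boundary strengths 0 to 4, with 4 the strongest); the data structure and the "strong edge" criterion are assumptions made for illustration.

    def mark_strong_vertical_edge_mbs(vertical_bs, strong_bs=4):
        """vertical_bs: dict mapping a macroblock index to the list of
        boundary strengths of that macroblock's vertical 4x4 edges, as
        collected from the de-blocking filter stage of the decoder.

        Returns the set of macroblock indices having a strong vertical edge.
        """
        return {mb for mb, strengths in vertical_bs.items()
                if any(bs >= strong_bs for bs in strengths)}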
Identifying the text regions:
The proposed methodology is based on the following assumptions, derived from observations of different sports video clippings:
(i) Text regions should have a high contrast.
(ii) Texts should be aligned horizontally.
(iii) Texts have a strong vertical edge with the background.
(iv) Texts of trademarks and scorecards persist in the video for at least 0.2 seconds.
Some of the texts on billboards are not perceptible, and therefore only the perceptible texts in the video are considered in this invention. The above mentioned features (DC based and de-blocking based) together take care of observations (i) and (ii). This invention uses the first assumption to identify only the MBs with high contrast, and the second assumption is used to identify the strong vertical edges. The pseudo code for removing the non-textual part is as below:
Step 1. For each candidate MB, identify the X and Y coordinates of the top left position of the MB (cx and cy).
Step 2. Find the frequency (fr) of candidate MBs in each row.
Step 3. Remove all MBs from a1 if fr < 2.
Step 4. Check for continuity of MBs in each row: for this, check the column number (cx) of the candidate MBs in the row.
Step 5. If cx(i+1) - cx(i) > 2, unmark the MB from a1, where cx(i) is the column number of the ith candidate MB in a particular row.
Step 6. To ensure time domain filtering, one frame is stored in the buffer and the (i-1)th frame is displayed while decoding the ith frame.
Step 7. Unmark all candidate MBs in the ith frame if there is no candidate MB in the adjacent (i-1)th and (i+1)th frames.
Step 8. Finally, all marked candidate MBs are considered as text content in the video.
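The sketch below mirrors Steps 1 to 8 for one frame, assuming the candidate macroblocks of the previous, current and next frames are available as sets of (cx, cy) coordinates (the one-frame buffering of Step 6 is what makes the next frame available). The per-position reading of Step 7 and the neighbour test used for Step 5 are interpretations made only for illustration.

    from collections import defaultdict

    def filter_candidate_mbs(prev_mbs, curr_mbs, next_mbs):
        """Each argument is a set of (cx, cy) macroblock coordinates marked
        as candidates in the (i-1)th, ith and (i+1)th frames respectively.

        Returns the candidate MBs of the ith frame kept as text content.
        """
        # Steps 1-2: group the candidate MBs of the current frame by row
        # and count the candidates per row (fr).
        rows = defaultdict(list)
        for cx, cy in curr_mbs:
            rows[cy].append(cx)

        kept = set()
        for cy, cols in rows.items():
            if len(cols) < 2:          # Step 3: remove rows with fr < 2
                continue
            cols.sort()
            # Steps 4-5: keep an MB only if it has a neighbouring candidate
            # within two columns on at least one side (one interpretation of
            # the continuity check cx(i+1) - cx(i) > 2).
            for i, c in enumerate(cols):
                left_ok = i > 0 and c - cols[i - 1] <= 2
                right_ok = i + 1 < len(cols) and cols[i + 1] - c <= 2
                if left_ok or right_ok:
                    kept.add((c, cy))

        # Step 7: unmark MBs with no candidate at the same position in
        # either adjacent frame.  Step 8: whatever remains is text content.
        return {mb for mb in kept if mb in prev_mbs or mb in next_mbs}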
Although the invention has been described in terms of particular embodiments and applications, one of ordinary skill in the art, in light of this teaching, can generate additional embodiments and modifications without departing from the spirit of, or exceeding the scope of, the claimed invention. Accordingly, it is to be understood that the drawings and descriptions herein are offered by way of example to facilitate comprehension of the invention and should not be construed to limit the scope thereof.
Dated this 17th day of October 2008.
Mohan Dewan of R. K. Dewan & Co., Applicants' Patent Attorneys.

Documents

Orders

Section Controller Decision Date

Application Documents

# Name Date
1 2236-MUM-2008-FORM 5(15-10-2009).pdf 2009-10-15
2 2236-MUM-2008-RELEVANT DOCUMENTS [28-09-2023(online)].pdf 2023-09-28
3 2236-MUM-2008-FORM 2(TITLE PAGE)-(15-10-2009).pdf 2009-10-15
4 2236-MUM-2008-RELEVANT DOCUMENTS [26-09-2022(online)].pdf 2022-09-26
5 2236-MUM-2008-RELEVANT DOCUMENTS [29-09-2021(online)].pdf 2021-09-29
6 2236-mum-2008-form 2(15-10-2009).pdf 2009-10-15
7 2236-MUM-2008-RELEVANT DOCUMENTS [29-03-2020(online)].pdf 2020-03-29
8 2236-MUM-2008-DRAWING(15-10-2009).pdf 2009-10-15
9 2236-MUM-2008-RELEVANT DOCUMENTS [23-03-2019(online)].pdf 2019-03-23
10 2236-MUM-2008-DESCRIPTION(COMPLETE)-(15-10-2009).pdf 2009-10-15
11 2236-MUM-2008-CORRESPONDENCE(7-11-2008).pdf 2018-08-09
12 2236-MUM-2008-CORRESPONDENCE(15-10-2009).pdf 2009-10-15
13 2236-mum-2008-correspondence.pdf 2018-08-09
14 2236-MUM-2008-CLAIMS(15-10-2009).pdf 2009-10-15
15 2236-MUM-2008-ABSTRACT(15-10-2009).pdf 2009-10-15
16 2236-mum-2008-description(provisional).pdf 2018-08-09
17 2236-MUM-2008-FORM 18(18-11-2010).pdf 2010-11-18
18 2236-MUM-2008-CORRESPONDENCE(18-11-2010).pdf 2010-11-18
19 2236-mum-2008-drawing.pdf 2018-08-09
20 2236-MUM-2008-FER.pdf 2018-08-09
21 Other Patent Document [05-10-2016(online)].pdf 2016-10-05
22 2236-MUM-2008-FORM 1(7-11-2008).pdf 2018-08-09
23 2236-MUM-2008-OTHERS [24-08-2017(online)].pdf 2017-08-24
24 2236-MUM-2008-FER_SER_REPLY [24-08-2017(online)].pdf 2017-08-24
25 2236-mum-2008-form 1.pdf 2018-08-09
26 2236-MUM-2008-CORRESPONDENCE [24-08-2017(online)].pdf 2017-08-24
27 2236-mum-2008-form 2(title page).pdf 2018-08-09
28 2236-MUM-2008-COMPLETE SPECIFICATION [24-08-2017(online)].pdf 2017-08-24
29 2236-MUM-2008-CLAIMS [24-08-2017(online)].pdf 2017-08-24
30 2236-mum-2008-form 2.pdf 2018-08-09
31 2236-mum-2008-form 26.pdf 2018-08-09
32 2236-MUM-2008-ABSTRACT [24-08-2017(online)].pdf 2017-08-24
33 2236-mum-2008-form 3.pdf 2018-08-09
34 2236-MUM-2008-FORM-26 [12-04-2018(online)].pdf 2018-04-12
35 2236-MUM-2008-HearingNoticeLetter.pdf 2018-08-09
36 2236-MUM-2008-Written submissions and relevant documents (MANDATORY) [27-04-2018(online)].pdf 2018-04-27
37 2236-MUM-2008-ORIGINAL UNDER RULE 6 (1A)-050917.pdf 2018-08-09
38 2236-MUM-2008-PatentCertificate26-06-2018.pdf 2018-06-26
39 2236-MUM-2008-IntimationOfGrant26-06-2018.pdf 2018-06-26
40 2236-MUM-2008-ORIGINAL UR 6( 1A) FORM 26-200418.pdf 2018-08-09
41 abstract1.jpg 2018-08-09

Search Strategy

1 SEARCH_STRATEGY_2236MUM2008_22-06-2017.pdf

ERegister / Renewals

3rd: 09 Aug 2018 (From 17/10/2010 To 17/10/2011)
4th: 09 Aug 2018 (From 17/10/2011 To 17/10/2012)
5th: 09 Aug 2018 (From 17/10/2012 To 17/10/2013)
6th: 09 Aug 2018 (From 17/10/2013 To 17/10/2014)
7th: 09 Aug 2018 (From 17/10/2014 To 17/10/2015)
8th: 09 Aug 2018 (From 17/10/2015 To 17/10/2016)
9th: 09 Aug 2018 (From 17/10/2016 To 17/10/2017)
10th: 09 Aug 2018 (From 17/10/2017 To 17/10/2018)
11th: 09 Aug 2018 (From 17/10/2018 To 17/10/2019)
12th: 26 Sep 2019 (From 17/10/2019 To 17/10/2020)
13th: 16 Oct 2020 (From 17/10/2020 To 17/10/2021)
14th: 21 Sep 2021 (From 17/10/2021 To 17/10/2022)
15th: 06 Oct 2022 (From 17/10/2022 To 17/10/2023)
16th: 12 Oct 2023 (From 17/10/2023 To 17/10/2024)
17th: 30 Sep 2024 (From 17/10/2024 To 17/10/2025)
18th: 14 Oct 2025 (From 17/10/2025 To 17/10/2026)