
Method And Apparatus To Enable Automatic Detection Of 3D Video Format

Abstract: A system to create a fully automatic format detection engine, comprising a medium for extracting relevant pixels and modules that enable identification of the appropriate video format to be passed on to the 3D processing unit.


Patent Information

Application #
Filing Date
07 February 2011
Publication Number
32/2012
Publication Type
INA
Invention Field
COMMUNICATION
Status
Email
Parent Application

Applicants

VALUABLE INNOVATIONS PRIVATE LIMITED
602, 1ST FLOOR, CENTRE POINT, J. B. NAGAR, ANDHERI-KURLA ROAD ANDHERI(E), MUMBAI-400059, INDIA

Inventors

1. MR. SANJAY GAIKWAD
602, 1ST FLOOR, CENTRE POINT, J. B. NAGAR, ANDHERI-KURLA ROAD ANDHERI(E), MUMBAI-400059, INDIA
2. MR. AMEYA HETE
602, 1ST FLOOR, CENTRE POINT, J. B. NAGAR, ANDHERI-KURLA ROAD ANDHERI(E), MUMBAI-400059, INDIA
3. MR. RAJARSHI MUKHOPADHYAY
602, 1ST FLOOR, CENTRE POINT, J. B. NAGAR, ANDHERI-KURLA ROAD ANDHERI(E), MUMBAI-400059, INDIA
4. MR. SELVAKUMAR JAWAHAR
602, 1ST FLOOR, CENTRE POINT, J. B. NAGAR, ANDHERI-KURLA ROAD ANDHERI(E), MUMBAI-400059, INDIA

Specification

FORM 2
THE PATENTS ACT, 1970 [39 OF 1970]
COMPLETE SPECIFICATION [SEE SECTION 10 & RULE 13]
1 TITLE
METHOD AND APPARATUS TO ENABLE AUTOMATIC DETECTION OF 3D VIDEO FORMAT
2 APPLICANT
NAME VALUABLE INNOVATIONS PRIVATE LIMITED
ADDRESS 602, 1ST FLOOR, CENTRE POINT, J. B. NAGAR,
ANDHERI - KURLA ROAD, ANDHERI (E), MUMBAI - 400059.
NATIONALITY INDIA
The following specification particularly describes the nature of the invention:-

FIELD OF INVENTION AND USE OF INVENTION
The present invention relates to the field of digital video technology, specifically 3D video processing. The invention described here relates to the process and system of detecting a 3D video format on the fly and signaling the format information to a relevant 3D processing unit so that the processing unit can process the video appropriately. The invention can be used in particular in scenarios where 2D video content arrives intermittently alongside 3D video content. Moreover, the system relieves users of the need to provide format information to the 3D display or processing units, as the entire detection process is automated.
PRIOR ART AND PROBLEMS TO BE SOLVED
The invention of the stereoscope in 1947 has enabled 3D viewing of video content, wherein the Left Frame video sequence and Right Frame video sequence are combined in such a way that at any point of time a viewer views either a Left Frame with the left eye or a Right Frame with the right eye.
With the rapid penetration of digital video technology, numerous applications and new methods have emerged to pack stereoscopic video into the existing digital video technology framework, so that viewers can enjoy 3D viewing without major changes to their decoder setup. Many new 3D display technologies are being announced that try to address the efficiency and cost reduction of the overall system.
Today, the Left Frame video sequence and Right Frame video sequence are combined in such a way that, to a video decoder device, the stream is no different from usual 2D content. The real transformation happens at the display unit or a 3D processing unit. Although multiple formats are available to pack 3D video content so that it integrates seamlessly with the existing digital infrastructure, the most dominant are the Side-By-Side, Frame Sequential, Checker-Box and Top-Down formats. The 3D processing units or display units need to understand these formats in order to extract the Left Frame and Right Frame information correctly.
The 3D processing unit or display unit requires the format information to process the data correctly. In most devices prevalent in the market today, this input is taken from the user. However, providing format information is not a trivial task for a lay user.

Hence there is a need to automate this process of detecting the right format and signaling the same to a target 3D processing device.
One way this problem has been addressed is in the recent HDMI specification 1.4 and later, whereby format metadata can be embedded as part of the HDMI protocol. This will certainly work with those 3D display devices that are fully HDMI 1.4 compliant. However, many other devices are non-HDMI or support an older protocol, and such devices will not detect the format information automatically.
Addressing the same problem, US Patent Application No. 2010/0026783 describes a method to detect the video format. In that method, the video content is embedded with a specific color or pattern at the content generation end, and the same pattern is deciphered by the processing device and interpreted accordingly.
This method works well for a system in which the receiver or processing unit is aware of the pattern, but it becomes a limitation for a system that does not understand the pattern. Although the process described there definitely automates the format detection at the receiver end, it still requires manual input at the content generation end with respect to the format. In this sense the above method is not a truly automatic detection process.
OBJECTS OF THE INVENTION
The previous section discussed some of the existing methods and prior art. Although existing systems address the problem of auto-detecting the format to some extent at the receiver end, some type of user input is still required at the content generation side; the system is therefore still not fully automated. Keeping this in view, the principal object of the invention is to create a truly automatic format detection engine that does not require any manual input in the entire value chain of the content creation and delivery process.
Another object of the invention is to work on the raw video signal and not be tied to a specific technology or protocol.
A further object of the invention is to produce a very simple output which can be easily integrated with existing 3D processing devices or 3D display units.
SUMMARY OF THE INVENTION
The present invention provides a system which works like a black box: an input video is processed and the format information is flagged to a connected 3D processing device. The system comprises a pixel extractor module, which extracts relevant pixels as per the target format, and an AFDM (Auto Format Detection Module), which exploits the principle of perceptible intensity variations between two successive lines of pixels in a specific Region of Interest to assert whether the said video has an edge in a particular direction, and thereby helps determine the appropriate video format in use.
BRIEF DESCRIPTION OF THE DRAWINGS
This section gives a brief description of the accompanying drawings, which is essential to understanding the method and the advantages of the present invention.
Figure-1 is a block diagram describing a prior art system.
Figure-2 is a block diagram describing how the present system fits into the prior art system.
Figure-3 describes the system in detail and how the various blocks are interconnected.
Figure-4 describes the Side-By-Side packing format and the desired Region of Interest on which the detection process should be employed.
Figure-5 describes the Checker-Box packing format and the desired Region of Interest on which the detection process should be employed.
Figure-6 describes the Top-Down packing format and the desired Region of Interest.
Figure-7 describes the principle of intensity variation when two frames are stitched together in a single frame.
Figure-8 depicts the lack of intensity variation at the mid-point when only a single frame is present.
Figure-9 describes the process flow that takes place as part of the detection process.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
With respect to Figure-1, a block diagram of a prior hardware system architecture is now described. In prior art systems the video signal 104 is an input to the 3D processing unit 100. Along with the video input, the processing unit 100 also requires format information as described in Figures 3, 4 and 5. The processing unit 100 processes the input video according to this format information, which typically comes as a user input or as part of the in-band HDMI protocol. The output of the 3D processing unit 100 is an appropriately formatted video suitable for the 3D display device.

In one embodiment of the present invention, as depicted in Figure-2, the input video signal is split and one part of it is taken as input by the Auto Format Detection Module (AFDM) 203. The AFDM 203 in turn processes the input video and outputs the appropriate format information 205 to the 3D Processing Unit 200.
In one embodiment, Figure-7 shows the formation of an edge at the middle of the frame. This edge forms because two video frames are present in the same frame, one belonging to the Left Frame and the other to the Right Frame. It is observed that in most cases these two vertical lines, i.e. the right-most column of the Left Frame and the left-most column of the Right Frame, will have little correlation, and hence there will be perceptible intensity variations along most of the middle column of pixels of the frame.
In another embodiment, Figure-8 shows that no visible vertical edge is formed at the mid-point of the frame, because the successive vertical lines of pixels belong to the same natural video frame. In most cases these successive pixel lines will be highly correlated, and hence the intensity variation will be imperceptible.
It is concluded that in most cases the vertical pixel line at the right-most edge of one frame and the vertical pixel line at the left-most edge of the other frame will have little correlation, and hence perceptible intensity variation will be observed along most of the vertical line of pixels. The present invention exploits this phenomenon to extract the 3D format information from an input video frame. The subsequent clauses explain various methods of implementation that can be employed to extract the format information.
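To make the principle concrete, the following is a minimal illustrative sketch (not part of the specification; the function name and the synthetic frame are assumptions). It shows that the mean absolute luminance difference between the two columns at the seam of a side-by-side packed frame is far larger than between two adjacent columns inside a single half:

```python
def column_difference(frame, col):
    """Mean absolute luminance difference between column `col` and
    column `col + 1`, taken over all rows of the frame."""
    return sum(abs(row[col] - row[col + 1]) for row in frame) / len(frame)

# Synthetic 4x8 side-by-side frame: the left half and right half are
# unrelated, so a strong vertical edge appears at the seam (columns 3/4).
frame = [[10, 12, 14, 16, 200, 198, 196, 194] for _ in range(4)]

seam = column_difference(frame, 3)    # across the seam: large
inner = column_difference(frame, 1)   # inside one half: small
```

The detection engines described below all reduce to this comparison, applied at the column or row addresses where the two packed frames meet.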
In one embodiment of the present invention, shown in Figure-3, the input video signal is processed by the color space converter module 304. The color space converter module 304 outputs only the luminance values at the pixel clock rate. The output of the color space converter is then fed into an RoI extractor module 303 and into the Format Detection Engine 305. The RoI extractor module outputs the desired Region of Interest and feeds the target pixel values to the Format Detection Engine 305. The Format Detection Engine in turn processes the pixel data and outputs the required format information 309.
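As an illustration of the color space converter's role, the sketch below computes luminance from one RGB sample. The specification does not name a color space, so the ITU-R BT.601 weights and the function are assumptions for illustration only:

```python
def rgb_to_luma(r, g, b):
    """Convert one RGB sample to luminance using ITU-R BT.601 weights
    (an assumed color space; the specification does not name one).
    Only this luma value is passed downstream at the pixel clock rate."""
    return 0.299 * r + 0.587 * g + 0.114 * b
```

Discarding chroma at this stage is what allows the later stages to work on a single value per pixel.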
In one embodiment, as shown in Figure-3, the output of the RoI extractor module 303 may be taken into a Buffer Module 306. The Buffer Module 306 separates out the pixel arrays of the Left Frame and Right Frame for RoI 602, as depicted in Figure-6. The output of the Buffer Module 306 will therefore be two arrays of pixels, one for the Left Frame and the other for the Right Frame.
In another embodiment, Figure-4 depicts the Side-By-Side format, in which the pixels follow a particular arrangement: the Left Frame data normally occupies the first half of the frame in the horizontal direction and the Right Frame data occupies the remaining half. A 3D processing unit that knows the input video is in Side-By-Side format can process the data appropriately to churn out a 3D-display-compliant video stream.
In another embodiment, with respect to Figure-4, the following describes how the present invention extracts the desired pixels in the Side-By-Side case. The Format Detection Engine samples the pixel data at the pixel clock rate and processes only those pixel luminance values which belong to the desired Region of Interest (RoI) 402. A row-wise sample count is initially set to 0 and is incremented by 1 every time a new pixel value is encountered. The process checks whether the pixel belongs to column address Width/2-1 or Width/2. If the pixel belongs to column address Width/2-1, it is stored in a temporary buffer for delayed processing. If the pixel belongs to column address Width/2, the current pixel and the previous pixel stored in the temporary buffer are processed.
The pixels at column address Width/2-1 are the right-most edge pixels of the Left Frame, and the pixels at column address Width/2 are the left-most edge pixels of the Right Frame. The invention tries to determine whether an edge is forming in the Region of Interest 402. It computes the absolute difference between the two pixel values and checks whether the value is greater than a predetermined threshold. If it is greater than the threshold, a variable called EdgeSampleCount is incremented by 1; otherwise EdgeSampleCount is set to 0.
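The per-row procedure above can be sketched as follows. This is an illustrative rendering only; the function name, threshold value and synthetic frame are assumptions, not from the specification:

```python
def count_edge_samples_sbs(frame, threshold):
    """Side-By-Side check: for each row, buffer the pixel at column
    Width/2-1 (right-most column of the Left Frame) and compare it
    with the pixel at column Width/2 (left-most column of the Right
    Frame). EdgeSampleCount grows on each pair whose absolute
    difference exceeds the threshold and resets to 0 otherwise."""
    width = len(frame[0])
    edge_sample_count = 0
    for row in frame:
        previous = row[width // 2 - 1]   # temporary buffer
        current = row[width // 2]
        if abs(current - previous) > threshold:
            edge_sample_count += 1
        else:
            edge_sample_count = 0
    return edge_sample_count

# Three rows whose halves are unrelated: every middle pair is an edge pair.
frame = [[10, 12, 200, 198], [11, 13, 201, 199], [9, 12, 199, 197]]
```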
In another embodiment of the present invention, with respect to Figure-5, the following describes how the edge detection method can be employed for the Checker-Box format. In this case the only pixels considered are the first and last columns of pixels. The Format Detection Engine samples the pixel data at the pixel clock rate and processes only those pixel luminance values which belong to the desired Region of Interest (RoI) 502, 503. A row-wise sample count is initially set to 0 and is incremented by 1 every time a new pixel value is encountered. The process checks whether the pixel belongs to the first or the last column address. If the pixel belongs to the first column address, it is stored in a temporary buffer for delayed processing. If the pixel belongs to the last column address, the current pixel and the previous pixel stored in the temporary buffer are processed.
If the pixels at the first column address belong to the Left Frame, then the pixels at the last column address belong to a Right Frame, and vice versa. The engine tries to determine whether an edge is forming in the Region of Interest 502/503. It computes the absolute difference between the two pixel values and checks whether the value is greater than a predetermined threshold. If it is greater than the threshold, a variable called EdgeSampleCount is incremented by 1; otherwise EdgeSampleCount is set to 0.

In another embodiment of the present invention, with respect to Figure-6, the following describes how the edge detection method can be employed for the Top-Down format. In this case the only pixels considered are the middle two rows. The Format Detection Engine processes only those pixel luminance values which belong to the desired Region of Interest (RoI) 602. However, from a hardware realization standpoint, extracting the desired RoI may require the Buffer Module 306 depicted in Figure-3. The Buffer Module 306 in turn gives out a full row of Left Frame pixels and another row of Right Frame pixels, the Left Frame pixels belonging to row address Height/2-1 and the Right Frame pixels belonging to row address Height/2.
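Under the same illustrative assumptions as before, the Top-Down case runs the comparison on the two middle rows rather than the middle columns, with the buffered row playing the role of the temporary buffer:

```python
def count_edge_samples_topdown(frame, threshold):
    """Top-Down check: compare the last row of the Left Frame
    (row address Height/2-1) pixel-by-pixel with the first row of
    the Right Frame (row address Height/2), counting consecutive
    pairs whose absolute difference exceeds the threshold."""
    height = len(frame)
    left_row = frame[height // 2 - 1]   # full buffered row of Left Frame
    right_row = frame[height // 2]      # full row of Right Frame
    edge_sample_count = 0
    for left, right in zip(left_row, right_row):
        if abs(left - right) > threshold:
            edge_sample_count += 1
        else:
            edge_sample_count = 0
    return edge_sample_count

# Top half dark, bottom half bright: a horizontal edge at mid-height.
frame = [[10, 11, 12], [12, 13, 14], [200, 201, 202], [198, 199, 200]]
```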
In another embodiment, Figure-9 describes the method of determining an edge within a frame. The luminance values of two successive pixels, one belonging to the Left Frame and the other to the Right Frame, form the input data for the subsequent computation; this input block 900 is depicted in the figure. The next block 901 calculates the absolute difference between these two pixels. Block 902 compares it with a predetermined threshold value. If the value is below the threshold, the process goes back to the next pair of inputs; otherwise a variable called EdgeSampleCount is incremented by 1 in block 903. The EdgeSampleCount variable keeps count of successive pixels which are likely candidates to form edge pixels. Once EdgeSampleCount has been incremented, it is compared with a predetermined value MinEdgeSampleCount in block 904. MinEdgeSampleCount is the minimum number of edge pixels sufficient to declare that the current line of pixels actually forms an edge. If the EdgeSampleCount value is greater than MinEdgeSampleCount, a frame-level flag EdgeFound is set to True in block 905. In this way one can determine whether an edge exists in a frame.
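The decision flow of Figure-9 can be sketched as below. The names EdgeSampleCount and MinEdgeSampleCount follow the specification, while the function itself and its test data are illustrative assumptions:

```python
def edge_found(pixel_pairs, threshold, min_edge_sample_count):
    """Figure-9 flow: each (left, right) luminance pair is the input
    (block 900); block 901 takes the absolute difference and block 902
    compares it with the threshold. Below the threshold the run of
    candidate edge pixels resets; otherwise EdgeSampleCount increments
    (block 903) and is compared with MinEdgeSampleCount (block 904).
    The frame-level EdgeFound flag is set once the run of consecutive
    edge pixels is long enough (block 905)."""
    edge_sample_count = 0
    for left, right in pixel_pairs:
        if abs(left - right) <= threshold:
            edge_sample_count = 0      # not an edge pixel; restart the run
            continue
        edge_sample_count += 1         # likely candidate edge pixel
        if edge_sample_count > min_edge_sample_count:
            return True                # frame-level EdgeFound flag
    return False
```

Requiring a *run* of edge pixels, rather than any single large difference, is what distinguishes a packing seam from isolated noise in the line.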
In order to avoid incorrect detection of a vertical edge due to the presence of some genuine edge in the video, the edge at the target RoI is validated over successive frames for a predetermined number of frames, denoted by the variable "minFrameCountwithEdge". Hence, after setting the "EdgeFound" flag to True, the algorithm increments the variable FrameCountwithEdge by 1. The variable FrameCountwithEdge is then compared with minFrameCountwithEdge. If FrameCountwithEdge is greater than or equal to minFrameCountwithEdge, a flag is raised declaring the format to be the one for which the validation is being done.
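A minimal sketch of this frame-level validation follows, under the assumption (consistent with the "successive frames" requirement, though not spelled out in the specification) that the count resets on any frame where the edge is not found:

```python
def validate_format(edge_found_per_frame, min_frame_count_with_edge):
    """Raise the format flag only after the edge persists for at least
    minFrameCountwithEdge successive frames, so a genuine edge that
    appears in a single frame of content is not mistaken for a
    packing-format seam. Resetting on a frame without the edge is an
    assumed interpretation of the 'successive frames' requirement."""
    frame_count_with_edge = 0
    for edge_found in edge_found_per_frame:
        if not edge_found:
            frame_count_with_edge = 0    # edge must persist across frames
            continue
        frame_count_with_edge += 1
        if frame_count_with_edge >= min_frame_count_with_edge:
            return True                  # declare the candidate format
    return False

# An interrupted run of edge frames is rejected; an unbroken run passes.
interrupted = [True, False, True, True]
unbroken = [True, True, True, True, True]
```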
The present invention can be applied in various scenarios wherein the implementation resides at the display device end. Using the above-mentioned method, one can determine 3D formats on the fly and send an appropriate signal to a 3D display device to render the video appropriately.

Another application of the present invention pertains to using it at the head-end transmission side, wherein live 3D video is intermixed with 2D video. Since the two types of video need to be processed differently, some kind of signaling mechanism is required to tell the processing unit how to process the video appropriately. This can be of great practical use in broadcasting scenarios where 2D video advertisements or other content may be mixed intermittently with an already live 3D video.
CONCLUSION
With respect to the processes, systems and methods described herein, it should be understood that, although such processes and methods have been described as occurring in a certain sequence, these processes could be implemented with the described steps performed in an order other than the one described herein. It is further to be understood that some steps could be performed concurrently, and that steps could be added or modified as per specific requirements. In short, the descriptions of processes herein are provided for the purpose of illustrating certain embodiments of the invention and should in no way be construed so as to limit the claimed invention.
Accordingly, it is to be understood that the above description is intended to be illustrative and in no way limits other variations. Many variations and modifications of the present processes will be apparent to those skilled in the art, and hence the scope of the invention should be determined not with reference to the present description but with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.

We Claim:
1 A System Comprising
A Pixel Extractor Module For Extracting Relevant Pixels As Per Target Format
An Auto Format Detection Module For Exploiting The Principle Of Perceptible Variations In Two Successive Lines Of Pixels In A Specific Region Of Interest So As To Enable Identification Of The Appropriate Video Format
A 3D Processing Unit (200)
2 A System As In Claim 1 Above Wherein
The Input Video Signal Is Split And One Part Is Taken As Input In The Auto Format Detection Module (203)
The Auto Format Detection Module (203) In Turn Processes The Input Video And Outputs Appropriate Format Information To The 3D Processing Unit (200)
3 A System As In Claim 1 Above
Wherein A Colour Space Converter (304) And RoI Extractor Module (303) And Format Detection Engine Are Also Provided And
Wherein The Input Signal Is Sought To Be Processed At The Colour Space Converter Module (304) And
Wherein The Colour Space Converter (304) Outputs Only The Luminance Values At Pixel Clock Rate And
The Output Of The Colour Space Converter (304) Is Fed Into The RoI Extractor Module (303)
And Format Detection Engine (305)
The Format Detection Engine (305) Processes The Pixel Data And Outputs The Required Format Information (309)
4 A System As In Claim 1 Above Comprising In Addition A Buffer Module (306) Wherein
The Output Of The RoI Extractor Module (303) Is Taken Into A Buffer Module (306)
The Buffer Module (306) Separates Out Pixel Arrays Of Left Frames And Right Frames For RoI (602)

The Output Of The Buffer Module (306) Will Have Only Two Arrays Of Pixels One For The Left Frame And The Other For The Right Frame
5 A System As In Claim 1 Wherein
The Pixels Are Arranged In A Particular Arrangement
The Left Frame Data Normally Occupies First Half Of The Frame In Horizontal Direction
The Right Frame Occupying The Remaining Portion Of The Frame In Horizontal Direction
A 3D Processing Unit Processes The Data For Accessing The Format Of The Video Input And Churns Out The Required Video Stream.
6 A System As In Claims 1 And 5 Wherein
The Pixels Are Extracted By Means Of
The Format Detection Engine Sampling The Pixel Data At The Pixel Clock Rate And Processing Only Those Pixel Luminance Values Which Belong To The Desired Region And The Sample Count Initially Set To 0 With Increment By A Value Of 1 On Encountering Each New Pixel Value

Documents

Application Documents

# Name Date
1 abstract1.jpg 2018-08-10
2 344-MUM-2011-FORM 5(1-2-2012).pdf 2018-08-10
3 344-MUM-2011-FORM 3(1-2-2012).pdf 2018-08-10
4 344-MUM-2011-FORM 26(1-2-2012).pdf 2018-08-10
5 344-mum-2011-form 2.pdf 2018-08-10
6 344-mum-2011-form 2(title page).pdf 2018-08-10
7 344-MUM-2011-FORM 2(TITLE PAGE)-(1-2-2012).pdf 2018-08-10
8 344-MUM-2011-FORM 2(1-2-2012).pdf 2018-08-10
9 344-mum-2011-form 1.pdf 2018-08-10
10 344-MUM-2011-FORM 1(1-2-2012).pdf 2018-08-10
11 344-mum-2011-drawing.pdf 2018-08-10
12 344-MUM-2011-DRAWING(1-2-2012).pdf 2018-08-10
13 344-mum-2011-description(provisional).pdf 2018-08-10
14 344-MUM-2011-DESCRIPTION(COMPLETE)-(1-2-2012).pdf 2018-08-10
15 344-mum-2011-correspondence.pdf 2018-08-10
16 344-MUM-2011-CORRESPONDENCE(4-10-2013).pdf 2018-08-10
17 344-MUM-2011-CORRESPONDENCE(26-9-2012).pdf 2018-08-10
18 344-MUM-2011-CORRESPONDENCE(26-3-2013).pdf 2018-08-10
19 344-MUM-2011-CORRESPONDENCE(24-9-2014).pdf 2018-08-10
20 344-MUM-2011-CORRESPONDENCE(19-8-2013).pdf 2018-08-10
21 344-MUM-2011-CORRESPONDENCE(1-2-2012).pdf 2018-08-10
22 344-MUM-2011-CLAIMS(1-2-2012).pdf 2018-08-10
23 344-MUM-2011-ASSIGNMENT(1-2-2012).pdf 2018-08-10
24 344-MUM-2011-ABSTRACT(1-2-2012).pdf 2018-08-10