Abstract: This invention discloses a system and method for selecting the prediction mode and block size for motion estimation and rate control in video compression, particularly for the H.264 codec. The system and method are suitable for enhancing motion estimation and rate control, particularly at the macroblock level.
FORM-2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENTS RULES, 2003
PROVISIONAL
Specification
(See section 10 and rule 13)
VIDEO CODING DECODING MEANS
TATA CONSULTANCY SERVICES LTD.,
an Indian Company
of Bombay House, 24, Sir Homi Mody Street, Mumbai 400 001,
Maharashtra, India
THE FOLLOWING SPECIFICATION DESCRIBES THE INVENTION.
Field of invention:
This invention relates to video coding decoding means, typically for an H.264 video encoder and decoder.
Particularly, this invention relates to means for enhancing motion estimation and rate control in an H.264 video CODEC.
More particularly, this invention relates to means for enhancing motion estimation and rate control at the macroblock level.
Background of the invention:
Introduction:
In this age of multimedia convergence, video based applications for conferencing and streaming have become very popular. The main challenge in implementing these systems is the tradeoff between several factors such as bandwidth, image quality, implementation cost and speed (in terms of mega cycles per second). Moreover, implementing such an encoder-decoder (CODEC) system in real time on a Digital Signal Processor (DSP) is more difficult because of restricted resources such as memory and CPU speed.
Video based applications are an attractive field of consumer electronics today. Thus several video CODECs (encoders and decoders) have been developed. The performance of a video CODEC is a tradeoff between three variables: quality, compressed bit rate and computational complexity. Image quality is measured in terms of Peak Signal to Noise Ratio (PSNR), compressed bit rate is the number of bits transmitted over the network per second (measured in bits per second), and computational complexity refers to the processing power required to compress the video sequence. These video CODECs can also be implemented on different digital signal processors (DSPs). Any embedded implementation requires optimization of different parameters such as the cost of the processor, which in turn depends on processor architecture, speed and memory. So, to make an attractive consumer product, it is preferable to implement such a system on a low cost processor with all features of the CODEC. The TMS320C55x is one such low cost processor, but it has limited resources in terms of memory and CPU speed. Running a videophone-like application on such a platform therefore requires some enhancements in the way video information is represented, so that good image quality is obtained at a significantly low bit rate in real time.
Bandwidth and image quality trade off against each other, and the rate-distortion curve, in which PSNR is plotted against coded bit rate, is a measure of this tradeoff. For better rate-distortion performance, the PSNR should be comparatively high at a low bit rate. For real time applications, computational complexity is an important parameter as it affects both power consumption and real time performance. The memory constraint in an embedded implementation is also an important factor.
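As an illustration of the quality measure referred to above, PSNR may be computed from the mean squared error between original and reconstructed pixel values. The following is a minimal sketch, assuming 8-bit samples (peak value 255) and flat lists of pixel values; the function name `psnr` is hypothetical and is not taken from the specification.

```python
import math

def psnr(original, reconstructed, peak=255):
    """Peak Signal to Noise Ratio in dB between two equally sized pixel sequences."""
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between original and reconstructed samples.
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10.0 * math.log10(peak * peak / mse)
```

A point on the rate-distortion curve is then the pair (coded bit rate, PSNR) obtained for one encoder setting.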
Figure 1 illustrates a block diagram of the H.264 video encoder. The H.264 encoder has the following basic functional units:
(i) Pixel interpolation unit (Half pel and quarter pel);
(ii) Inter prediction unit;
(iii) Intra prediction unit;
(iv) Transformation, scaling and quantization and inverse transformation units; and
(v) Deblocking filter unit.
H.264 video CODECs support different partitionings of the macroblock, such as 16x16, 16x8, 8x16, 8x8, 4x8, 8x4 and 4x4. Smaller block sizes increase PSNR but reduce compression, whereas larger block sizes give better compression at the cost of image quality. So if the prediction partition size is 4x4, the best image quality is obtained at lower compression, and if the size is 16x16, the best compression is achieved at lower PSNR.
For varying macroblock sizes, the quantization parameter may vary at the frame level, slice level or macroblock level, and rate control is defined over a basic unit at each of these levels. A larger basic unit increases bit fluctuation and reduces time complexity. A smaller basic unit, on the other hand, increases time complexity but reduces bit fluctuation.
Prior Art:
Image quality of any video sequence is usually measured in terms of PSNR. It is found that PSNR increases with the number of bits consumed in encoding. Thus image quality improvement can be thought of as a joint optimization of bits and PSNR. Image quality is assessed with a rate-distortion curve, in which PSNR is plotted against the number of bits used in the encoded stream. Distortion at a given rate can be minimized by using a rate distortion optimization (RDO) model, which is computationally very expensive and not suitable for meeting real time criteria on the TMS320C55x.
Bit rate means the number of bits transmitted over the network per second. In order to achieve a constant bit rate (CBR), the quantization parameter (qp) needs to be computed depending upon the available bits and the mean absolute difference (MAD) of the original and predicted image. Thus the bit rate (b) may be represented as:
$$b = f(qp, \mathrm{MAD}), \quad 0 \le qp \le 51$$
Rate control is achieved by manipulating qp depending upon MAD. The H.264 standard allows qp to vary at the macroblock layer, slice layer or frame layer, as delta qp is specified in the headers of all these layers. So rate control is performed at the group of pictures (GOP) level, frame level, slice level or macroblock (MB) level. Each of these layers is treated as the basic unit for rate control. A basic unit in rate control is defined to be a group of contiguous MBs.
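The specification does not give the form of the function f. One commonly used choice is the quadratic rate model of MPEG-4 rate control, in which the predicted bits are x1·MAD/Qstep + x2·MAD/Qstep². The sketch below assumes that form, together with the approximate relation that the quantizer step size doubles for every increase of 6 in qp; the coefficients `x1` and `x2` and all function names are hypothetical.

```python
def qstep_from_qp(qp):
    """Approximate H.264 quantizer step size: doubles for every increase of 6 in qp."""
    return 0.625 * 2.0 ** (qp / 6.0)

def predicted_bits(qp, mad, x1, x2):
    """Assumed quadratic model for f: bits = x1*MAD/Qstep + x2*MAD/Qstep^2."""
    q = qstep_from_qp(qp)
    return x1 * mad / q + x2 * mad / (q * q)

def choose_qp(target_bits, mad, x1, x2, qp_min=0, qp_max=51):
    """Return the smallest qp in [qp_min, qp_max] whose predicted bits fit the target.

    With non-negative x1 and x2 the predicted bits fall as qp rises, so the first
    qp that fits is also the one giving the finest quantization within budget.
    """
    for qp in range(qp_min, qp_max + 1):
        if predicted_bits(qp, mad, x1, x2) <= target_bits:
            return qp
    return qp_max  # budget cannot be met: fall back to the coarsest quantizer
```

For example, `choose_qp(400, 4.0, 100.0, 0.0)` searches for the first qp at which 400/Qstep drops to 400 bits or fewer.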
Mean absolute difference is computed after reconstruction is done, which requires the qp value. Figure 2 illustrates a schematic block diagram of the steps involved in the process of mode prediction and computation of block size in accordance with the prior art. A linear model is used to predict the MAD of the remaining basic units in the current frame by using those of the co-located basic units in the previous frame. Suppose that the predicted MAD of the l-th basic unit in the current frame and the actual MAD of the l-th basic unit in the previous frame are denoted by $\mathrm{MAD}_{cb}(l)$ and $\mathrm{MAD}_{pb}(l)$, respectively. The linear prediction for computation of mean absolute difference is given by
$$\mathrm{MAD}_{cb}(l) = a_1 \times \mathrm{MAD}_{pb}(l) + a_2$$
where $a_1$ and $a_2$ are the coefficients. The initial values of $a_1$ and $a_2$ are set to 1 and 0, respectively. They are updated, after coding each basic unit, by a linear regression method similar to that used for the quadratic R-D model parameter estimation in MPEG-4 rate control.
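The prior-art linear MAD prediction described above can be sketched as follows. The text states only that the two coefficients start at 1 and 0 and are updated by a linear regression method; the ordinary least-squares fit over all previously coded basic units used here is an assumption, and the class and method names are hypothetical.

```python
class MadPredictor:
    """Linear MAD predictor: predicted MAD = a1 * (previous-frame MAD) + a2."""

    def __init__(self):
        self.a1, self.a2 = 1.0, 0.0  # initial values stated in the text
        self.history = []            # (previous-frame MAD, actual MAD) pairs

    def predict(self, mad_prev):
        """Predict the MAD of a basic unit from its co-located unit in the previous frame."""
        return self.a1 * mad_prev + self.a2

    def update(self, mad_prev, mad_actual):
        """Refit a1 and a2 by ordinary least squares (assumed regression method)."""
        self.history.append((mad_prev, mad_actual))
        n = len(self.history)
        sx = sum(x for x, _ in self.history)
        sy = sum(y for _, y in self.history)
        sxx = sum(x * x for x, _ in self.history)
        sxy = sum(x * y for x, y in self.history)
        denom = n * sxx - sx * sx
        if n < 2 or abs(denom) < 1e-12:
            return  # too few distinct points: keep the current coefficients
        self.a1 = (n * sxy - sx * sy) / denom
        self.a2 = (sy - self.a1 * sx) / n
```

In use, `predict` would be called before a basic unit is coded and `update` after its actual MAD is known.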
However, this system has a limitation: determining the value of the quantization parameter (qp) for a basic unit from the predicted MAD and the target bits, which are computed using a fluid traffic scheme, is relatively time consuming and therefore expensive. An alternative is to employ a large basic unit, with which a high PSNR can be achieved, but this causes severe bit fluctuation. The use of a small basic unit, on the other hand, reduces the bit fluctuation with a slight loss in PSNR, but increases computational cost.
An object of this invention is to overcome the limitations of the prior art.
Another object of this invention is to provide a simple, efficient, cost effective method for mode prediction and computation of block size.
Another object of this invention is to provide an encoder for determining the value of the quantization parameter with reduced bit fluctuation and reduced computational cost.
Summary of the invention:
In accordance with this invention there is provided an encoder for selecting prediction mode and block size for motion estimation and rate controlling. The encoder comprises:
(i) quantization parameter defining means;
(ii) first computing means adapted to compute mean absolute difference;
(iii) second computing means adapted to compute sum of absolute difference;
(iv) selection means adapted to dynamically select a threshold value;
(v) comparing means adapted to compare sum of absolute difference with the threshold value; and
(vi) decision making means for receiving and checking the results of comparison, said decision making means having a selector means for selecting prediction mode and block size depending on the result of comparison.
In accordance with another aspect of this invention there is provided a method for mode prediction and block size selection. The method comprises the following steps:
Step 1: Defining a quantization parameter (qp) for the starting frame;
Step 2: Computing average mean absolute difference at the nth frame (MADavg(n)) as:
Step 3: Computing sum of absolute difference (SAD) between predicted and original pixel values and storing it for future reference;
Step 4: Computing and selecting threshold value T(n) dynamically;
Step 5: Comparing sum of absolute difference with the threshold value to find which value is greater than the other;
Step 6: If the SAD value is less than T(n), selecting the appropriate prediction mode and block size;
Step 7: If the SAD value is greater than T(n), repeating step 3 for the next macroblock.
Brief description of the accompanying drawings:
The invention will be described in detail with reference to a preferred
embodiment. Reference to this embodiment does not limit the scope of the
invention.
In the accompanying drawings:
Figure 1 illustrates a block diagram of the typical H.264 Encoder;
Figure 2 illustrates a schematic block diagram of the steps involved in the
process of mode prediction and block size selection in accordance with the
prior art; and
Figure 3 illustrates a schematic block diagram of the steps involved in the
process of mode prediction and block size selection in accordance with this
invention.
Detailed Description of Invention:
The invention will now be explained with reference to Figure 3 of the accompanying drawings.
Figure 3 illustrates a schematic block diagram of the steps involved in the
method of mode prediction and block size selection in accordance with this
invention. The steps involved in the process are as follows:
Step 1: Defining a quantization parameter (qp) for starting frame;
Step 2: Computing average mean absolute difference at nth frame (MADavg
(n)) as:
Step 3: Computing sum of absolute difference (SAD) between predicted and original pixel values and storing it for future reference;
Step 4: Computing and selecting threshold value T(n) dynamically;
Step 5: Comparing sum of absolute difference (SAD) with the threshold value T(n) to find which value is greater than the other;
Step 6: If the SAD value is less than T(n), selecting the appropriate prediction mode and block size;
Step 7: If the SAD value is greater than T(n), repeating step 3 for the next macroblock.
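The steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method as filed: the text does not reproduce the formula for MADavg(n) or the rule for T(n), so the sketch assumes that MADavg(n) is the mean per-pixel absolute difference over the macroblocks of frame n, and that T(n) is MADavg(n) multiplied by the number of pixels in a macroblock and a hypothetical tuning factor `scale`. The choice made in step 6 (inter prediction with a 16x16 block) and the treatment of a SAD equal to T(n) as "not less than" are also assumptions, and all function names are hypothetical.

```python
def sad(original, predicted):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(o - p) for o, p in zip(original, predicted))

def mad_avg(per_mb_sads, pixels_per_mb=256):
    """Assumed MADavg(n): mean per-pixel absolute difference over a frame's macroblocks."""
    return sum(per_mb_sads) / (len(per_mb_sads) * pixels_per_mb)

def threshold(mad_average, pixels_per_mb=256, scale=1.0):
    """Assumed T(n): SAD of an average macroblock, times a tuning factor."""
    return scale * mad_average * pixels_per_mb

def select_modes(macroblocks, t_n, early_choice=("inter", "16x16")):
    """Steps 3 and 5 to 7: compute the SAD of each macroblock and compare it with T(n).

    macroblocks is a list of (original, predicted) pixel-block pairs.
    Returns (stored SADs, decisions); a decision of None marks a macroblock
    for which no early mode and block size selection was made.
    """
    sads, decisions = [], []
    for original, predicted in macroblocks:
        s = sad(original, predicted)        # step 3
        sads.append(s)                      # stored for future reference
        if s < t_n:                         # steps 5 and 6
            decisions.append(early_choice)
        else:                               # step 7: move on to the next macroblock
            decisions.append(None)
    return sads, decisions
```

The stored SAD values of one frame could then feed `mad_avg` and `threshold` to obtain the threshold for the following frame, which is one way of making the threshold selection dynamic.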
An encoder for mode prediction and block size selection comprises:
(i) quantization parameter defining means;
(ii) first computing means adapted to compute mean absolute difference;
(iii) second computing means adapted to compute sum of absolute difference;
(iv) selection means adapted to dynamically select a threshold value;
(v) comparing means adapted to compare sum of absolute difference with the threshold value; and
(vi) decision making means for receiving and checking the results of comparison, said decision making means having a selector means for selecting prediction mode and block size depending on the result of comparison.
Table 1 describes the complexity of each functional unit related to the H.264 encoder, which would be familiar to a person skilled in the art.
Table 1
| Basic tool | Unit | MAC | ADD | MPY |
|---|---|---|---|---|
| Interpolation (per pixel) | Half pel | 0 | 15 | 9 |
| | Quarter pel | 0 | 24 | 12 |
| Deblocking filter (per macroblock) | Luma | 0 | 3584 | 256 |
| | Cb | 0 | 896 | 64 |
| | Cr | 0 | 896 | 64 |
| Transformation and inverse transformation (per macroblock, including Luma AC, Luma DC, Chroma AC and Chroma DC) | Residue computation | 0 | 16 | 0 |
| | Integer transform | 0 | 80 | 0 |
| | Scaling | 0 | 32 | 32 |
| | Inverse integer transform | 0 | 80 | 0 |
| | Inverse scaling | 0 | 16 | 16 |
| Intra prediction | Luma 4 x 4 (9 modes) | 0 | 4560 | 544 |
| | Luma 16 x 16 (4 modes and Hadamard used) | 16 | 7414 | 515 |
| | Chroma (Cb) (all modes and Hadamard used) | 16 | 843 | 133 |
| | Chroma (Cr) (all modes and Hadamard used) | 16 | 843 | 133 |
| Inter prediction | Luma 16 x 16 (1/2 pel and 1/4 pel) | 7 33 | 24105 | 0 |
| | Luma 4 x 4 (1/2 pel and 1/4 pel) | 72 10 | 10448 | 0 |
| | Chroma (Cb) (8x8) | 0 | 384 | 512 |
| | Chroma (Cr) (8x8) | 0 | 384 | 512 |
Table 2 provides the results of the analysis of Table 1, aggregated per macroblock for each basic tool.
Table 2
| Basic tool (per macroblock) | MAC | ADD | MPY |
|---|---|---|---|
| Interpolation | 0 | 9984 | 5376 |
| Deblocking filter | 0 | 5376 | 384 |
| Transformation and inverse transformation | 0 | 224 | 48 |
| Intra prediction | 48 | 13660 | 1325 |
| Inter prediction | 1409 | 35321 | 1024 |
| Total | 1457 | 54620 | 2802 |
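The per-tool ADD and MPY figures of Table 2 can be cross-checked against Table 1 by summing the sub-unit counts of each basic tool. The sketch below makes two assumptions: the per-pixel interpolation counts are scaled by the 256 luma pixels of a 16x16 macroblock, and the ADD counts for intra and inter prediction in Table 2 are taken as 13660 and 35321. The MAC column is left out because the MAC entries for inter prediction are not legible in Table 1. The names used are hypothetical.

```python
# (ADD, MPY) operation counts per sub-unit, copied from Table 1.
TABLE_1 = {
    "Interpolation": [(15, 9), (24, 12)],  # per pixel
    "Deblocking filter": [(3584, 256), (896, 64), (896, 64)],
    "Transformation and inverse transformation": [
        (16, 0), (80, 0), (32, 32), (80, 0), (16, 16)],
    "Intra prediction": [(4560, 544), (7414, 515), (843, 133), (843, 133)],
    "Inter prediction": [(24105, 0), (10448, 0), (384, 512), (384, 512)],
}
PIXELS_PER_MACROBLOCK = 16 * 16

def per_macroblock(tool):
    """Return the (ADD, MPY) totals of one basic tool for a single macroblock."""
    add = sum(a for a, _ in TABLE_1[tool])
    mpy = sum(m for _, m in TABLE_1[tool])
    if tool == "Interpolation":
        # Table 1 lists interpolation per pixel, so scale to a whole macroblock.
        add, mpy = add * PIXELS_PER_MACROBLOCK, mpy * PIXELS_PER_MACROBLOCK
    return add, mpy
```

Under these assumptions each of the five tool rows of Table 2 is reproduced by `per_macroblock`.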
Table 3 provides statistical analysis on eight slow moving reference test sequences in accordance with this invention.
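Table 3
| Sequence | Total MB | P4MB | P16MB | I4MB | I16MB | Total IMB | No. of forced IMB | % of P16MB |
|---|---|---|---|---|---|---|---|---|
| Akiyo | 8910 | 25 | 8588 | 235 | 62 | 297 | 297 | 99.71 |
| News | 8910 | 16 | 8590 | 244 | 60 | 304 | 297 | 99.733 |
| Claire | 8910 | 28 | 8554 | 138 | 190 | 328 | 297 | 99.315 |
| Grandma | 8910 | 3 | 8610 | 227 | 70 | 297 | 297 | 99.965 |
| Salesman | 8910 | 16 | 8580 | 289 | 25 | 314 | 297 | 99.617 |
| Hall monitor | 8910 | 43 | 8545 | 240 | 82 | 322 | 297 | 99.21 |
| Bridge | 8910 | 0 | 8313 | 267 | 330 | 597 | 297 | 96.517 |
| Container | 8910 | 0 | 8613 | 237 | 60 | 297 | 297 | 100 |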
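The last column of Table 3 is not defined in the text. The tabulated figures are consistent with the number of P16 macroblocks divided by the number of macroblocks that were not forced to intra coding (total macroblocks minus forced intra macroblocks). That reading is inferred from the numbers and is an assumption, as is the hypothetical function name in the sketch below.

```python
def p16_share(total_mb, forced_imb, p16mb):
    """Percentage of non-forced macroblocks coded as P16MB (inferred definition)."""
    return 100.0 * p16mb / (total_mb - forced_imb)

# P16MB counts taken from Table 3; every sequence has 8910 macroblocks, 297 of them forced intra.
P16MB = {"Akiyo": 8588, "News": 8590, "Bridge": 8313, "Container": 8613}
SHARES = {name: p16_share(8910, 297, count) for name, count in P16MB.items()}
```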
While considerable emphasis has been placed herein on the various components of the preferred embodiment, it will be appreciated that many alterations can be made and that many modifications can be made in the preferred embodiment without departing from the principles of the invention. These and other changes in the preferred embodiment as well as other embodiments of the invention will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter is to be interpreted merely as illustrative of the invention and not as a limitation.
Dated this 31st day of March, 2006.
| # | Name | Date |
|---|---|---|
| 1 | 494-MUM-2008-CORRESPONDENCE(IPO)-(FER)-(30-04-2012).pdf | 2012-04-30 |
| 2 | 494-MUM-2008-CORRESPONDENCE(IPO)-(HEARING NOTICE)-(02-01-2015).pdf | 2015-01-02 |
| 3 | 494-MUM-2008-CORRESPONDENCE(IPO)-(DECISION)-(23-11-2015).pdf | 2015-11-23 |
| 4 | 494-MUM-2006-GENERAL POWER OF ATTORNEY (23-11-2015).pdf | 2015-11-23 |
| 5 | 494-MUM-2006-CORRESPONDENCE(23-11-2015).pdf | 2015-11-23 |
| 6 | abstract1.jpg | 2018-08-09 |
| 7 | 494-MUM-2006_EXAMREPORT.pdf | 2018-08-09 |
| 8 | 494-MUM-2006-SPECIFICATION(AMENDED)-(27-9-2012).pdf | 2018-08-09 |
| 9 | 494-MUM-2006-REPLY TO HEARING(12-6-2015).pdf | 2018-08-09 |
| 10 | 494-MUM-2006-REPLY TO EXAMINATION REPORT(27-9-2012).pdf | 2018-08-09 |
| 11 | 494-MUM-2006-PETITION UNDER RULE-137(12-6-2015).pdf | 2018-08-09 |
| 12 | 494-MUM-2006-MARKED COPY(27-9-2012).pdf | 2018-08-09 |
| 13 | 494-MUM-2006-MARKED COPY(12-6-2015).pdf | 2018-08-09 |
| 14 | 494-mum-2006-form-3.pdf | 2018-08-09 |
| 15 | 494-mum-2006-form-26.pdf | 2018-08-09 |
| 16 | 494-mum-2006-form-2.pdf | 2018-08-09 |
| 18 | 494-mum-2006-form-1.pdf | 2018-08-09 |
| 19 | 494-MUM-2006-FORM 5(28-3-2007).pdf | 2018-08-09 |
| 20 | 494-MUM-2006-FORM 2(TITLE PAGE)-(PROVISIONAL)-(31-3-2006).pdf | 2018-08-09 |
| 21 | 494-MUM-2006-FORM 2(TITLE PAGE)-(COMPLETE)-(28-3-2007).pdf | 2018-08-09 |
| 22 | 494-MUM-2006-FORM 2(TITLE PAGE)-(27-9-2012).pdf | 2018-08-09 |
| 23 | 494-MUM-2006-FORM 2(TITLE PAGE)-(12-6-2015).pdf | 2018-08-09 |
| 24 | 494-MUM-2006-FORM 2(COMPLETE)-(28-3-2007).pdf | 2018-08-09 |
| 25 | 494-MUM-2006-FORM 18(25-4-2008).pdf | 2018-08-09 |
| 26 | 494-MUM-2006-FORM 13-(12-6-2015).pdf | 2018-08-09 |
| 27 | 494-MUM-2006-FORM 13(12-6-2015).pdf | 2018-08-09 |
| 28 | 494-MUM-2006-FORM 1(31-3-2006).pdf | 2018-08-09 |
| 29 | 494-MUM-2006-FORM 1(27-9-2012).pdf | 2018-08-09 |
| 30 | 494-MUM-2006-FORM 1(23-6-2010).pdf | 2018-08-09 |
| 31 | 494-MUM-2006-FORM 1(12-6-2015).pdf | 2018-08-09 |
| 32 | 494-MUM-2006-FORM 1 (27-9-2012).pdf | 2018-08-09 |
| 33 | 494-mum-2006-drawings.pdf | 2018-08-09 |
| 34 | 494-MUM-2006-DRAWING(28-3-2007).pdf | 2018-08-09 |
| 35 | 494-MUM-2006-DESCRIPTION(COMPLETE)-(28-3-2007).pdf | 2018-08-09 |
| 36 | 494-mum-2006-description (provisional).pdf | 2018-08-09 |
| 37 | 494-MUM-2006-CORRESPONDENCE(25-4-2008).pdf | 2018-08-09 |
| 38 | 494-MUM-2006-CORRESPONDENCE(23-6-2010).pdf | 2018-08-09 |
| 39 | 494-MUM-2006-CLAIMS(AMENDED)-(27-9-2012).pdf | 2018-08-09 |
| 40 | 494-MUM-2006-CLAIMS(AMENDED)-(12-6-2015).pdf | 2018-08-09 |
| 41 | 494-MUM-2006-CLAIMS(28-3-2007).pdf | 2018-08-09 |
| 42 | 494-MUM-2006-ABSTRACT(28-3-2007).pdf | 2018-08-09 |
| 43 | 494-MUM-2006-ABSTRACT(27-9-2012).pdf | 2018-08-09 |