Abstract: An efficient lapped transform is realized using pre- and post-filters (or reversible overlap operators) that are structured of unit determinant component matrices. The pre-and post-filters are realized as a succession of planar rotational transforms and unit determinant planar scaling transforms. The planar scaling transforms can be implemented using planar shears or lifting steps. Further, the planar rotations and planar shears have an implementation as reversible/lossless operations, giving as a result, a reversible overlap operator.
Reversible Overlap Operator For Efficient Lossless Data
Compression
Technical Field
The invention relates generally to digital media (e g, video and image) compression using lapped transforms
Background
Lapped Transforms
The lapped transform is a powerful signal processing technique that is used m data compression See, eg,H S Malvar, Signal Processing with Lapped Transforms Boston, MA Artech House, 1992 However, to date, efficient lapped transforms with linear phase have neither been formulated nor been applied for lossless (reversible) compression of data
As discussed in more detail below, it is known that a lapped transform can be formulated as a pre filter followed by a data transform (and its inverse as the inverse data transform followed by a post filter) See, e g, H S Malvar, "A pre- and post-filtering technique for the reduction of blocking effects," m Proc Picture Coding Symposium, Stockholm, Sweden, Jun 1987, and T D Tran, J Liang, and C Tu, "Lapped Transform via Time-Domain Pre- and Post-Filtering", IEEE Trans on Signal Processing, vol 51, no 6, June 2003 A lossless data transform can be used in this formulation to achieve a good measure of reversibility So far, it was believed that only a certain restricted vanety of pre and post filters could be chosen for reversibility This restricted set is very limited in its compression (rate vs distortion, or R-D) performance In a recent article (W Dai and T Tran, "Regularity-constrained pre- and post-filtering for block DCT-based systems," IEEE Trans on Signal Processing, vol 51, pp 2568-2581, Oct 2003), a
construction in which most elements are reversible and which has good compression properties was presented
In audio compression, several constructions for reversible lapped transforms were introduced See, eg,R Geiger, J Herre, J Koller, andK Brandenburg, "IntMDCT - A link between perceptual and lossless audio coding," in Proc IEEE Int Conf on Acoustics, Speech, and Signal Processing, Orlando, FL, May 2002, and J Li, "Reversible FFT and MDCT viva matnx lifting " m Proc IEEE Int Conf on Acoustics, Speech, and Signal Processing, Montreal, Canada, May 2004 However, these constructions are applicable only to the modulated lapped transform (MLT), also known as modified discrete cosine transform (MDCT), whose basis functions are orthogonal and are not symmetric (that is, the basis functions are not Imear phase) These transforms are not applicable to data compression applications where Imear phase (symmetric) functions are required, such as in digital picture compression
For picture (image) compression, one of the best-performing transforms in terms of R-D performance is the lapped biorthogonal transform (LBT) See, H S Malvar, "Biorthogonal and nonuniform lapped transforms for transform coding with reduced blocking and ringing artifacts," IEEE Trans on Signal Processing, vol 46, pp 1043-1053, Apr 1998 Unlike the MLT, the LBT basis functions are symmetric, and are not exactly orthogonal (in the LBT, the analysis basis functions are orthogonal to the synthesis basis functions, hence the term biorthogonal) LBTs have been successfully used m image compression applications, but they have not yet been used m lossless image compression, because integer-reversible constructions were not known
Overview of Block Transform-Based Coding
Transform coding is a compression technique used in many audio, image and video compression systems Uncompressed digital image and video is typically represented or captured as samples of picture elements or colors at locations in an image or video frame arranged in a two-dimensional (2D) gnd This is referred to as a spatial-domain representation of the image or video For example, a typical format for images consists of a stream of 24-bit color picture element samples arranged as a gnd Each
sample is a number representing color components at a pixel location m the grid within a color space, such as RGB, or YIQ, among others Vanous image and video systems may use vanous different color, spatial and time resolutions of sampling Similarly, digital audio is typically represented as tune-sampled audio signal stream For example, a typical audio format consists of a stream of 16-bit amplitude samples of an audio signal taken at regular tune intervals
Uncompressed digital audio, image and video signals can consume considerable storage and transmission capacity Transform coding reduces the size of digital audio, images and video by transforming the spatial-domain representation of the signal into a frequency-domain (or other like transform domain) representation, and then reducing resolution of certain generally less perceptible frequency components of the transform-domain representation This generally produces much less perceptible degradation of the digital signal compared to reducing color or spatial resolution of images or video m the spatial domain, or of audio in the time domain
More specifically, a typical block transform-based codec 100 shown in Figure 1 divides the uncompressed digital image's pixels into fixed-size two dimensional blocks (Xi, Xn), each block possibly overlapping with other blocks. A linear transform 120-121 that does spatial-frequency analysis is applied to each block, which converts the spaced samples within the block to a set of frequency (or transform) coefficients generally representing the strength of the digital signal in corresponding frequency bands over the block interval For compression, the transform coefficients may be selectively quantized 130 (I e, reduced in resolution, such as by droppmg least significant bits of the coefficient values or otherwise mapping values in a higher resolution number set to a lower resolution), and also entropy or vanable-length coded 130 into a compressed data stream At decoding, the transform coefficients will inversely transform 170-171 to nearly reconstruct the original color/spatial sampled image/video signal (reconstructed blocks X,, X„)
The block transform 120-121 can be defined as a mathematical operation on a vector x of size N Most often, the operation is a linear multiplication, producmg the transform domain output y^Mx^M being the transform matnx When the input data is
rbitrarily long, it is segmented into N sized vectors and a block transform is applied to each segment For the purpose of data compression, reversible block transforms are chosen In other words, the matnx M is invertible In multiple dimensions (e g, for unage and video), block transforms are typically implemented as separable operations The matrix multiplication is applied separably along each dimension of the data (i e, both rows and columns)
For compression, the transform coefficients (components of vector y) may be selectively quantized (1 e, reduced in resolution, such as by droppmg least significant bits of the coefficient values or otherwise mappmg values in a higher resolution number set to a lower resolution), and also entropy or variable-length coded into a compressed data stream
At decoding m the decoder ISO, the inverse of these operations (dequantization/entropy decoding 160 and inverse block transform 170-171) are applied on the decoder 150 side, as show in Fig 1 While reconstructing the data, the inverse matnx M1 (inverse transform 170-171) is applied as a multiplier to the transform domain data When applied to the transform domain data, the inverse transform nearly reconstructs the ongmal time-domain or spatial-domain digital media.
In many block transform-based coding applications, the transform is desirably reversible to support both lossy and lossless compression depending on the quantization factor With no quantization (generally represented as a quantization factor of 1) for example, a codec utilizing a reversible transform can exactly reproduce the mput data at decoding However, the requirement of reversibility in these applications constrains the choice of transforms upon which the codec can be designed
Many image and video compression systems, such as MPEG and Wmdows Media, among others, utilize transforms based on the Discrete Cosme Transform (DCT) The DCT is known to have favorable energy compaction properties that result in near-optimal data compression In these compression systems, the inverse DCT (IDCT) is employed in the reconstruction loops in both the encoder and the decoder of the compression system for reconstructing individual image blocks The DCT is descnbed by N Ahmed, T Natarajan, and K R Rao, "Discrete Cosine Transform," IEEE
Transactions on Computers, C-23 (January 1974), pp 90-93 An exemplary implementation of the IDCT is described in "IEEE Standard Specification for the Implementations of 8x8 Inverse Discrete Cosine Transform," IEEE Std 1180-1990, December 6,1990
While compressing a still image (or an intra coded frame in a video sequence), most common standards such as MPEG-2, MPEG-4 and Windows Media partition the image into square tiles and apply a block transform to each image tile The transform coefficients in a given partition (commonly known as block) are influenced only by the raw data components within the block Irreversible or lossy operations on the encoder side such as quantization cause artifacts to appear m the decoded image These artifacts are independent across blocks and produce a visually annoying effect known as the blocking effect Likewise for audio data, when non-overlapping blocks are independently transform coded, quantization errors will produce discontinuities in the signal at the block boundaries upon reconstruction of the audio signal at the decoder For audio, a periodic clicking effect is heard
Several techniques are used to combat the blocking effect - the most popular among these are the deblocking filter that smoothes inter block edge boundaries, and spatial extrapolation that encodes differences between the raw mput data and a prediction from neighboring block edges These techniques are not without their flaws For instance, die deblocking filter approach is "open loop", i e the forward transform process does not take mto account die fact that deblocking is going to be performed prior to reconstruction on the decoder side Besides, both these techniques are computationally expensive
In order to minimize the blocking effect, cross block correlations can be exploited One way of achieving cross block correlation is by using a lapped transform as described in H Malvar, "Signal Processing with Lapped Transforms," Artech House, Norwood MA, 1992 A lapped transform is a transform whose mput spans, besides the data elements in the current block, a few adjacent elements in neighboring blocks Likewise, on the reconstruction side the inverse transform influences all data points in the current block as well as a few data pomts in neighboring blocks
For the case of 2-dimensionaI (2D) data, the lapped 2D transform is a function of the current block, together with select elements of blocks to the left, top, right, bottom and possibly top-left, top-nght, bottom-left and bottom-right The number of data pomts in neighboring blocks that are used to compute the current transform is referred to as the overlap
Overview Of The Spatial Domain Lapped Transform
The lapped transform can be implemented in the transform domain, as a step that merges transform domain quantities after a conventional block transform Else, it can be implemented m the spatial-domain by a pre-processing stage that is applied to pixels within the range of overlap These two implementations are mathematically related and therefore equivalent
Figure Error' Reference source not found, shows an example of a conventional spatial-domain lapped transform In the example shown, the overlap is 2 pixels, and two pixels each from the two adjacent blocks shown are pre-processed in pre-processing stage 210 Two pre-processed outputs are sent to each of the blocks for block transform-based coding by codec 100 as m Fig 1 An inverse of the pre-processing stage is applied at post-processing stage 220 after decoding With a judicious choice of pre-processing and block transform, a wide range of lapped transforms can be realized
A key advantage of the spatial domain realization of the lapped transform is that an existing block transform-based codec can be retrofitted with a pre- and postprocessing stage to derive the benefits of the lapped transform, l e, reduced block effect and better compression, using an existing codec framework Pre-processing 210 and post-processing can be represented as a matrix multiplication as shown m Fig Error! Reference source not found Conventionally, the pre-processing and post-processing matrices are inverses of each other, i e, pre-processing matrix (Pj) and the inverse or post-processing matrix (Pi) multiplied together equal the identity matrix /
Definitions
In general, the length N of a transform is the number of transform coefficients in a certain transform block
The support K of a transform is the number of input data pomts that influence coefficients of the transform block Likewise, it is the number of output data pomts that are influenced by each transform coefficient, by the process of inverse transformation
For typical block transforms such as the discrete cosme transform (DCT), the length and support are identical However, lapped transforms (LTs) are an important class of transforms for which the support K is greater than the length N The notation K x N is used to denote the support and length of a lapped transform (Transforms for which K< N are expansive and therefore not used in data compression)
As an example 300, a 6x4 LT 310 shown in Figure 3 is a transform with six mputs and four outputs Since the transform is invertible, two of the mputs are shared with adjacent transform blocks The inverse lapped transform (ILT) 320 produces six outputs from its four mputs Output data pomts near the block boundary (in this case one pomt at each end of the block) are reconstructed by summing the corresponding responses of two adjacent inverse transform blocks
Constraints On Lapped Transforms Used In Compression Systems
In the mathematical sense, lapped transforms are invertible structures, when we consider the mput and output signals, as well as intermediate computation results, as real numbers If infinite precision could be achieved, the input data could be perfectly recovered from its lapped transform coefficients However, infinite precision is not possible in practice, for lossless compression of data, the requirement is to design a transform that operates on mteger or fixed-precision arithmetic, yet perfectly reconstructs the data given the mteger representation of transform coefficients This is a stronger condition than mathematical mvertibihty, and such a transform is referred to here as a "lossless" transform Moreover, it is required that the lossless transform be efficient for data compression (both lossless and lossy) as well That efficiency can be measured by the entropy of the transformed data, the lower that entropy, the more the transformed data can be compressed by standard entropy coding techniques, such as context-based arithmetic coding or adaptive run-length coding
Summary
Various Realizations are described herein of an efficient lapped transform that is reversible in integer arithmetic, and can be used as the basis of an efficient and lossless data compression/decompression system
It can be shown that the most efficient lossless transform designs (that is, those with minimum entropy of the transformed data) require the transform matnx be unit determinant (i e, the determinant of the transform matrix is ±1) In the following descnption, it is assumed that the transform can be represented as a matnx multiphcation, although it is recognized that there may be minor nonlinear phenomena such as data rounding Thus, when we refer to the determinant, truncation or roundmg aspects are not considered
The efficient lapped transform is realized using pre- and post-filters that are referred to herein as "overlap operators" This realization is reversible, yet very R-D efficient Among other applications, these new overlap operators allow the implementation of reversible LBTs, which can be used for lossless image compression The pre- and post-filters use reversible operations Further, the descnbed overlap operators include simplifications for computational efficiency
One realization of the pre and post filtenng operations is as a succession of planar rotational transforms and unit determinant planar scalmg transforms Further, the planar rotations and planar shears have an implementation as reversible/lossless operations, giving as a result, a reversible overlap operator
An exemplary application is in an 8x4 one-dimensional lapped transform realized using computationally efficient approximations of the reversible overlap operators
Additional features and advantages of the invention will be made apparent from the following detailed descnption of embodiments that proceeds with reference to the accompanying drawings
Brief Description Of The Drawings
Figure 1 is a block diagram of a conventional block transform-based codec m the pnorart
Figure 2 is a block diagram of a spatial-domain lapped transform implemented as pre and post processing operations in combination with the block transform-based codec of Figure 1, also in the prior art
Figure 3 is a block diagram illustrating a lapped transform and inverse lapped transform pair on 1-dimensional data.
Figure 4 is a flow diagram of an encoder based on a lapped transform utilizing a reversible overlap operator
Figure 5 is a flow diagram of an decoder based on the lapped transform
Figure 6 is a block diagram illustrating a lapped transform and inverse lapped transform pair on 1-dimensional data usmg pre- and post-filtering operations (or reversible overlap operator) m conjunction with a block transform
Figure 7 is a signal flow graph illustrating a structure of a linear phase pre- (or post-) filter for use as the reversible overlap operator in the lapped transform of Figure 6
Figure 8 is a signal flow graph of lossless scaling as four lifting steps for use in the reversible overlap operator
Figure 9 is a signal flow graph of lossless scaling as five lifting steps for use in the reversible overlap operator
Figure 10 is a signal flow graph of a cascade of 2-pornt seating applied to a larger dimension matrix to realize lossless umt determinant scaling
Figure 11 is a signal flow graph of a reversible overlap operator (or pre-/post-filter) having the structure shown in Figure 7 and usmg the lossless umt determinant seating of Figure 10
Figure 12 is a flow chart of the operation of the reversible overlap operator of Figure 11
Figure 13 is a signal flow graph illustrating an example of a reversible lapped transform implementation usmg the reversible overlap operator of Figure 11
Figure 14 is an impulse response graph of the DC coefficient of the example lapped transform of Figure 13
Figure 15 is a block diagram of a suitable computing environment for implementing the block transform-based codec with improved spatial-domain lapped transform of Figures 4 and 5
Detailed Description
The following description relates to a digital media compression system or codec, which utilizes a reversible overlap operator for a lapped transform For purposes of illustration, an embodiment of a compression system mcorporatmg the reversible overlap operator is an image or video compression system Alternatively, the reversible overlap operator also can be mcorporated into compression systems or codecs for other 2D data. The reversible overlap operator does not require that the digital media compression system encodes the compressed digital media data m a particular coding format
1. Encoder/Decoder
Figures 4 and S are a generalized diagram of the processes employed in a representative 2-dimensional (2D) data encoder 400 and decoder 500 based on a lapped transform using the reversible overlap operator The diagrams present a generalized or simplified illustration of the use and application of this reversible overlap operator m a compression system incorporating the 2D data encoder and decoder In alternative encoders based on this reversible overlap operator, additional or fewer processes than those illustrated in this representative encoder and decoder can be used for the 2D data compression For example, some encoders/decoders may also include color conversion, color formats, scalable coding, lossless coding, macroblock modes, etc The compression system (encoder and decoder) can provide lossless and/or lossy compression of the 2D data, depending on the quantization which may be based on a quantization parameter varying from lossless to lossy
The 2D data encoder 400 produces a compressed bitstream 420 that is a more compact representation (for typical mput) of 2D data 410 presented as mput to the encoder For example, the 2D data input can be an image, a frame of a video sequence, or other data having two dimensions The 2D data encoder tiles 430 the mput data mto
macroblocks, which are 16 x 16 pixels in size in this representative encoder The 2D data encoder further tiles each macroblock into 4x4 blocks 432 A "forward overlap" operator 440 is applied to each edge between blocks, after which each 4x4 block is transformed using a block transform 450 This block transform 450 can be the reversible, scale-free 2D transform descnbed by Snmvasan, U S Patent Application entitled, "Improved Reversible Transform For Lossy And Lossless 2-D Data Compression," filed concurrently herewith, the disclosure of which is hereby incorporated by reference Alternatively, the discrete cosine transform or other block transforms can be used with the reversible overlap operator descnbed herein Subsequent to the transform, the DC coefficient 460 of each 4x4 transform block is subject to a similar processing chain (tiling, forward overlap, followed by 4x4 block transform) The resulting DC transform coefficients and the AC transform coefficients are quantized 470, entropy coded 480 and packetized 490
The decoder performs the reverse process On the decoder side, the transform coefficient bits are extracted 510 from their respective packets, from which the coefficients are themselves decoded 520 and dequantized 530 The DC coefficients 540 are regenerated by applying an inverse transform, and the plane of DC coefficients is "inverse overlapped" using a suitable smoothing operator applied across the DC block edges Subsequently, the entire data is regenerated by applying the 4x4 inverse transform 550 to the DC coefficients, and the AC coefficients 542 decoded from the bitstream Finally, the block edges in the resulting image planes are inverse overlap filtered 560 This produces a reconstructed 2D data output
2. Lapped Transform Realized Using Overlap Operators
More generally, the overlap operator 440 and block transform 450 of the encoder 400 (Figure 4) is an example of a large class of lapped transforms 600 that can be factonzed into a pre filtering operation 610, followed by a block data transform 620 as illustrated in Figure 6 Figure 6 illustrates a generalized example of such factonzed lapped transforms In this illustrated case, the 6x4 lapped transform 310 shown m Figure 3 is factonzed into pre-filter operation 610 and block transform 620 stages The pre
filtering operation 610 and block transform 620 are evenly staggered over the data points In this illustrated 6x4 lapped transform 600 example, each pre filter is a length 2 transformation of the data points straddling adjacent blocks On the decode side, a post filter 640 is apphed after the inverse block transform 630 across block boundaries Likewise, for the general KxN case, the pre filter is applied to die (K-N)/2 data points of each block adjacent to a block boundary
For rnvertibility, the pre-filter 610 and post filter 640 are inverses of each other For realizing a lossless lapped transform, however, this condition is not sufficient This further constrains the pre and post filters 610,640 to be lossless transforms as well, in addition to the block (core) transform 620 to be realized in a lossless manner The DCT can be realized in a lossless manner, using ladder, lathee-, or liffang-based methods, among others See, eg, A A M L Bruekens and A W M van den Enden, "New networks for perfect inversion and perfect reconstruction", IEEE J Selected Areas Communications, vol 10, no 1,1992, and I Daubechies and W Sweldens, "Factoring wavelet transform into lifting steps", J Fourier Anal Appl, vol 4, pp 247-269,1998 A reversible, scale-free 2-dimensional transform also is described by Snnivasan, U S Patent Application entitled, "Improved Reversible Transform For Lossy And Lossless 2-D Data Compression," filed concurrently herewith and incorporated by reference herein Lifhng-based reversible approximations to the DCT m one dimension also are known See, eg, J Liang and T D Tran, "Fast muraplierless approximations of the DCT with the lifting scheme," IEEE Trans Signal Processing, vol 49, pp 3032-3044, Dec 2001
Efficient reversibility further requires that both steps, viz the pre/post filter and the block transform, be unit determinant
3. Reversible Overlap Operator
An efficient reversible overlap operator for use as the pre-filter 610 (Figure 6) of the lossless lapped transform 600 on which the encoder 400/decoder 500 (Figures 4 and 5) is based can be realized as a linear phase pre-filter, which is factonzed into the structure 700 shown in Figure 7 An inverse of this pre-filter (i e, the post-filter 640) also has the same structure but with different coefficients
This linear phase filter structure 700 has multiple orthogonal components, including a cross-over Hadamard network 710 at its input and output The internal arrows in the illustrated Hadamard network 710 denote negation in this diagram The structure 700 further mcludes orthogonal matrices Ul, U2, VI and V2 These components can be implemented in a lossless manner by using lattice/lifting based methods
In addition, the structure 700 has the nonzero scale factors s1 through SM The umt determinant constraint implies that s, = ±1 When all scale factors are ±1, the pre/post
filters can be realized as a lossless transform where the component matrices Ul, U2, VI and V2 are implemented as lossless lattice/lifting steps However, when scale factors are not all ±1, the lossless realization remains a challenge that is addressed as discussed more fully below
With this linear phase pre-filter structure 700, the problem of realizing a lossless pre-/post-filter pan- is reduced to the following three steps
1 Decomposing the filter F into the following form, for orthogonal matrices U1,
U2,V1 and V2
(Equation Removed)
2 Deriving lossless realizations for Ul, U2, V1 and V2, and
3 Deriving a lossless realization for the scaling matrix
As to step 1, the first and last matrices on the right hand side, which define 2 point Hadamard transforms, incorporate the factor of ½ in some terms to make these stages unit determinant The rest is re-arranged to a block diagonal form with two blocks, each of half the linear dimensions of F The singular value decomposition or SVD of each block provides the orthogonal matrices Ul, U2, VI and V2, as well as the scales
The lossless realizations of the component matrices can be derived m Step 2 using standard hfhng-based techniques, such as those descnbed by A AML Bruekensand A W M van den Enden, "New networks for perfect inversion and perfect reconstruction", IEEE J Selected Areas Communications, vol 10, no 1,1992
The lossless realization of the scahng matrix in Step 3 is addressed as follows For simplicity, let us assume that we have a certain 2 mput 2 output component that is (a) lossless and (b) realizes scahng by s (0 < s < 1) for the first component and by 1/s for the second component (other cases can be obtained by reverting the sign of one or both output signals) In other words, we have the input-output relationship given by (s 0)
(Equation Removed)
The determinant of the transformation matrix in equation (2) is s/s = 1 This matrix can be realized in four lifting steps procedure 800 or five lifting steps procedure 900 as shown in Figures 8 and 9 We usually approximate all the liftmg steps m the form of y = (a x + r)» b, where x is the mput and y is the output, and a, b, and r are integers and r is used for rounding error control, to get a division-less integer implementation The transform defined by equation (2) is referred to here as the unit determinant scaling transform, abbreviated as the scaling transform
Interestingly, the scalmg transform is closely related with the shear operation, as defined below
(Equation Removed)
Under the constraint a ~° -1 (a > 0, b > 0), the shear operation has unit determinant and can be realized with three liftmg steps
(Equation Removed)
Here the scaling factors ½ and 2 in the matrices sandwiching the shear matrix are distributed to the shear lifting steps, and the last lifting step of the first matrix is combined with the first shear lifting step while the first lifting step of the last matrix is combined with the first shear lifting step The five step realization as procedure 900 of the scaling transform shown in Figure 9 is based on equation (5) Simplifications to the structure may be possible by canceling inverse operations, where possible, between the 3 groups in equation (1), viz the Hadamard networks, the orthogonal matrices, and the scaling operations (which may in turn be decomposed into Hadamard and shear operations)
More particularly, the effective transform matrix of the four lifting step realization
of lossless scaling as procedure 800 is
(Equation Removed)
where c2 =l-s2 On the other
hand, the effective transform matrix of the five lifting step realization in procedure 900 is
(Equation Removed)
where c2 -l-s2
Although the scaling procedure 800 shown in Figure 8 has one fewer lifting step than the one in Figure 9, the latter procedure 900 has only three non-tnvial lifting steps as opposed to four in the former For the reason stated in the above paragraph, the first or last trivial lifting step m Figure 9 may be merged with prior or subsequent transform steps (for instance, with the Hadamard network 710 at either end of Figure 7) under certain conditions (for instance when Ul, U2 and VI are identities)
The scaling procedure can be easily extended to larger matrices This is illustrated in Figure 10, where M possibly different scale factors s1 through SM are applied
to the M data paths as a cascade 1000 of scaling transforms In order to achieve this m a reversible manner, M-1 reversible scaling transforms are needed in general
One useful special case is when the M scale factors sj through SM can be grouped into M/2 groups of form (s, 1/s) In this case, only M/2 reversible scaling transforms are needed One example is s1= s2= =sm/2= sand sms/z+/=Jy»*2+2= =su=l/s A preferred way of grouping is to maintain symmetry across the central axis, m other words each group scales the coefficients s, and sM+/.i IfM is odd, the one scale factor not grouped is 1, corresponding to the data path along the axis
On signal boundaries where pre / post filters need to extend outside of the signal, one solution is to extend the signal symmetrically and then apply pre / post filters This is not a lossless operation in general because of the scaling Another solution is to skip pre / post filtering on boundaries There is no notable difference between the two solutions m terms of R-D performance as well as perceptual quality (for instance, if used for lossy image/video compression)
Turning now to Figure 11, the reversible overlap operator having the desired R-D efficient (l e, unit determinant) property is then realized as a linear phase pre-filter structure 700 (Figure 7) that includes reversible, unit-determinant Hadamard networks 710, reversible orthogonal rotations 1110 (for component matrices Ul, U2, VI and V2), and reversible unit-determinant scaling 1120 (e g, using the lifting step procedures 800, 900 or cascade 1100) The post filter is analogous to the pre filter and is built using the same construction, albeit with inverse lifting steps in the reverse order This is illustrated in Figure 7, where the number of data values M in the block is in general any natural number Although the illustration is for even valued AS, odd values are also possible by noting that the "1 point Hadamard" transform of the center data value is itself This procedure can be generalized to higher dimensional data.
In summary, the operation of the reversible overlap operator is illustrated m Figure 12 In a first step 1210, the input 2-dimensional digital media data is tiled into blocks (as also shown for the encoder 400 m Figure 4) The reversible overlap operator applies a Hadamard network 710 across adjacent tiles at step 1220 The operator then applies reversible rotations to sums and differences at step 1230, followed by the
reversible scaling operator at step 1240 This is followed by another reversible block rotation (step 1250), and reversible inverse Hadamard network (step 1260)
With reference now to Figure 13, the matrix representations of the reversible block rotations and scaling operators depend upon the desired lapped operator using for instance the arithmetic described in equation (1) Figure 13 shows an example of a post-filter having the structure 700 shown in Figures 7 and 11, which is preceded by a reversible block transform (4 point Hadamard transform in this case) The transfer function of the post-filter is
(Equation Removed)
The low pass component of the Hadamard produces the impulse response shown in the graph in Figure 14
4. Computing Environment
The above described codec based on a lapped transform using a reversible overlap operator can be performed on any of a variety of devices in which digital media signal processing is performed, including among other examples, computers, image and video recording, transmission and receiving equipment, portable video players, video conferencing, and etc The digital media coding techniques can be implemented in hardware circuitry, as well as in digital media processing software executing within a computer or other computing environment, such as shown in Figure 15
Figure is illustrates a generalized example of a suitable computing environment (1500) in which described embodiments may be implemented The computing environment (1500) is not intended to suggest any limitation as to scope of use or functionality of the invention, as the present invention may be implemented in diverse general-purpose or special-purpose computing environments
With reference to Figure 15, the computing environment (1500) includes at least one processing unit (1510) and memory (1520) In Figure 15, this most basic configuration (1530) is included within a dashed line The processing unit (1510) executes computer-executable instructions and may be a real or a virtual processor In a multi-processing system, multiple processing units execute computer-executable instructions to increase processing power The memory (1520) may be volatile memory (e g, registers, cache, RAM), non-volatile memory (e g, ROM, EEPROM, flash memory, etc), or some combination of the two The memory (1520) stores software (1580) implementing the descnbed encoder/decoder and transforms
A computing environment may have additional features For example, the computing environment (1500) includes storage (1540), one or more input devices (1550), one or more output devices (1560), and one or more commumcation connections (15 70) An interconnection mechanism (not shown) such as a bus, controller, or network interconnects the components of the computing environment (1500) Typically, operating system software (not shown) provides an operating environment for other software executing m the computing environment (1500), and coordinates activities of the components of the computing environment (1500)
The storage (1540) may be removable or non-removable, and includes magnetic disks, magnetic tapes or cassettes, CD-ROMs, CD-RWs, DVDs, or any other medium which can be used to store information and which can be accessed within the computing environment (1500) The storage (1540) stores instructions for the software (1580) implementing the codec based on a lapped transform using the reversible overlap operator
The input device(s) (1550) may be a touch input device such as a keyboard, mouse, pen, or trackball, a voice mput device, a scanning device, or another device that provides input to the computing environment (1500) For audio, the mput device(s) (1550) may be a sound card or similar device that accepts audio mput in analog or digital form, or a CD-ROM reader that provides audio samples to the computing environment The output device(s) (1560) may be a display, printer, speaker, CD-wnter, or another device that provides output from the computing environment (1500)
The communication connection(s) (1570) enable communication over a communication medium to another computing entity The communication medium conveys mformation such as computer-executable instructions, compressed audio or video mformation, or other data m a modulated data signal A modulated data signal is a signal that has one or more of its characteristics set or changed in such a manner as to encode mformation m the signal By way of example, and not limitation, communication media include wired or wireless techniques implemented with an electrical, optical, RF, infrared, acoustic, or other earner
The digital media processmg techniques herein can be described in the general context of computer-readable media Computer-readable media are any available media that can be accessed within a computing environment By way of example, and not limitation, with the computing environment (1500), computer-readable media include memory (1520), storage (1540), communication media, and combinations of any of the above
The digital media processmg techniques herem can be described in the general context of computer-executable instructions, such as those included m program modules, bemg executed in a computing environment on a target real or virtual processor Generally, program modules include routines, programs, libraries, objects, classes, components, data structures, etc that perform particular tasks or implement particular abstract data types The functionality of the program modules may be combined or spht between program modules as desired m various embodiments Computer-executable instructions for program modules may be executed within a local or distributed computing environment
For the sake of presentation, the detailed description uses terms like "determine," "generate," "adjust," and "apply" to describe computer operations in a computing environment These terms are high-level abstractions for operations performed by a computer, and should not be confused with acts performed by a human bemg The actual computer operations corresponding to these terms vary depending on implementation
4. Variations And Extensions Of the Reversible Overlap Operator
Various modifications and extensions of the above described reversible overlap operator can be made Although the descriptions presented above are for one dimensional data, the same procedure may be applied separably, or non-separably to multiple data dimensions
The orthogonal rotations in the above described reversible overlap operator implementation be replaced by approximations thereof, or by other transforms which may not be orthogonal
Furthermore, although the primary focus in the above description has been on the lossless recovery of input data, the same transform may be used for lossy data compression as well In this case, the loss may occur either in the quantization process, or due to a limited precision/approximate implementation of either pre filter or post filter, or due to other inaccuracies, or a combination of multiple factors
The reversible overlap operator described here may be applied to domains outside of data-compression The lapped transform using the reversible overlap operator may itself be expansive
The reversible overlap operator may be applied, in appropriately modified form, to implement mulurate filter banks, wavelets, lapped transforms with support spanning more than 2 block widths (K> 2 N)
The reversible overlap operator may be applied in a spatially varying manner, in which the extent and shape of overlap filter may vary across the spatial extent of the data.
In view of the many possible embodiments to which the principles of our invention may be applied, we claim as our invention all such embodiments as may come within the scope and spirit of the following claims and equivalents thereto.
We Claim :-
1. A method of encoding and/or decoding digital media data tiled in blocks,
the method comprising:
applying a reversible overlap operator across at least some of the boundaries between blocks, the reversible overlap operator being unit determinant; and
applying a reversible block transform to the blocks,
wherein the applying the reversible overlap operator and the reversible block transform are inverted between encoding and decoding the digital media data.
2. The method of claim 1, wherein the reversible overlap operator is structured of a plurality of components each being unit determinant. .
3. The method of claim 2, wherein the applying the reversible overlap operator comprises:
applying a reversible Hadamard network; applying a reversible block rotation; applying a reversible scaling operator; applying another reversible block rotation; and applying a reversible inverse Hadamard network.
4. The method of claim 3, wherein the applying the reversible scaling
operator comprises performing a four-lifting step procedure comprising:
performing a first lifting step summing a first value with the product of a second value multiplied by a ratio of a scaling factor and one minus the square of the scaling factor;
performing a second lifting step summing the second value with the product of the first sum multiplied by the scaling factor;
performing a third lifting step summing the first sum with the product of the second sum multiplied by the negative of the scaling factor; and
performing a fourth lifting step summing the second sum with the product of the third sum multiplied by the negative of the ratio of the scaling factor and one minus the square of the scaling factor.
5. The method of claim 3, wherein the applying the reversible scaling operator comprises performing a 6ve-lifting step procedure comprising:
performing a first lifting step summing a second value with the first value; performing a second lifting step summing the first value with the product of the
first sum multiplied by
(Equation Removed)
where s is a scaling factor and c is a constant;
performing a third lifting step summing the first sum with the product of the second sum multiplied by
(Equation Removed)
performing a fourth lifting step summing the second sum with the product of the third sum multiplied by -(Equation Removed)
; and
performing a fifth lifting step summing the third sum with the product of the negative of the fourth sum.
6. The method of claim 5, wherein one or more of the lifting steps is combined with or canceled by another step of the reversible overlap operator:
7. The method of claim 5, wherein the applying the reversible scaling operator comprises scaling M data paths by M scaling factors Si through sM using a cascade of 2-point scaling.
8. A digital media encoder and/or decoder comprising:
a data storage buffer for storing digital media data to be encoded and/or decoded; a processor programmed to:
tile the digital media data into blocks;
apply a reversible overlap operator across at least some of the boundaries between blocks, the reversible overlap operator being unit determinant; and
apply a reversible block transform to the blocks, wherein application of the reversible overlap operator and the reversible block transform are inverted between encoding and decoding the digital media data.
9. The digital media encoder and/or decoder of claim 8, wherein the reversible overlap operator is structured of a plurality of component transforms each being unit determinant.
10. The digital media encoder and/or decoder of claim 9, wherein the processor realizes the reversible overlap operator by:
applying a reversible Hadamard network; applying a reversible block rotation; applying a reversible scaling operator; applying another reversible block rotation; and applying a reversible inverse Hadamard network.
11. The digital media encoder and/or decoder of claim 10, wherein the
processor in applying the reversible scaling operator performs a four-lifting step
procedure comprising:
performing a first lifting step summing a first value with the product of a second value multiplied by a ratio of a scaling factor and one minus the square of the scaling factor;
performing a second lifting step summing the second value with the product of the first sum multiplied by the scaling factor;
performing a third lifting step summing the first sum with the product of the second sum multiplied by the negative of the scaling factor, and
performing a fourth lifting step summing the second sum with the product of the third sum multiplied by the negative of the ratio of the scaling factor and one minus the square of the scaling factor.
12. The digital media encoder and/or decoder of claim 10, wherein the
processor in applying the reversible scaling operator performs a five-lifting step
. procedure comprising:
performing a first lifting step summing a second value with the first value; performing a second lifting step summing the first value with the product of the
first sum multiplied by
(Equation Removed)
where 5 is a scaling factor and c is a constant;
performing a third lifting step summing the first sum with the product of the second sum multiplied by
(Equation Removed)
performing a fourth lifting step summing the second sum with the product of the third sum multiplied by
(Equation Removed)
and
performing a fifth lifting step summing the third sum with the product of the negative of the fourth sum.
13. The digital media encoder and/or decoder of claim 12, wherein one or more of the lifting steps is combined with or canceled by another step of the reversible overlap operator:
14. The digital media encoder and/or decoder of claim 12, wherein the processor realizes the reversible scaling operator by scaling M data paths by M scaling factors si through SM using a cascade of 2-point scaling.
15. At least one computer-readable recording medium carrying a computer-
executable digital media processing program thereon for performing a method of
processing digital media tiled in blocks, the method comprising:
applying a reversible overlap operator across at least some of the boundaries between blocks, the reversible overlap operator being unit determinant; and
applying a reversible block transform to the blocks,
wherein the applying the reversible overlap operator and the reversible block transform are inverted between encoding and decoding the digital media data.
16. The at least one computer-readable recording medium of claim 15, wherein the reversible overlap operator is structured of a plurality of components each being unit determinant.
17. The at least one computer-readable recording medium of claim 16, wherein the applying the reversible overlap operator comprises:
applying a reversible Hadamard network; applying a reversible block rotation; -applying a reversible scaling operator; applying another reversible block rotation; and applying a reversible inverse Hadamard network.
18. The at least one computer-readable recording medium of claim 17,
wherein the applying the reversible scaling operator comprises performing a four-lifting
step procedure comprising:
performing a first lifting step summing a first value with the product of a second value multiplied by a ratio of a scaling factor and one minus the square of the scaling factor;
performing a second lifting step summing the second value with the product of the first sum multiplied by the scaling factor;
performing a third lifting step summing the first sum with the product of the second sum multiplied by the negative of the scaling factor; and .
performing a fourth lifting step summing the second sum with the product of the third sum multiplied by the negative of the ratio of the scaling factor and one minus the square of the scaling factor.
19. The at least one computer-readable recording medium of claim 17, wherein the applying the reversible scaling operator comprises performing a five-lifting step procedure comprising:
performing a first lifting step summing a second value with the first value;
performing a second lifting step summing the first value with the product of the
first sum multiplied by
(Equation Removed)
where s is a scaling factor and c is a constant;
performing a third lifting step summing the first sum with the product of the second sum multiplied by
(Equation Removed)
performing a fourth lifting step summing the second sum with the product of the
third sum multiplied by
(Equation Removed)
and
performing a fifth lifting step summing the third sum with the product of the negative of the fourth sum.
20. The at least one computer-readable recording medium of claim 19, wherein one or more of the lifting steps is combined with or canceled by another step of the reversible overlap operator:
21. The at least one computer-readable recording medium of claim 19, wherein the applying the reversible scaling operator comprises scaling M data paths by M scaling factors S1 through SM using a cascade of 2-point scaling.
| Section | Controller | Decision Date |
|---|---|---|
| # | Name | Date |
|---|---|---|
| 1 | 2936-del-2005-Claims-(15-10-2008).pdf | 2008-10-15 |
| 1 | 2936-DEL-2005-RELEVANT DOCUMENTS [15-09-2023(online)].pdf | 2023-09-15 |
| 2 | 2936-del-2005-Form-13-(15-12-2008).pdf | 2008-12-15 |
| 2 | 2936-DEL-2005-RELEVANT DOCUMENTS [26-09-2022(online)].pdf | 2022-09-26 |
| 3 | 2936-DEL-2005-RELEVANT DOCUMENTS [22-09-2021(online)].pdf | 2021-09-22 |
| 3 | 2936-DEL-2005-GPA-(16-06-2010).pdf | 2010-06-16 |
| 4 | 2936-DEL-2005-RELEVANT DOCUMENTS [27-03-2020(online)].pdf | 2020-03-27 |
| 4 | 2936-DEL-2005-Correspondence-Others-(16-06-2010).pdf | 2010-06-16 |
| 5 | 2936-DEL-2005-IntimationOfGrant27-05-2019.pdf | 2019-05-27 |
| 5 | 2936-DEL-2005-Form-1-(29-12-2010).pdf | 2010-12-29 |
| 6 | 2936-DEL-2005-PatentCertificate27-05-2019.pdf | 2019-05-27 |
| 6 | 2936-DEL-2005-Correspondence-Others-(29-12-2010).pdf | 2010-12-29 |
| 7 | 2936-DEL-2005-Response to office action (Mandatory) [21-05-2019(online)].pdf | 2019-05-21 |
| 7 | 2936-del-2005-gpa.pdf | 2011-08-21 |
| 8 | 2936-DEL-2005-Written submissions and relevant documents (MANDATORY) [20-05-2019(online)].pdf | 2019-05-20 |
| 8 | 2936-del-2005-form-5.pdf | 2011-08-21 |
| 9 | 2936-del-2005-form-3.pdf | 2011-08-21 |
| 9 | 2936-DEL-2005-PETITION UNDER RULE 137 [17-05-2019(online)].pdf | 2019-05-17 |
| 10 | 2936-DEL-2005-Correspondence to notify the Controller (Mandatory) [30-04-2019(online)].pdf | 2019-04-30 |
| 10 | 2936-del-2005-form-2.pdf | 2011-08-21 |
| 11 | 2936-del-2005-form-18.pdf | 2011-08-21 |
| 11 | 2936-DEL-2005-HearingNoticeLetter.pdf | 2019-04-08 |
| 12 | 2936-del-2005-form-13.pdf | 2011-08-21 |
| 12 | Abstract [13-10-2015(online)].pdf | 2015-10-13 |
| 13 | 2936-del-2005-form-1.pdf | 2011-08-21 |
| 13 | Claims [13-10-2015(online)].pdf | 2015-10-13 |
| 14 | 2936-del-2005-drawings.pdf | 2011-08-21 |
| 14 | Correspondence [13-10-2015(online)].pdf | 2015-10-13 |
| 15 | 2936-del-2005-description (complete).pdf | 2011-08-21 |
| 15 | Description(Complete) [13-10-2015(online)].pdf | 2015-10-13 |
| 16 | 2936-del-2005-correspondence-others.pdf | 2011-08-21 |
| 16 | Examination Report Reply Recieved [13-10-2015(online)].pdf | 2015-10-13 |
| 17 | OTHERS [13-10-2015(online)].pdf | 2015-10-13 |
| 17 | 2936-del-2005-claims.pdf | 2011-08-21 |
| 18 | 2936-del-2005-assignment.pdf | 2011-08-21 |
| 18 | 2936-del-2005-Correspondence Others-(12-10-2015).pdf | 2015-10-12 |
| 19 | 2936-del-2005-abstract.pdf | 2011-08-21 |
| 19 | details under sec 8.pdf | 2015-06-24 |
| 20 | MTL-GPOA - MLK1.pdf ONLINE | 2015-03-05 |
| 20 | new covering letter.pdf | 2015-06-24 |
| 21 | MS to MTL Assignment.pdf ONLINE | 2015-03-05 |
| 21 | new covering letter.pdf_4653.pdf | 2015-06-24 |
| 22 | 2936-del-2005-FER-(26-05-2015).pdf | 2015-05-26 |
| 22 | FORM-6-701-800(MLK).24.pdf ONLINE | 2015-03-05 |
| 23 | FORM-6-701-800(MLK).24.pdf | 2015-03-13 |
| 23 | MTL-GPOA - MLK1.pdf | 2015-03-13 |
| 24 | MS to MTL Assignment.pdf | 2015-03-13 |
| 25 | MTL-GPOA - MLK1.pdf | 2015-03-13 |
| 25 | FORM-6-701-800(MLK).24.pdf | 2015-03-13 |
| 26 | 2936-del-2005-FER-(26-05-2015).pdf | 2015-05-26 |
| 26 | FORM-6-701-800(MLK).24.pdf ONLINE | 2015-03-05 |
| 27 | MS to MTL Assignment.pdf ONLINE | 2015-03-05 |
| 27 | new covering letter.pdf_4653.pdf | 2015-06-24 |
| 28 | MTL-GPOA - MLK1.pdf ONLINE | 2015-03-05 |
| 28 | new covering letter.pdf | 2015-06-24 |
| 29 | 2936-del-2005-abstract.pdf | 2011-08-21 |
| 29 | details under sec 8.pdf | 2015-06-24 |
| 30 | 2936-del-2005-assignment.pdf | 2011-08-21 |
| 30 | 2936-del-2005-Correspondence Others-(12-10-2015).pdf | 2015-10-12 |
| 31 | 2936-del-2005-claims.pdf | 2011-08-21 |
| 31 | OTHERS [13-10-2015(online)].pdf | 2015-10-13 |
| 32 | 2936-del-2005-correspondence-others.pdf | 2011-08-21 |
| 32 | Examination Report Reply Recieved [13-10-2015(online)].pdf | 2015-10-13 |
| 33 | 2936-del-2005-description (complete).pdf | 2011-08-21 |
| 33 | Description(Complete) [13-10-2015(online)].pdf | 2015-10-13 |
| 34 | 2936-del-2005-drawings.pdf | 2011-08-21 |
| 34 | Correspondence [13-10-2015(online)].pdf | 2015-10-13 |
| 35 | 2936-del-2005-form-1.pdf | 2011-08-21 |
| 35 | Claims [13-10-2015(online)].pdf | 2015-10-13 |
| 36 | Abstract [13-10-2015(online)].pdf | 2015-10-13 |
| 36 | 2936-del-2005-form-13.pdf | 2011-08-21 |
| 37 | 2936-del-2005-form-18.pdf | 2011-08-21 |
| 37 | 2936-DEL-2005-HearingNoticeLetter.pdf | 2019-04-08 |
| 38 | 2936-DEL-2005-Correspondence to notify the Controller (Mandatory) [30-04-2019(online)].pdf | 2019-04-30 |
| 38 | 2936-del-2005-form-2.pdf | 2011-08-21 |
| 39 | 2936-del-2005-form-3.pdf | 2011-08-21 |
| 39 | 2936-DEL-2005-PETITION UNDER RULE 137 [17-05-2019(online)].pdf | 2019-05-17 |
| 40 | 2936-del-2005-form-5.pdf | 2011-08-21 |
| 40 | 2936-DEL-2005-Written submissions and relevant documents (MANDATORY) [20-05-2019(online)].pdf | 2019-05-20 |
| 41 | 2936-del-2005-gpa.pdf | 2011-08-21 |
| 41 | 2936-DEL-2005-Response to office action (Mandatory) [21-05-2019(online)].pdf | 2019-05-21 |
| 42 | 2936-DEL-2005-PatentCertificate27-05-2019.pdf | 2019-05-27 |
| 42 | 2936-DEL-2005-Correspondence-Others-(29-12-2010).pdf | 2010-12-29 |
| 43 | 2936-DEL-2005-IntimationOfGrant27-05-2019.pdf | 2019-05-27 |
| 43 | 2936-DEL-2005-Form-1-(29-12-2010).pdf | 2010-12-29 |
| 44 | 2936-DEL-2005-RELEVANT DOCUMENTS [27-03-2020(online)].pdf | 2020-03-27 |
| 44 | 2936-DEL-2005-Correspondence-Others-(16-06-2010).pdf | 2010-06-16 |
| 45 | 2936-DEL-2005-RELEVANT DOCUMENTS [22-09-2021(online)].pdf | 2021-09-22 |
| 45 | 2936-DEL-2005-GPA-(16-06-2010).pdf | 2010-06-16 |
| 46 | 2936-DEL-2005-RELEVANT DOCUMENTS [26-09-2022(online)].pdf | 2022-09-26 |
| 46 | 2936-del-2005-Form-13-(15-12-2008).pdf | 2008-12-15 |
| 47 | 2936-del-2005-Claims-(15-10-2008).pdf | 2008-10-15 |
| 47 | 2936-DEL-2005-RELEVANT DOCUMENTS [15-09-2023(online)].pdf | 2023-09-15 |