
Device And Method For Processing At Least Two Input Values

For the reduction of the rounding error, a first and a second non-integer input value are provided (260, 262) and combined (268), for example by addition, in non-integer state to obtain a non-integer result value, which is rounded and added (269) to a third input value. Thus, the rounding error may be reduced at an interface between two rotations divided into lifting steps, or between a first rotation divided into lifting steps and a first lifting step of a subsequent multi-dimensional lifting sequence.


Patent Information

Filing Date
15 March 2006
Publication Number
31/2007
Invention Field
PHYSICS
Grant Date
2013-11-27

Applicants

FRAUNHOFER-GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
HANSASTRASSE 27 C 80686 MUNICH GERMANY

Inventors

1. GEIGER, RALF
MUNZSTRASSE 8C 98693 ILMENAU GERMANY
2. SCHULLER, GERALD
LEOPOLDSTRASSE 13 99089 ERFURT GERMANY
3. SPORER, THOMAS
KIELER STR. 7A 90766 FURTH GERMANY

Specification

DEVICE AND METHOD FOR PROCESSING AT LEAST TWO INPUT VALUES
The present invention relates to a device and method for processing at least two input
values in signal processing, and particularly to the signal processing of sequential values,
such as audio samples or video samples, and is particularly suitable for
lossless coding applications.
The present invention is further suitable for compression algorithms for discrete values
comprising audio and / or image information, and particularly for coding algorithms
including a transform in the frequency domain or time domain or location domain, which
are followed by a coding, such as an entropy coding in the form of a Huffman or
arithmetic coding.
Modern audio coding methods, such as MPEG Layer 3 (MP3) or MPEG AAC, use
transforms, such as the so-called modified discrete cosine transform (MDCT), to obtain a
block-wise frequency representation of an audio signal. Such an audio coder usually
obtains a stream of time-discrete audio samples. The stream of audio samples is
windowed to obtain a windowed block of, for example, 1,024 or 2,048 windowed audio
samples. For the windowing, various window functions are employed, such as a sine
window, etc.
The windowed time-discrete audio samples are then converted to a spectral
representation by means of a filter bank. In principle, a Fourier transform or, for
special reasons, a variety of the Fourier transform, such as an FFT or, as discussed, an
MDCT, may be employed for this. The block of audio spectral values at the output of
the filter bank may then be processed further, as necessary. In the above audio coders,
a quantization of the audio spectral values follows, wherein the quantization
stages are typically chosen so that the quantization noise introduced by the

quantizing is below the psychoacoustic masking threshold,
i.e. is "masked away". The quantization is a lossy coding.
In order to obtain further data amount reduction, the
quantized spectral values are then entropy-coded, for
example by means of Huffman coding. By adding side
information, such as scale factors etc., a bit stream,
which may be stored or transmitted, is formed from the
entropy-coded quantized spectral values by means of a bit
stream multiplexer.
In the audio decoder, the bit stream is split up into coded
quantized spectral values and side information by means of
a bit stream demultiplexer. The entropy-coded quantized
spectral values are first entropy-decoded to obtain the
quantized spectral values. The quantized spectral values
are then inversely quantized to obtain decoded spectral
values comprising quantization noise, which, however, is
below the psychoacoustic masking threshold and will thus be
inaudible. These spectral values are then converted to a
temporal representation by means of a synthesis filter bank
to obtain time-discrete decoded audio samples. In the
synthesis filter bank, a transform algorithm inverse to the
transform algorithm has to be employed. Moreover, the
windowing has to be reversed after the frequency-time
backward transform.
In order to achieve good frequency selectivity, modern
audio coders typically use block overlap. Such a case is
illustrated in Fig. 6a. First, for example, 2,048 time-
discrete audio samples are taken and windowed by means of
means 402. The window embodying means 402 has a window
length of 2N samples and provides a block of 2N windowed
samples on the output side. In order to achieve a window
overlap, a second block of 2N windowed samples is formed by
means of means 404, which is illustrated separate from
means 402 in Fig. 6a only for reasons of clarity. The 2,048
samples fed to means 404, however, are not the time-
discrete audio samples immediately subsequent to the first

window, but contain the second half of the samples windowed
by means 402 and additionally contain only 1,024 "new"
samples. The overlap is symbolically illustrated by means
406 in Fig. 6a, causing an overlapping degree of 50%. Both
the 2N windowed samples output by means 402 and the 2N
windowed samples output by means 404 are then subjected to
the MDCT algorithm by means of means 408 and 410,
respectively. Means 408 provides N spectral values for the
first window according to the known MDCT algorithm, whereas
means 410 also provides N spectral values, but for the
second window, wherein there is an overlap of 50% between
the first window and the second window.
In the decoder, the N spectral values of the first window,
as shown in Fig. 6b, are fed to means 412 performing an
inverse modified discrete cosine transform. The same
applies to the N spectral values of the second window. They
are fed to means 414 also performing an inverse modified
discrete cosine transform. Both means 412 and means 414
each provide 2N samples for the first window and 2N samples
for the second window, respectively.
In means 416, designated TDAC (time domain aliasing
cancellation) in Fig. 6b, the fact is taken into account
that the two windows are overlapping. In particular, a
sample y1 of the second half of the first window, i.e. with
an index N+k, is summed with a sample y2 from the first
half of the second window, i.e. with an index k, so that N
decoded temporal samples result on the output side, i.e. in
the decoder.
It is to be noted that, by the function of means 416, which
is also referred to as add function, the windowing
performed in the coder schematically illustrated by Fig. 6a
is taken into account somewhat automatically, so that no
explicit "inverse windowing" has to take place in the
decoder illustrated by Fig. 6b.
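The overlap-add performed by means 416 can be sketched as follows; a minimal illustration with hypothetical block contents, using the index convention of the text (sample N+k of the first inverse-transformed block is summed with sample k of the second):

```python
N = 4
block1 = [float(i) for i in range(2 * N)]        # 2N outputs of the first inverse transform
block2 = [float(i + 10) for i in range(2 * N)]   # 2N outputs of the second inverse transform

# N decoded time samples from the 50% overlap region: sample N+k of the
# first block is summed with sample k of the second block (means 416).
decoded = [block1[N + k] + block2[k] for k in range(N)]
print(decoded)   # [14.0, 16.0, 18.0, 20.0]
```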

If the window function implemented by means 402 or 404 is
designated w(k), wherein the index k represents the time
index, the condition has to be met that the squared window
weight w(k) added to the squared window weight w(N+k)
together are 1, wherein k runs from 0 to N-1. If a sine
window is used whose window weightings follow the first
half-wave of the sine function, this condition is always
met, since the square of the sine and the square of the
cosine together result in the value 1 for each angle.
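This property of the sine window can be checked numerically; the sketch below assumes the common phase offset w(k) = sin(π(k+0.5)/(2N)), for which the condition holds:

```python
import math

# Check the TDAC condition w(k)^2 + w(N+k)^2 = 1 for a sine window of
# length 2N (phase offset k+0.5 is an assumption; a common choice).
N = 512
w = [math.sin(math.pi * (k + 0.5) / (2 * N)) for k in range(2 * N)]
assert all(abs(w[k] ** 2 + w[N + k] ** 2 - 1.0) < 1e-12 for k in range(N))
print("TDAC condition satisfied")
```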
In the window method with subsequent MDCT function
described in Fig. 6a, it is disadvantageous that the
windowing by multiplication of a time-discrete sample, when
thinking of a sine window, is achieved with a floating-
point number, since the sine of an angle between 0 and 180
degrees does not yield an integer, apart from the angle of
90 degrees. Even when integer time-discrete samples are
windowed, floating-point numbers result after the
windowing.
Therefore, even if no psychoacoustic coder is used, i.e. if
lossless coding is to be achieved, quantization will be
necessary at the output of means 408 and 410, respectively,
to be able to perform reasonably manageable entropy coding.
Generally, currently known integer transforms for lossless
audio and/or video coding are obtained by a decomposition
of the transforms used therein into Givens rotations and by
applying the lifting scheme to each Givens rotation. Thus a
rounding error is introduced in each step. For subsequent
stages of Givens rotations, the rounding error continues to
accumulate. The resulting approximation error becomes
problematic particularly for lossless audio coding
approaches, particularly when long transforms are used
providing, for example, 1,024 spectral values, as is
the case in the known MDCT with overlap and add (MDCT =
modified discrete cosine transform). Particularly in the
higher frequency range, where the audio signal typically

has a very low energy amount anyway, the approximation
error may quickly become larger than the actual signal, so
that these approaches are problematic with respect to
lossless coding and particularly with respect to the coding
efficiency that may be achieved by it.
With respect to the audio coding, integer transforms, i.e.
transform algorithms generating integer output values, are
particularly based on the known DCT-IV, which does not take
into account a DC component, while integer transforms for
image applications are rather based on the DCT-II, which
especially contains the provisions for the DC component.
Such integer transforms are, for example, known from Y. Zeng,
G. Bi and Z. Lin, "Integer sinusoidal transforms based on
lifting factorization", in Proc. ICASSP '01, May 2001, pp.
1181-1184; K. Komatsu and K. Sezaki, "Reversible
Discrete Cosine Transform", in Proc. ICASSP, 1998, vol. 3,
pp. 1769-1772; P. Hao and Q. Shi, "Matrix
factorizations for reversible integer mapping", IEEE Trans.
Signal Processing, vol. 49, pp. 2314-2324; and J. Wang,
J. Sun and S. Yu, "1-D and 2-D transforms from integers to
integers", in Proc. ICASSP '03, Hong Kong, April 2003.
As mentioned above, the integer transforms described there
are based on the decomposition of the transform into Givens
rotations and on the application of the known lifting
scheme to the Givens rotations, which results in the
problem of the accumulating rounding errors. This is
particularly due to the fact that, within a transform,
roundings must be performed many times, i.e. after each
lifting step, so that, particularly in long transforms
causing a correspondingly large number of lifting steps,
there must be a particularly large number of roundings. As
described, this results in an accumulated error and
particularly also in a relatively complex processing,
because rounding is performed after every lifting step to
perform the next lifting step.
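The accumulation of rounding errors over cascaded rounded rotations can be made visible with a small experiment (an illustrative setup, not from the source; the rounded rotation is built from three lifting steps with a rounding after each, as described in the text, and the deviation from the exact floating-point rotation typically grows with the number of stages):

```python
import math

def rounded_rotate(x, y, a):
    # Givens rotation as three lifting steps, each followed by a rounding
    t = (math.cos(a) - 1.0) / math.sin(a)
    x = x + round(t * y)
    y = y + round(math.sin(a) * x)
    x = x + round(t * y)
    return x, y

def exact_rotate(x, y, a):
    c, s = math.cos(a), math.sin(a)
    return c * x - s * y, s * x + c * y

xi, yi = 1000, -300        # integer path, rounded after every lifting step
xf, yf = 1000.0, -300.0    # exact floating-point reference path
errors = []
for n in range(32):
    xi, yi = rounded_rotate(xi, yi, 0.3)
    xf, yf = exact_rotate(xf, yf, 0.3)
    errors.append(max(abs(xi - xf), abs(yi - yf)))
print(errors[0], errors[-1])   # deviation after 1 and after 32 cascaded stages
```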

Subsequently, the decomposition of the MDCT windowing will
be illustrated again with respect to Figs. 9 to 11, as
described in DE 10129240 A1, wherein this decomposition of
the MDCT windowing into Givens rotations with lifting
matrices and corresponding roundings is advantageously
combinable with the concept discussed in Fig. 1 for the
conversion and in Fig. 2 for the inverse conversion, to
obtain a complete integer MDCT approximation, i.e. an
integer MDCT (IntMDCT) according to the present invention,
wherein both a forward and a backward transform concept are
given for the example of an MDCT.
Fig. 3 shows an overview diagram for the inventive
preferred device for processing time-discrete samples
representing an audio signal to obtain integer values based
on which the Int-MDCT integer transform algorithm is
operative. The time-discrete samples are windowed by the
device shown in Fig. 3 and optionally converted to a
spectral representation. The time-discrete samples supplied
to the device at an input 10 are windowed with a window w
with a length corresponding to 2N time-discrete samples to
achieve, at an output 12, integer windowed samples suitable
to be converted to a spectral representation by means of a
transform and particularly the means 14 for performing an
integer DCT. The integer DCT is designed to generate N
output values from N input values, which is in contrast to
the MDCT function 408 of Fig. 6a, which only generates N
spectral values from 2N windowed samples due to the MDCT
equation.
For windowing the time-discrete samples, first two time-
discrete samples are selected in means 16 which together
represent a vector of time-discrete samples. A time-
discrete sample selected by the means 16 is in the first
quarter of the window. The other time-discrete sample is in
the second quarter of the window, as discussed in more
detail with respect to Fig. 5. The vector generated by the

means 16 is now multiplied by a rotation matrix of
dimension 2 x 2, wherein this operation is not performed
directly, but by means of several so-called lifting
matrices.
A lifting matrix has the property of comprising only one
element which depends on the window w and is unequal to "1"
or "0".
The factorization of wavelet transforms in lifting steps is
presented in the specialist publication "Factoring Wavelet
Transforms Into Lifting Steps", Ingrid Daubechies and Wim
Sweldens, Preprint, Bell Laboratories, Lucent Technologies,
1996. Generally, a lifting scheme is a simple relation
between perfectly reconstructing filter pairs having the
same low-pass or high-pass filters. Each pair of
complementary filters may be factorized into lifting steps.
This applies particularly to Givens rotations. Consider the
case in which the polyphase matrix is a Givens rotation.
The following then applies:

$$\begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} = \begin{pmatrix} 1 & \frac{\cos\alpha - 1}{\sin\alpha} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \sin\alpha & 1 \end{pmatrix} \begin{pmatrix} 1 & \frac{\cos\alpha - 1}{\sin\alpha} \\ 0 & 1 \end{pmatrix} \qquad (1)$$

Each of the three lifting matrices on the right-hand side
of the equal sign has the value "1" for each main diagonal
element. Further, in each lifting matrix, one secondary
diagonal element equals 0 and one secondary diagonal
element depends on the rotation angle α.
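This factorization can be verified numerically; a minimal sketch with plain 2x2 matrix products and an arbitrarily chosen angle:

```python
import math

# Verify the three-lifting-step factorization of a Givens rotation: the
# outer matrices carry (cos a - 1)/sin a above the diagonal, the central
# one carries sin a below it; all main diagonal elements are 1.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a = 0.7
c, s = math.cos(a), math.sin(a)
L1 = [[1.0, (c - 1) / s], [0.0, 1.0]]   # first / third lifting matrix
L2 = [[1.0, 0.0], [s, 1.0]]             # central lifting matrix
R = matmul(matmul(L1, L2), L1)          # product should equal the rotation

rot = [[c, -s], [s, c]]
assert all(abs(R[i][j] - rot[i][j]) < 1e-12 for i in range(2) for j in range(2))
print("factorization verified")
```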
The vector is now multiplied by the third lifting matrix,
i.e. the lifting matrix on the far right in the above
equation, to obtain a first result vector. This is
illustrated in Fig. 3 by means 18. Now the first result
vector is rounded with any rounding function mapping the
set of real numbers into the set of integers, as

illustrated in Fig. 3 by means 20. At the output of the
means 20, a rounded first result vector is obtained. The
rounded first result vector is now supplied to means 22 for
multiplying it by the central, i.e. second, lifting matrix
to obtain a second result vector which is again rounded in
means 24 to obtain a rounded second result vector. The
rounded second result vector is now supplied to means 26
for multiplying it by the lifting matrix shown on the left
in the above equation, i.e. the first one, to obtain a
third result vector which is finally rounded by means of
means 28 to finally obtain integer windowed samples at the
output 12 which, if a spectral representation of the same
is desired, now have to be processed by means 14 to obtain
integer spectral values at a spectral output 30.
Preferably, the means 14 is implemented as integer DCT.
The discrete cosine transform according to type 4 (DCT-IV)
with a length N is given by the following equation:

$$X(m) = \sqrt{\frac{2}{N}} \sum_{k=0}^{N-1} x(k) \cos\left(\frac{\pi (2k+1)(2m+1)}{4N}\right), \qquad m = 0, \ldots, N-1$$

The coefficients of the DCT-IV form an orthonormal N x N
matrix. Each orthogonal N x N matrix may be decomposed into
N(N-1)/2 Givens rotations, as discussed in the specialist
publication P. P. Vaidyanathan, "Multirate Systems And
Filter Banks", Prentice Hall, Englewood Cliffs, 1993. It is
to be noted that other decompositions also exist.
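The orthonormality of the DCT-IV coefficient matrix may be checked directly from the definition; an illustrative sketch:

```python
import math

# Build the N x N DCT-IV coefficient matrix and check that it is
# orthonormal: the product with its own transpose gives the identity.
N = 8
C = [[math.sqrt(2.0 / N) * math.cos(math.pi * (2 * k + 1) * (2 * m + 1) / (4 * N))
      for k in range(N)] for m in range(N)]

for i in range(N):
    for j in range(N):
        dot = sum(C[i][k] * C[j][k] for k in range(N))
        assert abs(dot - (1.0 if i == j else 0.0)) < 1e-12
print("DCT-IV matrix is orthonormal")
```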
With respect to the classifications of the various DCT
algorithms, see H. S. Malvar, "Signal Processing With
Lapped Transforms", Artech House, 1992. Generally, the DCT
algorithms differ in the kind of their basis functions.
While the DCT-IV preferred herein includes non-symmetric
basis functions, i.e. a cosine quarter wave, a cosine 3/4
wave, a cosine 5/4 wave, a cosine 7/4 wave, etc., the

discrete cosine transform of, for example, type II (DCT-II)
has axially symmetric and point-symmetric basis functions.
The 0th basis function has a DC component, the first basis
function is half a cosine wave, the second basis function
is a whole cosine wave, etc. Due to the fact that the DCT-II
gives special emphasis to the DC component, it is used
in video coding, but not in audio coding, because the DC
component is not relevant in audio coding, in contrast to
video coding.
In the following, it is discussed how the rotation angle α
of the Givens rotation depends on the window function.
An MDCT with a window length of 2N may be reduced to a
discrete cosine transform of type IV with a length N.
This is achieved by explicitly performing the TDAC
operation in the time domain and then applying the DCT-IV.
In the case of a 50% overlap, the left half of the window
for a block t overlaps with the right half of the preceding
block, i.e. block t-1. The overlapping part of two
consecutive blocks t-1 and t is preprocessed in the time
domain, i.e. prior to the transform, as follows, i.e. it is
processed between the input 10 and the output 12 of Fig. 3:

The values marked with the tilde are the values at the
output 12 of Fig. 3, while the x values not marked with a
tilde in the above equation are the values at the input 10
and/or following the means 16 for selecting. The running
index k runs from 0 to N/2-1, while w represents the window
function.
From the TDAC condition for the window function w, the
following applies:

For certain angles αk, k = 0, ..., N/2-1, this preprocessing
in the time domain may be written as a Givens rotation, as
discussed.
The angle α of the Givens rotation depends on the window
function w as follows:

It is to be noted that any window functions w may be
employed as long as they fulfill this TDAC condition.
In the following, a cascaded coder and decoder are
described with respect to Fig. 4. The time-discrete samples
x(0) to x(2N-1), which are "windowed" together by a window,
are first selected by the means 16 of Fig. 3 such that the
sample x(0) and the sample x(N-1), i.e. a sample from the
first quarter of the window and a sample from the second
quarter of the window, are selected to form the vector at
the output of the means 16. The crossing arrows
schematically represent the lifting multiplications and
subsequent roundings of the means 18, 20 and 22, 24 and 26,
28, respectively, to obtain the integer windowed samples at
the input of the DCT-IV blocks.
When the first vector has been processed as described
above, a second vector is further selected from the samples
x(N/2-1) and x(N/2), i.e. again a sample from the first
quarter of the window and a sample from the second quarter
of the window, and is again processed by the algorithm
described in Fig. 3. Analogously, all other sample pairs
from the first and second quarters of the window are
processed. The same processing is performed for the third
and fourth quarters of the first window. Now there are 2N
windowed integer samples at the output 12, which are now
supplied to a DCT-IV transform as illustrated in Fig. 4. In
particular, the integer windowed samples of the second and
third quarters are supplied to a DCT. The windowed integer
samples of the first quarter of the window are supplied
to a preceding DCT-IV together with the windowed integer
samples of the fourth quarter of the preceding window.
Analogously, in Fig. 4, the fourth quarter of the windowed
integer samples is supplied to a DCT-IV transform together
with the first quarter of the next window. The central
integer DCT-IV transform 32 shown in Fig. 4 now provides N
integer spectral values y(0) to y(N-1). These integer
spectral values may now, for example, simply be entropy-
coded without an interposed quantization being necessary,
because the windowing and transform yield integer output
values.
In the right half of Fig. 4, a decoder is illustrated. The
decoder, consisting of backward transform and "inverse
windowing", operates inversely to the coder. It is known
that an inverse DCT-IV may be used for the backward
transform of a DCT-IV, as illustrated in Fig. 4. The output
values of the decoder DCT-IV 34 are now inversely processed
with the corresponding values of the preceding transform
and/or the following transform, as illustrated in Fig. 4,
in order to generate again time-discrete audio samples x(0)
to x(2N-1) from the integer windowed samples at the output
of the means 34 and/or the preceding and following
transform.
The operation on the output side takes place by an inverse
Givens rotation, i.e. such that the blocks 26, 28 and 22,
24 and 18, 20, respectively, are traversed in the opposite
direction. This will be illustrated in more detail with
respect to the second lifting matrix of equation (1). When
(in the coder) the second result vector is formed by
multiplication of the rounded first result vector by the
second lifting matrix (means 22), the following expression
results:

$$\begin{pmatrix} 1 & 0 \\ \sin\alpha & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ x\sin\alpha + y \end{pmatrix} \qquad (6)$$

The values x, y on the right-hand side of equation (6) are
integers. This, however, does not apply to the value
x sin α. Here, the rounding function r must be introduced, as
illustrated in the following equation:

$$\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} x \\ y + r(x\sin\alpha) \end{pmatrix} \qquad (7)$$

This operation is performed by the means 24.
The inverse mapping (in the decoder) is defined as follows:

$$\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} x \\ y - r(x\sin\alpha) \end{pmatrix} \qquad (8)$$

Due to the minus sign in front of the rounding operation,
it becomes apparent that the integer approximation of the
lifting step may be reversed without introducing an error.
The application of this approximation to each of the three
lifting steps leads to an integer approximation of the
Givens rotation. The rounded rotation (in the coder) may be
reversed (in the decoder) without introducing an error by
traversing the inverse rounded lifting steps in reverse
order, i.e. if in decoding the algorithm of Fig. 3 is
performed from bottom to top.
If the rounding function r is point-symmetric, the inverse
rounded rotation is identical to the rounded rotation with
the angle -α and is expressed as follows:

$$\begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} = \begin{pmatrix} 1 & \frac{1-\cos\alpha}{\sin\alpha} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -\sin\alpha & 1 \end{pmatrix} \begin{pmatrix} 1 & \frac{1-\cos\alpha}{\sin\alpha} \\ 0 & 1 \end{pmatrix}$$
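The exact invertibility of a single rounded lifting step described above can be sketched as follows; Python's built-in round stands in for one valid choice of the rounding function r:

```python
import math

# A lifting step with rounding, y -> y + round(x*sin a), is inverted
# exactly by y -> y - round(x*sin a): x is unchanged, so the decoder
# recomputes the identical rounded value and subtracts it.
def lift(x, y, s):      # forward rounded lifting step (coder)
    return x, y + round(x * s)

def unlift(x, y, s):    # inverse rounded lifting step (decoder)
    return x, y - round(x * s)

s = math.sin(0.7)
for x in range(-5, 6):
    for y in range(-5, 6):
        assert unlift(*lift(x, y, s), s) == (x, y)
print("rounded lifting step is exactly invertible")
```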


The lifting matrices for the decoder, i.e. for the inverse
Givens rotation, in this case result directly from equation
(1) by merely replacing the expression "sin α" by the
expression "-sin α".
In the following, the decomposition of a common MDCT with
overlapping windows 40 to 46 is illustrated again with
respect to Fig. 5. The windows 40 to 46 each have a 50%
overlap. First, Givens rotations are performed per window
within the first and second quarters of a window and/or
within the third and fourth quarters of a window, as
illustrated schematically by arrows 48. Then the rotated
values, i.e. the windowed integer samples, are supplied to
an N-to-N DCT such that always the second and third
quarters of a window and the fourth and first quarters of a
subsequent window, respectively, are converted to a
spectral representation together by means of a DCT-IV
algorithm.
The common Givens rotations are therefore decomposed into
lifting matrices which are executed sequentially, wherein,
after each lifting matrix multiplication, a rounding step
is inserted, so that the floating-point numbers are
rounded immediately after being generated and, prior
to each multiplication of a result vector by a lifting
matrix, the result vector only contains integers.
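A complete rounded Givens rotation and its exact inverse can be sketched accordingly (an illustrative round-trip over small integer inputs; Python's round again stands in for the rounding function):

```python
import math

# Three rounded lifting steps form the rounded rotation; traversing the
# inverse steps in the opposite order reverses it without any error.
def rounded_rotation(x, y, a):
    t = (math.cos(a) - 1.0) / math.sin(a)
    x = x + round(t * y)             # third lifting matrix + rounding
    y = y + round(math.sin(a) * x)   # central lifting matrix + rounding
    x = x + round(t * y)             # first lifting matrix + rounding
    return x, y

def inverse_rounded_rotation(x, y, a):
    t = (math.cos(a) - 1.0) / math.sin(a)
    x = x - round(t * y)
    y = y - round(math.sin(a) * x)
    x = x - round(t * y)
    return x, y

a = 0.7
for x in range(-8, 9):
    for y in range(-8, 9):
        assert inverse_rounded_rotation(*rounded_rotation(x, y, a), a) == (x, y)
print("rounded rotation reversed without error")
```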
The output values thus always remain integer, wherein it is
preferred to use also integer input values. This does not
represent a limitation, because any exemplary PCM samples
as they are stored on a CD are integer numerical values
whose value range varies depending on bit width, i.e.
depending on whether the time-discrete digital input values
are 16-bit values or 24-bit values. Nevertheless, the whole
process is invertible, as discussed above, by performing

the inverse rotations in reverse order. There is thus an
integer approximation of the MDCT with perfect
reconstruction, i.e. a lossless transform.
The shown transform provides integer output values instead
of floating point values. It provides a perfect
reconstruction so that no error is introduced when a
forward and then a backward transform are performed.
According to a preferred embodiment of the present
invention, the transform is a substitution for the modified
discrete cosine transform. Other transform methods,
however, may also be performed with integers as long as a
decomposition into rotations and a decomposition of the
rotations into lifting steps is possible.
The integer MDCT has most of the favorable properties of
the MDCT. It has an overlapping structure, whereby a better
frequency selectivity is obtained than with non-overlapping
block transforms. Due to the TDAC function which is already
taken into account in windowing prior to the transform, a
critical sampling is maintained so that the total number of
spectral values representing an audio signal is equal to
the total number of input samples.
Compared to a normal MDCT providing floating point samples,
the described preferred integer transform shows that the
noise compared to the normal MDCT is increased only in the
spectral range in which there is little signal level, while
this noise increase does not become noticeable at
significant signal levels. Moreover, the integer processing
lends itself to an efficient hardware implementation,
because only multiplication steps are used which may
readily be decomposed into shift/add steps which may be
hardware-implemented in a simple and quick way. Of course,
a software implementation is also possible.
The integer transform provides a good spectral
representation of the audio signal and yet remains in the

area of integers. When it is applied to tonal parts of an
audio signal, this results in good energy concentration.
With this, an efficient lossless coding scheme may be built
up by simply cascading the windowing/transform illustrated
in Fig. 3 with an entropy coder. In particular, stacked
coding using escape values, as it is employed in MPEG AAC,
is advantageous. It is preferred to scale down all values
by a certain power of two until they fit in a desired code
table, and then additionally code the omitted least
significant bits. In comparison with the alternative of the
use of larger code tables, the described alternative is
more favorable with regard to the storage consumption for
storing the code tables. An almost lossless coder could
also be obtained by simply omitting certain ones of the
least significant bits.
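The stacked-coding idea can be sketched as follows (hypothetical helper names and values; sign handling is simplified, and a real coder would entropy-code the coarse value with escape codes and transmit the omitted bits separately):

```python
def split_value(v, table_max):
    # Shift the magnitude right until it fits the code table; the
    # shifted-off least significant bits are coded additionally.
    p, mag = 0, abs(v)
    while (mag >> p) > table_max:
        p += 1
    coarse = mag >> p                # value small enough for the code table
    lsbs = mag & ((1 << p) - 1)      # the p omitted least significant bits
    return (-coarse if v < 0 else coarse), p, lsbs

def join_value(coarse, p, lsbs):
    mag = (abs(coarse) << p) | lsbs
    return -mag if coarse < 0 else mag

q, p, lsbs = split_value(-1234, table_max=15)
assert join_value(q, p, lsbs) == -1234
```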
In particular for tonal signals, entropy coding of the
integer spectral values allows a high coding gain. For
transient parts of the signal, the coding gain is low,
namely due to the flat spectrum of transient signals, i.e.
due to a small number of spectral values equal to or almost
0. As described in J. Herre, J. D. Johnston: "Enhancing the
Performance of Perceptual Audio Coders by Using Temporal
Noise Shaping (TNS)" 101st AES Convention, Los Angeles,
1996, preprint 4384, this flatness may be used, however, by
using a linear prediction in the frequency domain. An
alternative is a prediction with open loop. Another
alternative is the predictor with closed loop. The first
alternative, i.e. the predictor with open loop, is called
TNS. The quantization after the prediction leads to an
adaptation of the resulting quantization noise to the
temporal structure of the audio signal and thus prevents
pre-echoes in psychoacoustic audio coders. For lossless
audio coding, the second alternative, i.e. with a predictor
with closed loop, is more suitable, since the prediction
with closed loop allows accurate reconstruction of the
input signal. When this technique is applied to a generated
spectrum, a rounding step has to be performed after each

step of the prediction filter in order to stay in the area
of the integers. By using the inverse filter and the same
rounding function, the original spectrum may accurately be
reproduced.
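The closed-loop prediction with rounding can be sketched as follows (an illustrative first-order predictor with an assumed coefficient; using the same rounding in coder and decoder makes the scheme exactly invertible):

```python
def predict(x, a=0.5):
    # Integer residuals: subtract the rounded prediction from each value.
    res, prev = [], 0
    for v in x:
        res.append(v - round(a * prev))
        prev = v
    return res

def reconstruct(res, a=0.5):
    # Inverse filter with the identical rounding reproduces the input.
    x, prev = [], 0
    for e in res:
        v = e + round(a * prev)
        x.append(v)
        prev = v
    return x

spectrum = [3, -7, 12, 0, 5, -2]
assert reconstruct(predict(spectrum)) == spectrum
```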
In order to make use of the redundancy between two channels
for data reduction, mid/side coding may also be employed
in a lossless manner if a rounded rotation with an angle
of π/4 is used. In comparison to the alternative of
calculating the sum and difference of the left and the
right channel of a stereo signal, the rounded rotation has
the advantage of energy conservation. The use of so-called
joint stereo coding techniques may be switched on or off
for each band, as it is also performed in the standard MPEG
AAC. Further rotation angles may also be considered to be
able to reduce redundancy between two channels more
flexibly.
Particularly the transform concept illustrated with respect
to Fig. 3 provides an integer implementation of the MDCT,
i.e. an IntMDCT, which operates losslessly with respect to
forward transform and subsequent backward transform. By the
rounding steps 20, 24, 28 and the corresponding rounding
steps in the integer DCT (block 14 in Fig. 3), an integer
processing is further always possible, i.e. a processing
with more coarsely quantized values than would be
generated, for example, by floating-point multiplication
with a lifting matrix (blocks 18, 22, 26 of Fig. 3).
The result is that the whole IntMDCT may be performed
computationally efficiently.
The losslessness of this IntMDCT or, generally speaking,
the losslessness of all coding algorithms referred to as
lossless is related to the fact that the signal, when it is
coded to achieve a coded signal and when it is afterwards
decoded again to achieve a coded/decoded signal, "looks"

exactly like the original signal. In other words, the
original signal is identical to the coded/decoded original
signal. This is an obvious contrast to a so-called lossy
coding, in which, as in the case of audio coders operating
on a psychoacoustic basis, data are irretrievably lost by
the coding process and particularly by the quantizing
process controlled by a psychoacoustic model.
Of course, rounding errors are still introduced. Thus, as
shown with respect to Fig. 3 in the blocks 20, 24, 28,
rounding steps are performed which, of course, introduce a
rounding error which is only "eliminated" in the decoder
when the inverse operations are performed. Thus, lossless
coding/decoding concepts differ essentially from lossy
coding/decoding concepts in that, in lossless
coding/decoding concepts, the rounding error is introduced
so that it may be eliminated again, while this is not the
case in lossy coding/decoding concepts.
However, if you consider the coded signal, i.e., in the
example of transform coders, the spectrum of a block of
temporal samples, the rounding in the forward transform
and/or generally the quantization of such a signal results
in an error being introduced in the signal. Thus, a
rounding error is superimposed on the ideal error-free
spectrum of the signal, the error typically being, for
example in the case of Fig. 3, white noise equally
distributed across all frequency components of the
considered spectral range. This white noise superimposed on the ideal
spectrum thus represents the rounding error which occurs,
for example, by the rounding in the blocks 20, 24, 28
during windowing, i.e. the pre-processing of the signal
prior to the actual DCT in block 14. It is particularly to
be noted that, for a losslessness requirement, the whole
rounding error must necessarily be coded, i.e. transmitted
to the decoder, because the decoder requires the whole
rounding error introduced in the coder to achieve a correct
lossless reconstruction.

The rounding error may not be problematic when nothing is
"done" with the spectral representation, i.e. when the
spectral representation is only stored, transmitted and
decoded again by a correctly matching inverse decoder. In
that case, the losslessness criterion will always be met,
irrespective of how much rounding error has been introduced
into the spectrum. If, however, something is done with the
spectral representation, i.e. with the ideal spectral
representation of an original signal containing a rounding
error, for example if scalability layers are generated,
etc., all these things work better, the smaller the
rounding error.
Thus, lossless coding/decoding is also subject to the
requirement that, on the one hand, a signal should be
losslessly reconstructable by special decoders, but that,
on the other hand, a signal should also have a minimal
rounding error in its spectral representation, to preserve
the flexibility of feeding the spectral representation to
non-ideal lossless decoders or of generating scaling
layers, etc.
As discussed above, the rounding error is expressed as
white noise across the entire considered spectrum. On the
other hand, particularly in high-quality applications,
which are especially interesting for the lossless case,
i.e. in audio applications with very high sampling
frequencies, such as 96 kHz, the audio signal only has
appreciable signal content in a certain spectral range,
which typically reaches up to, at most, 20 kHz.
Typically, the range in which most signal energy of the
audio signal is concentrated will be the range between 0
and 10 kHz, while the signal energy will considerably
decrease in the range above 10 kHz. However, this does not
matter to the white noise introduced by rounding. It
superimposes itself across the entire considered spectral
range of the signal energy. The result is that, in the
spectral ranges where there is no or only very little audio
signal energy, i.e. typically the high spectral ranges,
there will be only the rounding error. At the same time,
particularly due to its non-deterministic nature, the
rounding error is also difficult to code, i.e. it is only
codeable with relatively high bit requirements. In some
lossless applications, the bit requirements do not play the
decisive role. However, for lossless coding applications to
become more and more widespread, it is very important to
operate bit-efficiently here as well, so as to combine the
absence of quality reduction inherent in lossless
applications with a bit efficiency corresponding to that
known from lossy coding concepts.
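This behavior is easy to check numerically. The Python sketch below is an illustration only, not part of the specification; the test signal, its amplitudes and the 0.7071 scaling factor are arbitrary assumptions. It rounds a scaled low-frequency signal and compares the rounding-error energy in the lower and upper halves of the spectrum:

```python
import numpy as np

# Synthetic "audio" signal with all its energy at low frequencies
# (amplitudes and frequencies are arbitrary illustration values).
n = 4096
t = np.arange(n)
signal = 1000 * np.sin(2 * np.pi * 0.01 * t) + 300 * np.sin(2 * np.pi * 0.03 * t)

# A non-integer processing step (here: scaling by ~1/sqrt(2)) followed
# by rounding, as occurs inside an integer transform.
scaled = signal * 0.7071
error = np.round(scaled) - scaled          # rounding error, within +/- 0.5

# The error energy is spread evenly over the whole spectrum, although
# the signal itself has no energy in the upper half.
spec = np.abs(np.fft.rfft(error)) ** 2
low = spec[: len(spec) // 2].sum()
high = spec[len(spec) // 2:].sum()
print(low / high)                          # roughly 1: spectrally white
```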
Although a rounding error is thus unproblematic in a
lossless context in that it may be eliminated in the
decoding, it is still of considerable significance for
allowing the lossless decoding and/or reconstruction to be
performed in the first place. On the other hand, as already
discussed, the rounding error is responsible for the
spectral representation becoming defective, i.e. being
distorted as compared to an ideal spectral representation
of the unrounded signal. For special cases of application,
in which the spectral representation, i.e. the coded
signal, is actually important, i.e. when, for example,
various scaling layers are generated from the coded signal,
it is still desirable to obtain a coded representation with
a rounding error as small as possible from which, however,
no rounding error has been eliminated that is required for
a reconstruction.
It is the object of the present invention to provide an
artifact-reduced concept for processing input values.
This object is achieved by a device for processing at least
two input values in accordance with claim 1 or a method for
processing at least two input values in accordance with
claim 11 or a computer program in accordance with claim 12.

The present invention is based on the finding that,
whenever two values would actually have to be rounded
individually and the two rounded values then combined with
a third value, for example by addition, a reduction of the
rounding error may be achieved by first adding the two
values in unrounded state, i.e. as floating point
representation, rounding the sum, and only then adding this
output value to the third value. Compared to the usual
procedure, in which each value is rounded individually, the
inventive concept further results in saving one summation
process and one rounding process, so that the inventive
concept, in addition to reducing the rounding error, also
contributes to a more efficient algorithm execution.
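The effect can be illustrated with a small numerical experiment in Python (a sketch for illustration only; the value ranges and the helper names round_then_combine / combine_then_round are assumptions, not terms from the specification):

```python
import random

def round_then_combine(a, b, c):
    # Usual procedure: each non-integer value is rounded individually,
    # then both rounded values are added to the third value.
    return c + round(a) + round(b)

def combine_then_round(a, b, c):
    # Procedure described above: add the two values in unrounded
    # (floating point) state, round once, then add to the third value.
    return c + round(a + b)

random.seed(1)
trials = 10_000
mse_separate = mse_combined = 0.0
for _ in range(trials):
    a = random.uniform(-10.0, 10.0)
    b = random.uniform(-10.0, 10.0)
    c = random.randint(-100, 100)
    exact = c + a + b
    mse_separate += (round_then_combine(a, b, c) - exact) ** 2
    mse_combined += (combine_then_round(a, b, c) - exact) ** 2

# One rounding error (variance ~1/12) instead of two (~2/12):
print(mse_combined / trials, mse_separate / trials)
```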
In a preferred embodiment of the present invention, the
inventive concept is used for reducing the rounding error
when two rotations divided into lifting steps "abut" each
other, i.e. when there is a situation where first a first
value is to be "rotated" together with a third value, and
when the result of this first rotation is then again to be
rotated with a second value.
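For reference, a single rotation may be divided into three lifting steps, each followed by a rounding, which keeps the rotation exactly invertible on integers. The Python sketch below illustrates this using the standard lifting decomposition of a Givens rotation; it is an illustration, not code from the specification:

```python
import math

def lifting_rotate(x1, x2, alpha):
    """Apply a Givens rotation to integers (x1, x2) via three lifting
    steps, rounding after each step (valid for sin(alpha) != 0)."""
    t = (math.cos(alpha) - 1.0) / math.sin(alpha)
    s = math.sin(alpha)
    x1 = x1 + round(t * x2)   # first lifting step
    x2 = x2 + round(s * x1)   # second lifting step
    x1 = x1 + round(t * x2)   # third lifting step
    return x1, x2

def lifting_rotate_inv(x1, x2, alpha):
    """Undo the three lifting steps in reverse order with identical
    roundings, recovering the original integers exactly."""
    t = (math.cos(alpha) - 1.0) / math.sin(alpha)
    s = math.sin(alpha)
    x1 = x1 - round(t * x2)
    x2 = x2 - round(s * x1)
    x1 = x1 - round(t * x2)
    return x1, x2

y1, y2 = lifting_rotate(123, -45, 0.6)
print(lifting_rotate_inv(y1, y2, 0.6))  # (123, -45): lossless
```

When two such rotations abut, the last lifting step of the first rotation and the first lifting step of the second each contribute a rounded value to the same target, which is where the combination of the unrounded values described above saves one rounding.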
A further case of application of the inventive concept for
reducing the rounding error exists when a lifting stage of
a multi-dimensional lifting concept is preceded by
butterflies, as occurs when an N-point DCT is split into
two DCTs of half the length, i.e. with N/2 points each.
In this case, there will be a butterfly stage before the
actual multi-dimensional lifting and there will be a
rotation stage after the multi-dimensional lifting. In
particular, the roundings required by the butterfly stage
may be combined with the roundings of the first lifting
stage of the multi-dimensional lifting concept to reduce
the rounding error.

Since the number of rounding stages in the integer MDCT
with integer windowing/pre-processing and multi-dimensional
lifting processing for the transform is already
significantly reduced as compared to prior art without
applying the invention, particularly in this situation, the
inventive concept contributes to a significant reduction of
the remaining, although already small, rounding error. This
results in, for example, a spectrum now having only a small
deviation with respect to an ideal spectrum, due to the
still present, but now much reduced rounding error.
Particularly in the context of lossless coding/decoding,
the present invention may be combined with the spectral
shaping of the rounding error, wherein the still remaining
rounding error is spectrally shaped such that it is
"accommodated" in the frequency range of the signal to be
coded in which the signal has a high signal energy anyway,
and that, as a consequence, the rounding error is not
present in ranges in which the signal has no energy anyway.
While, in prior art, a rounding error was distributed as
white noise across the entire spectrum of the signal in
lossless coding, and particularly in lossless coding on the
basis of integer algorithms, the rounding error is here
superimposed on the ideal spectrum in the form of pink
noise, i.e. such that the noise energy due to the rounding
is present where the signal has its highest signal energy
anyway, and thus the noise due to the rounding error has
little or even no energy where the signal to be coded has
no energy itself. Thus the worst case is avoided, in which
the rounding error, which is a stochastic signal and thus
difficult to code, is the only signal to be coded in a
frequency range and unnecessarily increases the bit rate.
When considering an audio signal in which the energy is in
the low frequency range, the means for rounding is designed
to achieve a spectral low pass shaping of the generated
rounding error such that, at high frequencies of the coded

signal, there is neither signal energy nor noise energy,
while the rounding error is mapped into the range where the
signal has a lot of energy anyway.
Particularly for lossless coding applications, this is in
contrast to prior art where a rounding error is spectrally
high-pass filtered to get the rounding error outside of the
audible range. This also corresponds to the case where the
spectral range in which the rounding error is present is
filtered out either electronically or by the ear itself to
eliminate the rounding error. For lossless coding/decoding,
however, the rounding error is absolutely required in the
decoder, because otherwise the algorithm used in the
decoder, which is inverse to the lossless coding algorithm,
generates distortions.
The concept of the spectral shaping of the rounding error
is preferably used in lossless applications with a high
sampling rate, because, particularly in the cases where
spectra theoretically extend up to more than 40 kHz (due
to oversampling), the same situation is achieved in the
high frequency range, in which there is no signal energy
anyway, i.e. in which coding may be done very efficiently,
as in the case of a non-integer coding, in which the signal
energy is also zero in the high frequency range.
As a large number of zeros is coded very efficiently and
the rounding error, which is problematic to code, is
shifted to the range which is typically coded very finely
anyway, the overall data rate of the signal is thus reduced
as compared to the case in which the rounding error is
distributed as white noise across the entire frequency
range. Furthermore, the coding performance, and hence also
the decoding performance, is increased, because no
computing time has to be spent for the coding and decoding
of the high frequency range. The concept thus also has the
result that a faster signal processing may be achieved on
the part of the coder and/or on the part of the decoder.

In an embodiment, the concept of shaping/reducing the
approximation error is applied to invertible integer
transforms, particularly the IntMDCT. There are two areas
of application, namely, on the one hand, multidimensional
lifting, with which the MDCT is considerably simplified
with respect to the required rounding steps, and, on the
other hand, the rounding operations required in integer
windowing, such as occur in the pre-processing prior to
the actual DCT.
An error feedback concept is used for the spectral shaping
of the rounding error, in which the rounding error is
shifted to the frequency range in which the signal being
processed has the highest signal energy. For audio signals,
and particularly also for video signals, this will be the
low frequency range, so that the error feedback system has
a low-pass property. This results in fewer rounding errors
in the upper frequency range, in which there are normally
fewer signal components. In prior art, the rounding errors
prevail in the upper range, which must then be coded and
thus increase the number of bits required for coding.
Preferably, this rounding error is reduced in the higher
frequencies, which directly reduces the number of bits
required for coding.
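As a sketch of such an error feedback system with low-pass property (illustrative Python only; a first-order feedback of the previous rounding error is assumed, not the patent's exact filter), consider:

```python
import numpy as np

def round_with_lowpass_shaping(values):
    """Round a sequence while feeding the previous rounding error back
    into the input, so the error spectrum gets a low-pass shape
    (error transfer function 1 + z^-1 in this first-order sketch)."""
    out = np.empty(len(values))
    e_prev = 0.0
    for i, x in enumerate(values):
        v = x + e_prev            # feed back last rounding error
        y = np.round(v)
        e_prev = y - v            # current rounding error
        out[i] = y
    return out

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 100.0, 4096)
err = round_with_lowpass_shaping(x) - x
spec = np.abs(np.fft.rfft(err)) ** 2
half = len(spec) // 2
print(spec[:half].sum() > spec[half:].sum())  # True: error sits in the low band
```

The total error per sample stays bounded (here by 1.0), but its energy is concentrated at low frequencies, where an audio signal typically has its energy anyway.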
BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS
Preferred embodiments of the present invention are
explained in more detail below with respect to the
accompanying drawings, in which:
Fig. 1 shows a block circuit diagram of the concept for
processing a signal having a sequence of discrete
values with spectral shaping of the rounding
error;
Fig. 2a shows a known concept for high-pass spectral
shaping of a quantization error;

Fig. 2b shows a concept for low-pass shaping the rounding
error;
Fig. 2c shows a block circuit diagram according to an
embodiment for the spectral shaping/rounding
block;
Fig. 3 shows a block circuit diagram of a preferred
means for processing time-discrete audio samples
to obtain integer values from which integer
spectral values may be determined;
Fig. 4 is a schematic illustration of the decomposition
of an MDCT and an inverse MDCT into Givens
rotations and two DCT-IV operations;
Fig. 5 is an illustration of the decomposition of the
MDCT with 50 percent overlap into rotations and
DCT-IV operations;
Fig. 6a shows a schematic block circuit diagram of a
known coder with MDCT and 50 percent overlap;
Fig. 6b shows a block circuit diagram of a known decoder
for decoding the values generated by the coder
of Fig. 6a;
Fig. 7 is an illustration of the lifting in windowing
according to Fig. 3;
Fig. 8 is a "resorted" illustration of the lifting of
Fig. 7 for windowing prior to the actual
transform;
Fig. 9 shows an application of the spectral shaping for
windowing according to Figs. 3, 7 and 8;

Fig. 10 shows a block circuit diagram of a device for
converting according to a preferred embodiment of
the present invention;
Fig. 11 shows a device for inverse converting according
to a preferred embodiment of the present
invention;
Fig. 12 is an illustration of the transformation of two
subsequent blocks of values, as it is useable for
the present invention;
Fig. 13 is a detailed illustration of a multidimensional
lifting step with a forward transform matrix;
Fig. 14 is an illustration of a multidimensional inverse
lifting step with a backward transform matrix;
Fig. 15 is an illustration of the present invention for
the decomposition of a DCT-IV of the length N
into two DCT-IVs of the length N/2;
Fig. 16 shows an application of the inventive concept
within the transform with multidimensional
lifting of Fig. 10;
Fig. 17 is an illustration of two successive lifting
steps for the inventive rounding error reduction;
Fig. 18 is an illustration of the inventive concept for
reducing the rounding error in two successive
lifting steps of Fig. 17; and
Fig. 19 shows a preferred combination of the concept of
Fig. 18 with the concept of Fig. 16.
Fig. 1 shows a device for processing a signal having a
sequence of discrete values which is input to means 202 for

manipulating via a signal input 200. The signal is
typically formed to have a first frequency range in which
the signal has a high energy and to have a second frequency
range in which the signal has a comparatively low energy.
If the signal is an audio signal, it will have the
high energy in the first frequency range, i.e. in the low
frequency range, and will have the low energy in the high
frequency range. If, however, the signal is a video signal,
it will also have the high energy in the low range, and
will have the low energy in the high range. In contrast to
the audio signal, the frequency range in the video signal
is a spatial frequency range, unless successive video
frames are considered in which there also exists a temporal
frequency, for example related to a selected image area, in
successive frames.
The means 202 for manipulating is generally formed to
manipulate the sequence of discrete values so that a
sequence of manipulated values is obtained in which at
least one manipulated value is not integer. This sequence
of non-integer discrete values is fed to means 204 for
rounding the sequence of manipulated values to obtain a
sequence of rounded manipulated values. The means 204 for
rounding is formed to effect a spectral shaping of a
rounding error generated by the rounding so that, in the
first frequency range, i.e. in the frequency range where
the original signal has a high energy, a spectrally shaped
rounding error also has a high energy, and that, in the
second frequency range, i.e. in the frequency range where
the original signal has a low energy, the spectrally shaped
rounding error also has a low or no energy. Generally, the
energy of the spectrally shaped rounding error in the first
frequency range is thus higher than the energy of the
spectrally shaped rounding error in the second frequency
range. However, the spectral shaping preferably does not
change anything in the overall energy of the rounding
error.

Preferably, the device for generating the error-containing
sequence of rounded manipulated values is coupled to means
206 for converting to a spectral representation either
directly or via further manipulation or rounding
combinations. Thus, the error-containing sequence of
rounded manipulated values may be fed directly into the
means 206 for converting to a spectral representation to
achieve a direct spectrum of the error-containing sequence
of rounded manipulated values. However, in an embodiment,
the means for manipulating is a lifting step and/or a
lifting matrix, and the means for rounding is formed to
round the non-integer results of a lifting step. In this
case, the means 204 is followed by a further means for
manipulating performing the second lifting step, which, in
turn, is followed by means for rounding, which, in turn, is
followed by a third means for manipulating implementing the
third lifting step, which is then followed by another
rounding so that all three lifting steps are
accomplished. Thus, an error-containing sequence of rounded
manipulated values derived from the original error-
containing sequence of rounded manipulated values at the
output of means 204 is generated, which is then finally
converted to a spectral representation, preferably also by
an integer transform, as it is illustrated by block 206.
The output signal of the spectral representation at the
output of block 206 now has a spectrum which, in contrast
to prior art, no longer has a white distributed rounding
error, but a spectrally shaped rounding error,
i.e. so that there is also a high rounding error energy
where the actual "useful spectrum" has a high signal
energy, while even in the best case there is no rounding
error energy in the frequency ranges in which there is no
signal energy.
This spectrum is then supplied to means 208 for entropy-
coding of the spectral representation. The means for
entropy-coding can comprise any coding method, such as a
Huffman coding, an arithmetic coding, etc. Especially for

coding a large number of spectral lines which are zero and
border on each other, a run length coding is also suitable
which, of course, cannot be applied in prior art, because
here an actually deterministic signal must be coded in such
frequency ranges which, however, has a white spectrum and
thus is especially unfavorable for any kind of coding
tools, because the individual spectral values are
completely uncorrelated to each other.
Subsequently, a preferred embodiment of the means 204 for
rounding with spectral shaping is discussed with respect to
Figs. 2a, 2b, 2c.
Fig. 2a shows a known error feedback system for the
spectral shaping of a quantization error, as it is
described in the specialist book "Digitale
Audiosignalverarbeitung", U. Zoelzer, Teubner-Verlag,
Stuttgart, 1997. An input value X(i) is supplied to an
input summer 210. The output signal of the summer 210 is
supplied to a quantizer 212 providing a quantized output
value y(i) at an output of the spectral shaping device. At
a second summer 214, the difference between the value after
the quantizer 212 and the value before the quantizer 212 is
determined, i.e. the rounding error e(i). The output signal
of the second summer 214 is fed to a delay means 216. The
error e(i) delayed by one time unit is then subtracted from
the input value by means of the adder 210. This results in
a high-pass evaluation of the original error signal e(n).
If z^-1 (-2 + z^-1) is used instead of the delay means z^-1
designated 216 in Fig. 2a, the result is a second-order
high-pass evaluation. In certain embodiments, such spectral
shapings of the quantization error are used to "mask out"
the quantization error from the perceptible range, i.e. for
example from the low-pass range of the signal x(n), so that
the quantization error is not perceived.
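The loop of Fig. 2a may be sketched in a few lines of Python (for illustration only; the quantizer step size and the test signal are assumptions):

```python
import numpy as np

def quantize_with_highpass_shaping(x, step=1.0):
    """Error feedback of Fig. 2a: subtract the quantization error of
    the previous sample (delay 216) at the input summer 210 before
    quantizing (212), so the output error is e(i) - e(i-1), i.e.
    first-order high-pass shaped."""
    y = np.empty(len(x))
    e_prev = 0.0
    for i, v in enumerate(x):
        u = v - e_prev                  # input summer 210
        q = step * np.round(u / step)   # quantizer 212
        e_prev = q - u                  # error summer 214, then delay
        y[i] = q
    return y

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 100.0, 4096)
err = quantize_with_highpass_shaping(x) - x
spec = np.abs(np.fft.rfft(err)) ** 2
half = len(spec) // 2
print(spec[half:].sum() > spec[:half].sum())  # True: error pushed upward
```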

As shown in Fig. 2b, a low-pass evaluation is performed
instead to achieve a spectral shaping of the error not
outside the range of perception, but exactly into the range
of perception. For this, the output signal of the adder
210, as shown in Fig. 2b, is fed to a rounding block 218
implementing some rounding function which may be, for
example, rounding up, rounding down, rounding by
truncating, rounding up/rounding down to the next integer
or to the next but one, next but two ... integer. In the
error feedback path, i.e. between the adder 214 and the
adder 210, there is now a further feedback block 220 with
an impulse response h(n) and/or a transfer function H

Documents

Application Documents

# Name Date
1 abstract-00606-kolnp-2006.jpg 2011-10-06
2 606-kolnp-2006-translated copy of priority document.pdf 2011-10-06
3 606-kolnp-2006-reply to examination report.pdf 2011-10-06
4 606-kolnp-2006-petetion under rule 137.pdf 2011-10-06
5 606-kolnp-2006-gpa.pdf 2011-10-06
6 606-kolnp-2006-form 5.pdf 2011-10-06
7 606-kolnp-2006-form 3.pdf 2011-10-06
8 606-kolnp-2006-form 18.pdf 2011-10-06
9 606-kolnp-2006-examination report.pdf 2011-10-06
10 606-kolnp-2006-correspondence1.1.pdf 2011-10-06
11 606-KOLNP-2006-AMANDED CLAIMS.pdf 2011-10-06
12 00606-kolnp-2006-others.pdf 2011-10-06
13 00606-kolnp-2006-international search report.pdf 2011-10-06
14 00606-kolnp-2006-international publication.pdf 2011-10-06
15 00606-kolnp-2006-form 5.pdf 2011-10-06
16 00606-kolnp-2006-form 3.pdf 2011-10-06
17 00606-kolnp-2006-form 2.pdf 2011-10-06
18 00606-kolnp-2006-form 1.pdf 2011-10-06
19 00606-kolnp-2006-drawings.pdf 2011-10-06
20 00606-kolnp-2006-description complete.pdf 2011-10-06
21 00606-kolnp-2006-claims.pdf 2011-10-06
22 00606-kolnp-2006-abstract.pdf 2011-10-06
23 606-KOLNP-2006-(20-03-2012)-FORM-13.pdf 2012-03-20
24 606-KOLNP-2006-(20-03-2012)-FORM-1.pdf 2012-03-20
25 606-KOLNP-2006-(20-03-2012)-CORRESPONDENCE.pdf 2012-03-20
26 606-KOLNP-2006-(22-04-2013)-CORRESPONDENCE.pdf 2013-04-22
27 606-KOLNP-2006-REPLY TO EXAMINATION REPORT-1.1.pdf 2013-05-09
28 606-KOLNP-2006-PETITION UNDER RULE 137-1.1.pdf 2013-05-09
29 606-KOLNP-2006-OTHERS.pdf 2013-05-09
30 606-KOLNP-2006-INTERNATIONAL SEARCH REPORT & OTHERS.pdf 2013-05-09
31 606-KOLNP-2006-INTERNATIONAL PUBLICATION.pdf 2013-05-09
32 606-KOLNP-2006-GRANTED-SPECIFICATION-COMPLETE-1.1.pdf 2013-05-09
33 606-KOLNP-2006-GRANTED-FORM 5-1.1.pdf 2013-05-09
34 606-KOLNP-2006-GRANTED-FORM 3-1.1.pdf 2013-05-09
35 606-KOLNP-2006-GRANTED-FORM 2-1.1.pdf 2013-05-09
36 606-KOLNP-2006-GRANTED-FORM 1-1.1.pdf 2013-05-09
37 606-KOLNP-2006-GRANTED-DRAWINGS-1.1.pdf 2013-05-09
38 606-KOLNP-2006-GRANTED-DESCRIPTION (COMPLETE)-1.1.pdf 2013-05-09
39 606-KOLNP-2006-GRANTED-CLAIMS-1.1.pdf 2013-05-09
40 606-KOLNP-2006-GRANTED-ABSTRACT-1.1.pdf 2013-05-09
41 606-KOLNP-2006-FORM 26.pdf 2013-05-09
42 606-KOLNP-2006-FORM 18-1.1.pdf 2013-05-09
43 606-KOLNP-2006-EXAMINATION REPORT-1.1.pdf 2013-05-09
44 606-KOLNP-2006-CORRESPONDENCE.pdf 2013-05-09
45 606-KOLNP-2006-(20-09-2013)-PETITION UNDER RULE 137.pdf 2013-09-20
46 606-KOLNP-2006-(20-09-2013)-FORM-5.pdf 2013-09-20
47 606-KOLNP-2006-(20-09-2013)-FORM-3.pdf 2013-09-20
48 606-KOLNP-2006-(20-09-2013)-FORM-2.pdf 2013-09-20
49 606-KOLNP-2006-(20-09-2013)-FORM-1.pdf 2013-09-20
50 606-KOLNP-2006-(20-09-2013)-DRAWINGS.pdf 2013-09-20
51 606-KOLNP-2006-(20-09-2013)-CORRESPONDENCE.pdf 2013-09-20
52 606-KOLNP-2006-(01-03-2016)-FORM-27.pdf 2016-03-01
53 Form 27 [21-03-2017(online)].pdf 2017-03-21
54 606-KOLNP-2006-RELEVANT DOCUMENTS [18-01-2018(online)].pdf 2018-01-18
55 606-KOLNP-2006-RELEVANT DOCUMENTS [06-02-2019(online)].pdf 2019-02-06
56 606-KOLNP-2006-RELEVANT DOCUMENTS [29-02-2020(online)].pdf 2020-02-29
57 606-KOLNP-2006-RELEVANT DOCUMENTS [26-09-2021(online)].pdf 2021-09-26
58 606-KOLNP-2006-RELEVANT DOCUMENTS [05-09-2022(online)].pdf 2022-09-05
59 606-KOLNP-2006-01-02-2023-LETTER OF PATENT.pdf 2023-02-01
60 606-KOLNP-2006-RELEVANT DOCUMENTS [05-09-2023(online)].pdf 2023-09-05
61 606-KOLNP-2006-FORM-27 [04-09-2025(online)].pdf 2025-09-04
62 606-KOLNP-2006-FORM-27 [04-09-2025(online)]-1.pdf 2025-09-04

ERegister / Renewals

3rd (28/09/2006 to 28/09/2007): 15 Jan 2014
4th (28/09/2007 to 28/09/2008): 15 Jan 2014
5th (28/09/2008 to 28/09/2009): 15 Jan 2014
6th (28/09/2009 to 28/09/2010): 15 Jan 2014
7th (28/09/2010 to 28/09/2011): 15 Jan 2014
8th (28/09/2011 to 28/09/2012): 15 Jan 2014
9th (28/09/2012 to 28/09/2013): 15 Jan 2014
10th (28/09/2013 to 28/09/2014): 15 Jan 2014
11th (28/09/2014 to 28/09/2015): 15 Jan 2014
12th (28/09/2015 to 28/09/2016): 25 Sep 2015
13th (28/09/2016 to 28/09/2017): 27 Sep 2016
14th (28/09/2017 to 28/09/2018): 20 Sep 2017
15th (28/09/2018 to 28/09/2019): 20 Sep 2018
16th (28/09/2019 to 28/09/2020): 26 Sep 2019
17th (28/09/2020 to 28/09/2021): 16 Sep 2020
18th (28/09/2021 to 28/09/2022): 21 Sep 2021
19th (28/09/2022 to 28/09/2023): 17 Sep 2022
20th (28/09/2023 to 28/09/2024): 18 Sep 2023