Audio Signal Encoder, Audio Signal Decoder, Method For Encoding Or Decoding An Audio Signal Using An Aliasing Cancellation

Abstract: An audio signal decoder (200) for providing a decoded representation (212) of an audio content on the basis of an encoded representation (310) of the audio content comprises a transform domain path (230, 240, 242, 250, 260) configured to obtain a time-domain representation (212) of a portion of the audio content encoded in a transform-domain mode on the basis of a first set (220) of spectral coefficients, a representation (224) of an aliasing-cancellation stimulus signal and a plurality of linear-prediction-domain parameters (222). The transform domain path comprises a spectrum processor (230) configured to apply a spectrum shaping to the first set of spectral coefficients in dependence on at least a subset of the linear-prediction-domain parameters, to obtain a spectrally-shaped version (232) of the first set of spectral coefficients. The transform domain path comprises a first frequency-domain-to-time-domain converter (240) configured to obtain a time-domain representation of the audio content on the basis of the spectrally-shaped version of the first set of spectral coefficients. The transform domain path comprises an aliasing-cancellation stimulus filter configured to filter (250) the aliasing-cancellation stimulus signal (324) in dependence on at least a subset of the linear-prediction-domain parameters (222), to derive an aliasing-cancellation synthesis signal (252) from the aliasing-cancellation stimulus signal. The transform domain path also comprises a combiner (260) configured to combine the time-domain representation (242) of the audio content with the aliasing-cancellation synthesis signal (252), or a post-processed version thereof, to obtain an aliasing reduced time-domain signal.

Patent Information

Application #

Filing Date

19 April 2012

Publication Number

06/2013

Publication Type

INA

Invention Field

ELECTRONICS

Status

Parent Application

Patent Number

Legal Status

Grant Date

2021-06-17

Renewal Date

Applicants

FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.

HANSASTR. 27C, 80686 MÜNCHEN GERMANY

VOICEAGE CORPORATION

750 LUCERNE ROAD, SUITE 250 MONTREAL, QUÉBEC H3R 2H6, CANADA

KONINKLIJKE PHILIPS ELECTRONICS N.V

GROENEWOUDSEWEG 1 NL-5621 BA EINDHOVEN, NETHERLAND

DOLBY INTERNATIONAL AB

APOLLO BUILDING, 3E HERIKERBERGWEG 1-35 NL-1101 CN AMSTERDAM ZUID-OOST, NETHERLAND

Inventors

1. BESSETTE, BRUNO

1600 RUE MURILLO J1N 4G5 SHERBROOKE, QUEBEC CA

2. NEUENDORF, MAX

PARADIESSTRAßE 20 90459 NÜRNBERG DE

3. GEIGER, RALF

JAKOB-HERZ-WEG 36 91052 ERLANGEN DE

4. GOURNAY, PHILIPPE

3012 RUE DU SAUVIGNON J1L 0A2 SHERBROOKE, QUEBEC CA

5. LEFEBVRE, ROCH

259, RUE DE LA BOURGADE QUÉBEC J1X 0L6 MAGOG, CA

6. GRILL, BERNHARD

PETER-HENLEIN-STR. 7 91207 LAUF DE

7. LECOMTE, JÉRÉMIE

TURNSTRASSE 7 90763 FÜRTH DE

8. BAYER, STEFAN

DORTMUNDER STRASSE 14 90425 NÜRNBERG DE

9. RETTELBACH, NIKOLAUS

SPESSARTSTR. 38 90427 NÜRNBERG DE

10. VILLEMOES, LARS

MANDOLINVAEGEN 22 175 56 JAERFAELLA SE

11. SALAMI, REDWAN

4045 ALBERT-DREUX, PLACE H4R 2Y3 SAINT-LAURENT, QUEBEC CA

12. DEN BRINKER, ALBERTUS C.

BOERHAAVELAAN 52 5644 EINDHOVEN NL

Specification

AUDIO SIGNAL ENCODER, AUDIO SIGNAL DECODER, METHOD FOR
ENCODING OR DECODING AN AUDIO SIGNAL USING AN ALIASING-
CANCELLATION
Technical Field
Embodiments according to the invention create an audio signal decoder for providing a
decoded representation of an audio content on the basis of an encoded representation of the
audio content.
Embodiments according to the invention create an audio signal encoder for providing an
encoded representation of an audio content comprising a first set of spectral coefficients, a
representation of an aliasing-cancellation stimulus signal and a plurality of linear-
prediction-domain parameters on the basis of an input representation of the audio content.
Embodiments according to the invention create a method for providing a decoded
representation of an audio content on the basis of an encoded representation of the audio
content.
Embodiments according to the invention create a method for providing an encoded
representation of an audio content on the basis of an input representation of the audio
content.
Embodiments according to the invention create a computer program for performing one of
said methods.
Embodiments according to the invention create a concept for a unification of unified-
speech-and-audio-coding (also designated briefly as USAC) windowing and frame
transitions.
Background of the Invention
In the following some background of the invention will be explained in order to facilitate
the understanding of the invention and advantages thereof.

During the past decade, considerable effort has been put into creating the possibility to digitally
store and distribute audio content. One important achievement along this way is the definition
of the International Standard ISO/IEC 14496-3. Part 3 of this Standard is related to a
coding and decoding of audio contents, and sub-part 4 of part 3 is related to general audio
coding. ISO/IEC 14496, part 3, sub-part 4 defines a concept for encoding and decoding of
general audio content. In addition, further improvements have been proposed in order to
improve the quality and/or reduce the required bitrate. Moreover, it has been found that the
performance of frequency-domain based audio coders is not optimal for audio contents
comprising speech. Recently, a unified speech-and-audio codec has been proposed which
efficiently combines techniques from both worlds, namely speech coding and audio coding.
For some details, reference is made to the publication "A Novel Scheme for Low Bitrate
Unified Speech and Audio Coding - MPEG RM0" of M. Neuendorf et al. (presented at
the 126th Convention of the Audio Engineering Society, May 7-10, 2009, Munich,
Germany).
In such an audio coder, some audio frames are encoded in the frequency-domain and some
audio frames are encoded in the linear-prediction-domain.
However, it has been found that it is difficult to transition between frames encoded in
different domains without sacrificing a significant amount of bitrate.
In view of this situation, there is a desire to create a concept for encoding and decoding an
audio content comprising both speech and general audio, which allows for efficient
realization of transitions between portions encoded using different modes.
Summary of the Invention
Embodiments according to the invention create an audio signal decoder for providing a
decoded representation of an audio content on the basis of an encoded representation of an
audio content. The audio signal decoder comprises a transform domain path (for example,
a transform-coded excitation linear-prediction-domain-path) configured to obtain a time
domain representation of the audio content encoded in a transform domain mode on the
basis of a first set of spectral coefficients, a representation of an aliasing-cancellation
stimulus signal, and a plurality of linear-prediction-domain parameters (for example,
linear-prediction-coding filter coefficients). The transform domain path comprises a
spectrum processor configured to apply a spectral shaping to the (first) set of spectral
coefficients in dependence on at least a subset of linear-prediction-domain parameters to
obtain a spectrally-shaped version of the first set of spectral coefficients. The transform
domain path also comprises a (first) frequency-domain-to-time-domain-converter
configured to obtain a time-domain representation of the audio content on the basis of the
spectrally-shaped version of the first set of spectral coefficients. The transform domain
path also comprises an aliasing-cancellation-stimulus filter configured to filter the aliasing-
cancellation stimulus signal in dependence on at least a subset of the linear-prediction-
domain parameters, to derive an aliasing-cancellation synthesis signal from the aliasing-
cancellation stimulus signal. The transform domain path also comprises a combiner
configured to combine the time-domain representation of the audio content with the
aliasing-cancellation synthesis signal, or a post-processed version thereof, to obtain an
aliasing-reduced time-domain signal.
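The spectrum processor described above can be illustrated with a minimal sketch. It assumes that the linear-prediction-domain parameters are available as coefficients a_1..a_p of a filter A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and that the shaping gain for each spectral coefficient is the magnitude of 1/A(z) sampled at the centre frequency of that coefficient. Both the gain rule and the names `lpc_shaping_gains` and `shape_spectrum` are assumptions made for illustration only; the text above does not specify how the gains are derived.

```python
import cmath
import math


def lpc_shaping_gains(lpc, num_bins):
    """Return one gain per spectral bin, derived from LPC coefficients.

    Assumption: A(z) = 1 + sum_i lpc[i-1] * z^-i, and the gain of bin k is
    1/|A(e^{jw_k})| with w_k = pi * (k + 0.5) / num_bins.
    """
    gains = []
    for k in range(num_bins):
        omega = math.pi * (k + 0.5) / num_bins
        a_of_omega = 1.0 + sum(
            a * cmath.exp(-1j * omega * (i + 1)) for i, a in enumerate(lpc)
        )
        gains.append(1.0 / abs(a_of_omega))
    return gains


def shape_spectrum(coefficients, lpc):
    """Apply the LPC-derived gains to a set of spectral coefficients."""
    gains = lpc_shaping_gains(lpc, len(coefficients))
    return [c * g for c, g in zip(coefficients, gains)]
```

With an empty coefficient list the filter is A(z) = 1 and the spectrum passes through unchanged. With a single coefficient of -0.9 the envelope emphasizes low frequencies, so the gains decrease from the first bin to the last.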
This embodiment of the invention is based on the finding that an audio decoder which
performs a spectral shaping of the spectral coefficients of the first set of spectral
coefficients in the frequency-domain, and which computes an aliasing-cancellation
synthesis signal by time-domain filtering an aliasing-cancellation stimulus signal, wherein
both the spectral shaping of the spectral coefficients and the time-domain filtering of the
aliasing-cancellation-stimulus signal are performed in dependence on linear-prediction-
domain parameters, is well-suited for transitions from and to portions (for example,
frames) of the audio signal encoded with different noise shaping and also for transitions
from or to frames which are encoded in different domains. Accordingly, transitions (for
example, between overlapping or non-overlapping frames) of the audio signal, which are
encoded in different modes of a multi-mode audio signal coding, can be rendered by the
audio signal decoder with good auditory quality and at a moderate level of overhead.
For example, performing the spectral shaping of the first set of coefficients in the
frequency-domain allows having the transitions between portions (for example, frames) of
the audio content encoded using different noise shaping concepts in the transform domain,
wherein an aliasing-cancellation can be obtained with good efficiency between the
different portions of the audio content encoded using different noise shaping methods (for
example, scale-factor-based noise shaping and linear-prediction-domain-parameter-based
noise-shaping). Moreover, the above-described concept also allows for an efficient
reduction of aliasing artifacts between portions (for example, frames) of the audio content
encoded in different domains (for example, one in the transform domain and one in the
algebraic-code-excited-linear-prediction-domain). The usage of a time-domain filtering of
the aliasing-cancellation stimulus signal allows for an aliasing-cancellation at the transition
from and to a portion of the audio content encoded in the algebraic-code-excited-linear-
prediction mode even if the noise shaping of the current portion of the audio content
(which may be encoded, for example, in a transform-coded-excitation linear prediction-
domain mode) is performed in the frequency-domain, rather than by a time-domain
filtering.
To summarize the above, embodiments according to the present invention allow for a good
tradeoff between a required side information and a perceptual quality of transitions
between portions of the audio content encoded in three different modes (for example,
frequency-domain mode, transform-coded-excitation linear-prediction-domain mode, and
algebraic-code-excited-linear-prediction mode).
In a preferred embodiment, the audio signal decoder is a multi-mode audio signal decoder
configured to switch between a plurality of coding modes. In this case, the transform
domain branch is configured to selectively obtain the aliasing cancellation synthesis signal
for a portion of the audio content following a previous portion of the audio content which
does not allow for an aliasing-cancelling overlap-and-add operation or followed by a
subsequent portion of the audio content which does not allow for an aliasing-cancelling
overlap-and-add operation. It has been found that the application of a noise shaping, which
is performed by the spectral shaping of the spectral coefficients of the first set of spectral
coefficients, allows for a transition between portions of the audio content encoded in the
transform domain and using different noise shaping concepts (for example, a scale-factor-
based noise shaping concept and a linear-prediction-domain-parameter-based noise
shaping concept) without using the aliasing-cancellation signals, because the usage of the
first frequency-domain-to-time-domain converter after the spectral shaping allows for an
efficient aliasing-cancellation between subsequent frames encoded in the transform
domain, even if different noise-shaping approaches are used in the subsequent audio
frames. Thus, bitrate efficiency can be obtained by selectively obtaining the aliasing-
cancellation synthesis signal only for transitions from or to a portion of the audio content
encoded in a non-transform domain (for example, in an algebraic code-excited-linear-
prediction-mode).
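The selective behaviour described above can be sketched as a small decision function. The sketch assumes three modes per portion and reports, for each portion, whether an aliasing-cancellation synthesis signal is needed at its left and right boundary. The names `Mode`, `needs_aliasing_cancellation_signal` and `plan_transitions` are hypothetical, and direct transitions between the frequency-domain mode and the ACELP mode are not modelled here.

```python
from enum import Enum


class Mode(Enum):
    FD = "frequency-domain"
    TCX_LPD = "transform-coded-excitation linear-prediction-domain"
    ACELP = "algebraic-code-excited linear prediction"


def needs_aliasing_cancellation_signal(previous, current, following):
    """Return (left, right) flags for the current portion.

    Assumption: only a TCX-LPD portion adjacent to an ACELP portion needs the
    synthesis signal; transform-coded neighbours rely on overlap-and-add.
    """
    if current is not Mode.TCX_LPD:
        return (False, False)
    return (previous is Mode.ACELP, following is Mode.ACELP)


def plan_transitions(modes):
    """Evaluate the decision for every portion of a mode sequence."""
    plan = []
    for i, mode in enumerate(modes):
        previous = modes[i - 1] if i > 0 else None
        following = modes[i + 1] if i + 1 < len(modes) else None
        plan.append(needs_aliasing_cancellation_signal(previous, mode, following))
    return plan
```

For a sequence such as FD, TCX-LPD, ACELP, TCX-LPD, TCX-LPD, FD, only the two TCX-LPD portions next to the ACELP portion are flagged, which is where the additional bitrate would be spent.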
In a preferred embodiment, the audio signal decoder is configured to switch between a
transform-coded-excitation-linear-prediction-domain mode, which uses a transform-coded-
excitation information and a linear-prediction-domain parameter information, and a
frequency-domain mode, which uses a spectral coefficient information and a scale factor
information. In this case, the transform-domain-path is configured to obtain the first set of
spectral coefficients on the basis of the transform-coded-excitation information and to
obtain the linear-prediction-domain parameters on the basis of the linear-prediction-
domain-parameter information. The audio signal decoder comprises a frequency domain
path configured to obtain a time-domain representation of the audio content encoded in the
frequency-domain mode on the basis of a frequency-domain mode set of spectral
coefficients described by the spectral coefficient information and in dependence on a set of
scale factors described by the scale factor information. The frequency-domain path
comprises a spectrum processor configured to apply a spectral shaping to the frequency-
domain mode set of spectral coefficients, or to a pre-processed version thereof, in
dependence on the scale factors to obtain a spectrally-shaped frequency-domain mode set
of spectral coefficients. The frequency-domain path also comprises a frequency-domain-to-
time-domain converter configured to obtain a time-domain representation of the audio
content on the basis of the spectrally-shaped frequency-domain-mode set of spectral
coefficients. The audio signal decoder is configured such that time-domain representations
of two subsequent portions of the audio content, one of which two subsequent portions of
the audio content is encoded in the transform-coded-excitation linear-prediction-domain
mode, and one of which two subsequent portions of the audio content is encoded in the
frequency-domain mode, comprise a temporal overlap to cancel a time-domain aliasing
caused by the frequency-domain-to-time-domain conversion.
As already discussed, the concept according to the embodiments of the invention is well-
suited for transitions between portions of the audio content encoded in the transform-
coded-excitation-linear-prediction-domain mode and in the frequency-domain mode. A
very good quality aliasing-cancellation is obtained due to the fact that the spectral shaping
is performed in the frequency-domain in the transform-coded-excitation-linear-prediction-
domain mode.
In a preferred embodiment, the audio signal decoder is configured to switch between a
transform-coded-excitation-linear-prediction-domain-mode which uses a transform-coded-
excitation information and a linear-prediction-domain parameter information, and an
algebraic-code-excited-linear-prediction mode, which uses an algebraic-code-excitation-
information and a linear-prediction-domain-parameter information. In this case, the
transform-domain path is configured to obtain the first set of spectral coefficients on the
basis of the transform-coded-excitation information and to obtain the linear-prediction-
domain parameters on the basis of the linear-prediction-domain-parameter information.
The audio signal decoder comprises an algebraic-code-excited-linear-prediction path
configured to obtain a time-domain representation of the audio content encoded in the
algebraic-code-excited-linear-prediction (also designated briefly with ACELP in the
following) mode, on the basis of the algebraic-code-excitation information and the linear-
prediction-domain parameter information. In this case, the ACELP path comprises an
ACELP excitation processor configured to provide a time-domain excitation signal on the
basis of the algebraic-code-excitation information and a synthesis filter configured to
perform a time-domain filtering, to provide a reconstructed signal on the basis of the time-
domain excitation signal and in dependence on linear-prediction-domain filter coefficients
obtained on the basis of the linear-prediction-domain parameter information. The
transform domain path is configured to selectively provide the aliasing-cancellation
synthesis signal for a portion of the audio content encoded in the transform-coded-
excitation linear-prediction-domain mode following a portion of the audio content encoded
in the ACELP mode and for a portion of the content encoded in the transform-coded-
excitation-linear-prediction-domain mode preceding a portion of the audio content encoded
in the ACELP mode. It has been found that the aliasing-cancellation synthesis signal is
very well-suited for transitions between portions (for example, frames) encoded in the
transform-coded-excitation-linear-prediction-domain (in the following also briefly
designated as TCX-LPD) mode and the ACELP mode.
In a preferred embodiment, the aliasing-cancellation stimulus filter is configured to filter
the aliasing-cancellation stimulus signals in dependence on linear-prediction-domain filter
parameters which correspond to a left-sided aliasing folding point of the first frequency-
domain-to-time-domain converter for a portion of the audio content encoded in the TCX-
LPD mode following a portion of the audio content encoded in the ACELP mode. The
aliasing-cancellation stimulus filter is configured to filter the aliasing-cancellation stimulus
signal in dependence on linear-prediction-domain filter parameters which correspond to a
right-sided aliasing folding point of the second frequency-domain-to-time-domain
converter for a portion of the audio content encoded in the transform-coded-excitation-
linear-prediction-mode preceding a portion of the audio content encoded in the ACELP
mode. By applying linear-prediction-domain filter parameters, which correspond to the
aliasing folding points, an extremely efficient aliasing-cancellation can be obtained. Also,
the linear-prediction-domain filter parameters, which correspond to the aliasing folding
points, are typically easily obtainable as the aliasing folding points are often at the
transition from one frame to the next, such that the transmission of said linear-prediction-
domain filter parameters is required anyway. Accordingly, overheads are kept to a
minimum.
In a further embodiment, the audio signal decoder is configured to initialize memory
values of the aliasing-cancellation stimulus filter to zero for providing the aliasing-
cancellation synthesis signal, and to feed M samples of the aliasing-cancellation stimulus
signal into the aliasing-cancellation stimulus filter to obtain corresponding non-zero input
response samples of the aliasing-cancellation synthesis signal, and to further obtain a
plurality of zero-input response samples of the aliasing-cancellation synthesis signal. The
combiner is preferably configured to combine the time-domain representation of the audio
content with the non-zero input response samples and the subsequent zero-input response
samples, to obtain an aliasing-reduced time-domain signal at a transition from a portion of
the audio content encoded in the ACELP mode to a portion of the audio content encoded in
the TCX-LPD mode following the portion of the audio content encoded in the ACELP
mode. By exploiting both the non-zero input response samples and the zero-input response
samples, a very good usage can be made of the aliasing-cancellation stimulus filter. Also, a
very smooth aliasing-cancellation synthesis signal can be obtained while keeping a number
of required samples of the aliasing-cancellation stimulus signal as small as possible.
Moreover, it has been found that a shape of the aliasing-cancellation synthesis signal is
very well-adapted to typical aliasing artifacts by using the above-mentioned concept. Thus,
a very good tradeoff between coding efficiency and aliasing-cancellation can be obtained.
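The filtering scheme described above (zeroed filter memory, M stimulus samples, followed by zero-input response samples) can be sketched as follows. The sketch assumes a plain all-pole filter 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p; the actual filter structure, any perceptual weighting, and the name `synthesis_filter_response` are assumptions made for illustration.

```python
def synthesis_filter_response(stimulus, lpc, num_zero_input_samples):
    """Filter `stimulus` through 1/A(z), then let the filter ring out.

    Assumption: A(z) = 1 + sum_i lpc[i-1] * z^-i. The filter memory starts at
    zero. Returns (non-zero-input response, zero-input response).
    """
    memory = [0.0] * len(lpc)  # past output samples, most recent first

    def step(x):
        y = x - sum(a * m for a, m in zip(lpc, memory))
        memory.insert(0, y)
        memory.pop()
        return y

    # Response while the M stimulus samples are fed in.
    driven = [step(x) for x in stimulus]
    # Response after the stimulus has ended (input is zero, memory is not).
    ringing = [step(0.0) for _ in range(num_zero_input_samples)]
    return driven, ringing
```

For example, with `lpc = [-0.5]` and the stimulus `[1.0, 0.0]`, the driven part is `[1.0, 0.5]` and two further zero-input samples are `[0.25, 0.125]`. The ringing part costs no additional stimulus samples, which is the coding-efficiency point made in the text.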
In a preferred embodiment, the audio signal decoder is configured to combine a windowed
and folded version of at least a portion of a time-domain representation obtained using the
ACELP mode with a time-domain representation of a subsequent portion of the audio
content obtained using the TCX-LPD mode, to at least partially cancel an aliasing. It has
been found that the usage of such aliasing-cancellation mechanisms, in addition to the
generation of the aliasing cancellation synthesis signal, provides the possibility of
obtaining an aliasing-cancellation in a very bitrate efficient manner. In particular, the
required aliasing-cancellation stimulus signal can be encoded with high efficiency if the
aliasing-cancellation synthesis signal is supported, in the aliasing-cancellation, by the
windowed and folded version of at least a portion of a time-domain representation obtained
using the ACELP mode.
In a preferred embodiment, the audio signal decoder is configured to combine a windowed
version of a zero impulse response of the synthesis filter of the ACELP branch with a time-
domain representation of a subsequent portion of the audio content obtained using the
TCX-LPD mode, to at least partially cancel an aliasing. It has been found that the usage of
such a zero impulse response may also help to improve the coding efficiency of the
aliasing-cancellation stimulus signal, because the zero impulse response of the synthesis
filter of the ACELP branch typically cancels at least a part of the aliasing in the TCX-LPD-
encoded portion of the audio content. Accordingly, the energy of the aliasing-cancellation
synthesis signal is reduced, which, in turn, results in a reduction of the energy of the
aliasing-cancellation stimulus signal. However, encoding signals with a smaller energy is
typically possible with reduced bitrate requirements.

In a preferred embodiment, the audio signal decoder is configured to switch between a
TCX-LPD mode, in which a lapped frequency-domain-to-time-domain transform is used,
a frequency-domain mode, in which a lapped frequency-domain-to-time-domain transform
is used, as well as an algebraic-code-excited-linear-prediction mode. In this case, the audio
signal decoder is configured to at least partially cancel an aliasing at a transition between a
portion of the audio content encoded in the TCX-LPD mode and a portion of the audio
content encoded in the frequency-domain mode by performing an overlap-and-add
operation between time domain samples of subsequent overlapping portions of the audio
content. Also, the audio signal decoder is configured to at least partially cancel an aliasing
at a transition between a portion of the audio content encoded in the TCX-LPD mode and a
portion of the audio content encoded in the ACELP mode using the aliasing-cancellation
synthesis signal. It has been found that the audio signal decoder also is well-suited for
switching between different modes of operation, wherein the aliasing cancels very
efficiently.
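The overlap-and-add aliasing cancellation between subsequent lapped-transform portions can be illustrated with a small numerical sketch. It assumes an MDCT with a sine window as the lapped transform, which is a common choice but is not stated in the text above; spectral shaping and quantization are omitted, and all function names are hypothetical.

```python
import math


def sine_window(length):
    """Sine window satisfying w[n]^2 + w[n + length/2]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block):
    """Windowed forward MDCT: 2N time samples -> N coefficients."""
    n_coeffs = len(block) // 2
    w = sine_window(len(block))
    return [
        sum(
            w[n] * block[n]
            * math.cos(math.pi / n_coeffs * (n + 0.5 + n_coeffs / 2) * (k + 0.5))
            for n in range(2 * n_coeffs)
        )
        for k in range(n_coeffs)
    ]


def imdct(coeffs):
    """Windowed inverse MDCT: N coefficients -> 2N time-aliased samples."""
    n_coeffs = len(coeffs)
    w = sine_window(2 * n_coeffs)
    return [
        w[n] * (2.0 / n_coeffs) * sum(
            coeffs[k]
            * math.cos(math.pi / n_coeffs * (n + 0.5 + n_coeffs / 2) * (k + 0.5))
            for k in range(n_coeffs)
        )
        for n in range(2 * n_coeffs)
    ]


def overlap_add(signal, hop):
    """Transform 50%-overlapping blocks and sum the inverse transforms."""
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * hop + 1, hop):
        block = imdct(mdct(signal[start:start + 2 * hop]))
        for n, value in enumerate(block):
            out[start + n] += value
    return out
```

Samples covered by two overlapping blocks are reconstructed up to rounding error, because the time-domain aliasing of one block cancels that of its neighbour. Samples covered by only one block (the first and last `hop` samples) keep their aliasing, which corresponds to the situation at a boundary to an ACELP portion, where no overlapping lapped-transform neighbour exists.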
In a preferred embodiment, the audio signal decoder is configured to apply a common gain
value for a gain scaling of a time-domain representation provided by the first frequency-
domain-to-time-domain converter of the transform domain path (for example, TCX-LPD
path) and for a gain scaling of the aliasing-cancellation stimulus signal or the aliasing-
cancellation synthesis signal. It has been found that a reuse of this common gain value both
for the scaling of the time-domain representation provided by the first frequency-domain-
to-time-domain converter and for the scaling of the aliasing-cancellation stimulus signal or
aliasing-cancellation synthesis signal allows for the reduction of bitrate required at a
transition between portions of the audio content encoded in different modes. This is very
important, as a bitrate requirement is increased by the encoding of the aliasing-cancellation
stimulus signal in the environment of a transition between portions of the audio content
encoded in the different modes.
In a preferred embodiment, the audio signal decoder is configured to apply, in addition to
the spectral shaping performed in dependence on at least the subset of linear-prediction-
domain parameters, a spectrum deshaping to at least a subset of the first set of spectral
coefficients. In this case, the audio signal decoder is configured to apply the spectrum de-
shaping to at least a subset of a set of aliasing-cancellation spectral coefficients from which
the aliasing-cancellation stimulus signal is derived. Applying a spectral deshaping both to
the first set of spectral coefficients and to the aliasing-cancellation spectral coefficients
from which the aliasing cancellation stimulus signal is derived ensures that the aliasing
cancellation synthesis signal is well-adapted to the "main" audio content signal provided
by the first frequency-domain-to-time-domain converter. Again, the coding efficiency for
encoding the aliasing cancellation stimulus signal is improved.
In a preferred embodiment, the audio signal decoder comprises a second frequency-
domain-to-time-domain converter configured to obtain a time-domain representation of the
aliasing-cancellation stimulus signal in dependence on a set of spectral coefficients
representing the aliasing-cancellation stimulus signal. In this case, the first frequency-
domain-to-time-domain converter is configured to perform a lapped transform, which
comprises a time-domain aliasing. The second frequency-domain-to-time-domain
converter is configured to perform a non-lapped transform. Accordingly, a high coding
efficiency can be maintained by using the lapped transform for the "main" signal
synthesis. Nevertheless, the aliasing-cancellation is achieved using an additional frequency-
domain-to-time-domain conversion, which is non-lapped. However, it has been found that
the combination of the lapped frequency-domain-to-time-domain conversion and the non-
lapped frequency-domain-to-time-domain conversion allows for a more efficient encoding
of transitions than a single non-lapped frequency-domain-to-time-domain conversion.
An embodiment according to the invention creates an audio signal encoder for providing
an encoded representation of an audio content comprising a first set of spectral
coefficients, a representation of an aliasing-cancellation stimulus signal and a plurality of
linear-prediction-domain parameters on the basis of an input representation of the audio
content. The audio signal encoder comprises a time-domain-to-frequency-domain
converter configured to process the input representation of the audio content, to obtain a
frequency-domain representation of the audio content. The audio signal encoder also
comprises a spectral processor configured to apply a spectral shaping to a set of spectral
coefficients, or to a pre-processed version thereof, in dependence on a set of linear-
prediction-domain parameters for a portion of the audio content to be encoded in the
linear-prediction-domain, to obtain a spectrally-shaped frequency-domain representation of
the audio content. The audio signal encoder also comprises an aliasing-cancellation
information provider configured to provide a representation of an aliasing-cancellation
stimulus signal, such that a filtering of the aliasing-cancellation stimulus signal in
dependence on at least a subset of the linear prediction domain parameters results in an
aliasing-cancellation synthesis signal for cancelling aliasing artifacts in an audio signal
decoder.
The audio signal encoder discussed here is well-suited for cooperation with the audio
signal decoder described before. In particular, the audio signal encoder is configured to
provide a representation of the audio content in which a bitrate overhead required for
cancelling aliasing at transitions between portions (for example, frames or sub-frames) of
the audio content encoded in different modes is kept reasonably small.
Further embodiments according to the invention create a method for providing a decoded
representation of the audio content and a method for providing an encoded representation
of an audio content. Said methods are based on the same ideas as the apparatus discussed
above.
Embodiments according to the invention create computer programs for performing one of
said methods. The computer programs are also based on the same considerations.
Brief Description of the Figures
Embodiments according to the present invention will subsequently be described taking
reference to the enclosed figures, in which:
Fig. 1 shows a block schematic diagram of an audio signal encoder, according to
an embodiment of the invention;
Fig. 2 shows a block schematic diagram of an audio signal decoder, according to
an embodiment of the invention;
Fig. 3a shows a block schematic diagram of a reference audio signal decoder
according to working draft 4 of the Unified Speech and Audio Coding
(USAC) draft standard;
Fig. 3b shows a block schematic diagram of an audio signal decoder, according to
another embodiment of the invention;
Fig. 4 shows a graphical representation of a reference window transition according
to working draft 4 of the USAC draft standard;
Fig. 5 shows a schematic representation of window transitions which can be used
in an audio signal coding, according to an embodiment of the invention;
Fig. 6 shows a schematic representation providing an overview over all window
types used in an audio signal encoder according to an embodiment of the
invention or an audio signal decoder according to an embodiment of the
invention;
Fig. 7 shows a table representation of allowed window sequences, which may be
used in an audio signal encoder according to an embodiment of the
invention, or an audio signal decoder according to an embodiment of the
invention;
Fig. 8 shows a detailed block schematic diagram of an audio signal encoder,
according to an embodiment of the invention;
Fig. 9 shows a detailed block schematic diagram of an audio signal decoder
according to an embodiment of the invention;
Fig. 10 shows a schematic representation of forward-aliasing-cancellation (FAC)
decoding operations for transitions from and to ACELP;
Fig. 11 shows a schematic representation of a computation of an FAC target at an
encoder;
Fig. 12 shows a schematic representation of a quantization of an FAC target in the
context of a frequency-domain-noise-shaping (FDNS);
Table 1 shows conditions for the presence of a given LPC filter in a bitstream;
Fig. 13 shows a schematic representation of a principle of a weighted algebraic LPC
inverse quantizer;
Table 2 shows a representation of possible absolute and relative quantization modes
and corresponding bitstream signaling of "modelpc";
Table 3 shows a table representation of coding modes for codebook numbers nk;
Table 4 shows a table representation of a normalization vector W for AVQ
quantization;
Table 5 shows a table representation of mapping for a mean excitation energy E;
Table 6 shows a table representation of a number of spectral coefficients as a
function of "mod[]";
Fig. 14 shows a representation of a syntax of a frequency-domain channel stream
"fd_channel_stream()";
Fig. 15 shows a representation of a syntax of a linear-prediction-domain channel
stream "lpd_channel_stream()"; and
Fig. 16 shows a representation of a syntax of the forward aliasing-cancellation data
"fac_data()".
Detailed Description of the Embodiments
1. Audio Signal Encoder according to Fig. 1
Fig. 1 shows a block schematic diagram of an audio signal encoder 100, according to an
embodiment of the invention. The audio signal encoder 100 is configured to receive an
input representation 110 of an audio content and to provide, on the basis thereof, an
encoded representation 112 of the audio content. The encoded representation 112 of the
audio content comprises a first set 112a of spectral coefficients, a plurality of linear-
prediction-domain parameters 112b and a representation 112c of an aliasing-cancellation
stimulus signal.
The audio signal encoder 100 comprises a time-domain-to-frequency-domain converter
120 which is configured to process the input representation 110 of the audio content (or,
equivalently, a pre-processed version 110' thereof), to obtain a frequency-domain
representation 122 of the audio content (which may take the form of a set of spectral
coefficients).
The audio signal encoder 100 also comprises a spectral processor 130 which is configured
to apply a spectral shaping to the frequency-domain representation 122 of the audio
content, or to a pre-processed version 122' thereof, in dependence on a set 140 of linear-
prediction-domain parameters for a portion of the audio content to be encoded in the
linear-prediction-domain, to obtain a spectrally-shaped frequency-domain representation
132 of the audio content. The first set 112a of spectral coefficients may be equal to the
spectrally-shaped frequency-domain representation 132 of the audio content, or may be
derived from the spectrally-shaped frequency-domain representation 132 of the audio
content.
The audio signal encoder 100 also comprises an aliasing-cancellation information provider
150, which is configured to provide a representation 112c of an aliasing-cancellation
stimulus signal, such that a filtering of the aliasing-cancellation stimulus signal in
dependence on at least a subset of the linear-prediction-domain parameters 140 results in
an aliasing-cancellation synthesis signal for cancelling aliasing artifacts in an audio signal
decoder.
It should also be noted that the linear-prediction-domain parameters 112b may, for
example, be equal to the linear-prediction-domain parameters 140.
The audio signal encoder 100 provides information which is well-suited for a
reconstruction of the audio content, even if different portions (for example, frames or sub-
frames) of the audio content are encoded in different modes. For a portion of the audio
content encoded in the linear-prediction-domain, for example, in a transform-coded-
excitation linear-prediction-domain mode, the spectral shaping, which brings along a noise
shaping and therefore allows a quantization of the audio content with a comparatively
small bitrate, is performed after the time-domain-to-frequency-domain conversion. This
allows for an aliasing cancelling overlap-and-add of a portion of the audio content encoded
in the linear-prediction-domain with a preceding or subsequent portion of the audio content
encoded in a frequency-domain mode. By using the linear-prediction-domain parameters
140 for the spectral shaping, the spectral shaping is well-adapted to speech-like audio
contents, such that a particularly good coding efficiency can be obtained for speech-like
audio contents. Moreover, the representation of the aliasing-cancellation stimulus signal
allows for an efficient aliasing-cancellation at transitions from or towards a portion (for
example, frame or sub-frame) of the audio content encoded in the algebraic-code-excited-
linear-prediction mode. By providing the representation of the aliasing-cancellation
stimulus signal in dependence on the linear prediction domain parameters, a particularly
efficient representation of the aliasing-cancellation stimulus signal is obtained, which can
be decoded at the side of the decoder taking into consideration the linear-prediction-
domain parameters, which are known at the decoder anyway.
To summarize, the audio signal encoder 100 is well-suited for enabling transitions between
portions of the audio content encoded in different coding modes and is capable of
providing an aliasing-cancellation information in a particularly compact form.

2. Audio Signal Decoder according to Fig. 2
Fig. 2 shows a block schematic diagram of an audio signal decoder 200 according to an
embodiment of the invention. The audio signal decoder 200 is configured to receive an
encoded representation 210 of the audio content and to provide, on the basis thereof, the
decoded representation 212 of the audio content, for example, in the form of an aliasing-
reduced-time-domain signal.
The audio signal decoder 200 comprises a transform domain path (for example, a
transform-coded-excitation linear-prediction-domain path) configured to obtain a time-
domain representation 212 of the audio content encoded in a transform domain mode on
the basis of a (first) set 220 of spectral coefficients, a representation 224 of an aliasing-
cancellation stimulus signal and a plurality of linear-prediction-domain parameters 222.
The transform domain path comprises a spectrum processor 230 configured to apply a
spectral shaping to the (first) set 220 of spectral coefficients in dependence on at least a
subset of the linear-prediction-domain parameters 222, to obtain a spectrally-shaped
version 232 of the first set 220 of spectral coefficients. The transform domain path also
comprises a (first) frequency-domain-to-time-domain converter 240 configured to obtain a
time-domain representation 242 of the audio content on the basis of the spectrally-shaped
version 232 of the (first) set 220 of spectral coefficients. The transform domain path also
comprises an aliasing-cancellation stimulus filter 250, which is configured to filter the
aliasing-cancellation stimulus signal (which is represented by the representation 224) in
dependence on at least a subset of the linear-prediction-domain parameters 222, to derive
an aliasing-cancellation synthesis signal 252 from the aliasing-cancellation stimulus signal.
The transform domain path also comprises a combiner 260 configured to combine the
time-domain representation 242 of the audio content (or, equivalently, a post-processed
version 242' thereof) with the aliasing-cancellation synthesis signal 252 (or, equivalently, a
post-processed version 252' thereof), to obtain the aliasing-reduced time-domain signal
212.
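The signal flow of the transform domain path described above can be illustrated with a
minimal Python sketch. It is not the reference implementation: the function names, the
per-coefficient gain vector "gains", the callable "fd_to_td" standing in for the
converter 240, the direct-form all-pole filter with zero initial state, and the choice to
add the synthesis signal at the start of the block are all assumptions made for
illustration only.

```python
import numpy as np

def lpc_synthesis_filter(stimulus, lpc):
    """All-pole filter 1/A(z) with A(z) = 1 + sum_k lpc[k-1] * z^-k (zero initial state)."""
    out = np.zeros(len(stimulus))
    for n in range(len(stimulus)):
        acc = stimulus[n]
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out[n] = acc
    return out

def transform_domain_path(coeffs, gains, fd_to_td, fac_stimulus, lpc):
    """Hypothetical sketch of blocks 230, 240, 250 and 260 of Fig. 2."""
    # Spectrum processor 230: shape the spectral coefficients (gains assumed LPC-derived).
    shaped = np.asarray(coeffs, dtype=float) * np.asarray(gains, dtype=float)
    # Converter 240: frequency-domain-to-time-domain conversion (supplied by the caller).
    time_signal = np.asarray(fd_to_td(shaped), dtype=float)
    # Stimulus filter 250: derive the aliasing-cancellation synthesis signal.
    fac_synthesis = lpc_synthesis_filter(np.asarray(fac_stimulus, dtype=float), lpc)
    # Combiner 260: add the synthesis signal (position at block start is an assumption).
    out = time_signal.copy()
    out[:len(fac_synthesis)] += fac_synthesis
    return out
```

The point of the sketch is the ordering: the shaping is applied to the spectral
coefficients before the conversion to the time domain, whereas the stimulus signal is
filtered in the time domain and only then combined.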
The audio signal decoder 200 may comprise an optional processing 270 for deriving the
setting of the spectrum processor 230, which performs, for example, a scaling and/or
frequency-domain noise shaping, from at least a subset of the linear-prediction-domain
parameters.
The audio signal decoder 200 also comprises an optional processing 280, which is
configured to derive the setting of the aliasing-cancellation stimulus filter 250, which may,
for example, perform a synthesis filtering for synthesizing the aliasing-cancellation
synthesis signal 252, from at least a subset of the linear-prediction-domain parameters 222.
The audio signal decoder 200 is configured to provide an aliasing-reduced time-domain
signal 212, which is well-suited for a combination both with a time-domain signal
representing an audio content and obtained in a frequency-domain mode of operation, and
with a time-domain signal representing an audio content and encoded in
an ACELP mode of operation. Particularly good overlap-and-add characteristics exist
between portions (for example, frames) of the audio content decoded using a frequency-
domain mode of operation (using a frequency-domain path not shown in Fig. 2) and
portions (for example, a frame or sub-frame) of the audio content decoded using the
transform domain path of Fig. 2, as the noise shaping is performed by the spectrum
processor 230 in the frequency-domain, i.e. before the frequency-domain-to-time-domain
conversion 240. Moreover, particularly good aliasing-cancellations can also be obtained
between a portion (for example, a frame or sub-frame) of the audio content decoded using
the transform domain path of Fig. 2 and a portion (for example, a frame or sub-frame) of
the audio content decoded using an ACELP decoding path due to the fact that the aliasing-
cancellation synthesis signal 252 is provided on the basis of a filtering of an aliasing-
cancellation stimulus signal in dependence on linear-prediction-domain parameters. An
aliasing-cancellation synthesis signal 252, which is obtained in this manner, is typically
well-adapted to the aliasing artifacts which occur at the transition between a portion of the
audio content encoded in the TCX-LPD mode and a portion of the audio content encoded
in the ACELP mode. Further optional details regarding the operation of the audio signal
decoder will be described in the following.
3. Switched Audio Decoders according to Figs. 3a and 3b
In the following, the concept of a multi-mode audio signal decoder will briefly be
discussed taking reference to Figs. 3a and 3b.
3.1 Audio Signal Decoder 300 according to Fig. 3a
Fig. 3a shows a block schematic diagram of a reference multi-mode audio signal decoder,
and Fig. 3b shows a block schematic diagram of a multi-mode audio signal decoder,
according to an embodiment of the invention. In other words, Fig. 3a shows a basic
decoder signal flow of a reference system (for example, according to working draft 4 of the
USAC draft standard), and Fig. 3b shows a basic decoder signal flow of a proposed system
according to an embodiment of the invention.

The audio signal decoder 300 will be described first taking reference to Fig. 3a. The audio
signal decoder 300 comprises a bit multiplexer 310, which is configured to receive an input
bitstream and to provide the information included in the bitstream to the appropriate
processing units of the processing branches.
The audio signal decoder 300 comprises a frequency-domain mode path 320, which is
configured to receive a scale factor information 322 and an encoded spectral coefficient
information 324, and to provide, on the basis thereof, a time-domain representation 326 of
an audio frame encoded in the frequency-domain mode. The audio signal decoder 300 also
comprises a transform-coded-excitation-linear-prediction-domain path 330, which is
configured to receive an encoded transform-coded-excitation information 332 and a linear-
prediction coefficient information 334 (also designated as a linear-prediction coding
information, or as a linear-prediction-domain information or as a linear-prediction-coding
filter information) and to provide, on the basis thereof, a time-domain representation 336 of an
audio frame or audio sub-frame encoded in the transform-coded-excitation-linear-
prediction-domain (TCX-LPD) mode. The audio signal decoder 300 also comprises an
algebraic-code-excited-linear-prediction (ACELP) path 340, which is configured to receive
an encoded excitation information 342 and a linear-prediction-coding information 344
(also designated as a linear prediction coefficient information or as a linear prediction
domain information or as a linear-prediction-coding filter information) and to provide, on
the basis thereof, a time-domain representation 346 of an audio frame or audio sub-frame
encoded in the ACELP mode. The audio signal
decoder 300 also comprises a transition windowing 350, which is configured to receive the
time-domain representations 326, 336, 346 of frames or sub-frames of the audio content
encoded in the different modes and to combine the time-domain representations using a
transition windowing.
The frequency-domain path 320 comprises an arithmetic decoder 320a configured to
decode the encoded spectral representation 324, to obtain a decoded spectral representation
320b, an inverse quantizer 320c configured to provide an inversely quantized spectral
representation 320d on the basis of the decoded spectral representation 320b, a scaling
320e configured to scale the inversely quantized spectral representation 320d in
dependence on scale factors, to obtain a scaled spectral representation 320f, and an (inverse)
modified discrete cosine transform 320g for providing a time-domain representation 326
on the basis of the scaled spectral representation 320f.

The TCX-LPD branch 330 comprises an arithmetic decoder 330a configured to provide a
decoded spectral representation 330b on the basis of the encoded spectral representation
332, an inverse quantizer 330c configured to provide an inversely quantized spectral
representation 330d on the basis of the decoded spectral representation 330b, an (inverse)
modified discrete cosine transform 330e for providing an excitation signal 330f on the
basis of the inversely quantized spectral representation 330d, and a linear-prediction-
coding synthesis filter 330g for providing the time-domain representation 336 on the basis
of the excitation signal 330f and the linear-prediction-coding filter coefficients 334 (also
sometimes designated as linear-prediction-domain filter coefficients).
The ACELP branch 340 comprises an ACELP excitation processor 340a configured to
provide an ACELP excitation signal 340b on the basis of the encoded excitation signal 342
and a linear-prediction-coding synthesis filter 340c for providing the time-domain
representation 346 on the basis of the ACELP excitation signal 340b and the linear-
prediction-coding filter coefficients 344.
3.2 Transition Windowing according to Fig. 4
Taking reference now to Fig. 4, the transition windowing 350 will be described in more
detail. First of all, the general framing structure of an audio signal decoder 300 will be
described. However, it should be noted that a very similar framing structure with only
minor differences, or even an identical general framing structure, will be used in the other
audio signal encoders or decoders described herein. It should also be noted that audio
frames typically comprise a length of N samples, wherein N may be equal to 2048.
Subsequent frames of the audio content may be overlapping by approximately 50%, for
example, by N/2 audio samples. An audio frame may be encoded in the frequency-domain,
such that the N time-domain samples of an audio frame are represented by a set of, for
example, N/2 spectral coefficients. Alternatively, the N time-domain samples of an audio
frame may also be represented by a plurality of, for example, eight sets of, for example,
128 spectral coefficients. Accordingly, a higher temporal resolution can be obtained.
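The framing arithmetic above can be checked with a few lines of Python. The value
N = 2048 is the example value named in the text; the constant names and the helper
"frame_start" are hypothetical and serve only as an illustration.

```python
N = 2048                            # frame length in samples (example value from the text)
HOP = N // 2                        # 50% overlap: consecutive frames advance by N/2 samples
LONG_COEFFS = N // 2                # single long transform: N/2 spectral coefficients
SHORT_SETS, SHORT_COEFFS = 8, 128   # alternative: eight sets of 128 spectral coefficients

def frame_start(index, hop=HOP):
    """First sample position of frame `index` (hypothetical helper)."""
    return index * hop

# Both representations carry the same number of coefficients per frame.
total_short = SHORT_SETS * SHORT_COEFFS
print(LONG_COEFFS, total_short)               # 1024 1024
print([frame_start(i) for i in range(3)])     # [0, 1024, 2048]
```

The eight short sets thus trade frequency resolution for temporal resolution without
changing the number of coefficients per frame.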
If the N time-domain samples of an audio frame are encoded in the frequency-domain
mode using a single set of spectral coefficients, a single window such as, for example, a
so-called "STOP_START" window, a so-called "AAC Long" window, a so-called "AAC
Start" window, or a so-called "AAC Stop" window may be applied to window the time-
domain samples 326 provided by the inverse modified discrete cosine transform 320g. In
contrast, a plurality of shorter windows, for example of the type "AAC Short", may be
applied to window the time-domain representations obtained using different sets of spectral
coefficients, if the N time-domain samples of an audio frame are encoded using a plurality
of sets of spectral coefficients. For example, separate short windows may be applied to
time-domain representations obtained on the basis of individual sets of spectral coefficients
associated with a single audio frame.
An audio frame encoded in the linear-prediction-domain mode may be sub-divided into a
plurality of sub-frames, which are sometimes designated as "frames". Each of the sub-
frames may be encoded either in the TCX-LPD mode or in the ACELP mode.
However, in the TCX-LPD mode, two or even four of the sub-frames may be
encoded together using a single set of spectral coefficients describing the transform
encoded excitation.
A sub-frame (or a group of two or four sub-frames) encoded in the TCX-LPD mode may
be represented by a set of spectral coefficients and one or more sets of linear-prediction-
coding filter coefficients. A sub-frame of the audio content encoded in the ACELP mode
may be represented by an encoded ACELP excitation signal and one or more sets of linear-
prediction-coding filter coefficients.
Taking reference now to Fig. 4, the implementation of transitions between frames or sub-
frames will be described. In the schematic representation of Fig. 4, abscissas 402a to 402i
describe a time in terms of audio samples, and ordinates 404a to 404i describe windows
and/or temporal regions for which time domain samples are provided.
At reference numeral 410, a transition between two overlapping frames encoded in the
frequency-domain is represented. At reference numeral 420, a transition from a sub-frame
encoded in the ACELP mode to a frame encoded in the frequency-domain mode is shown.
At reference numeral 430, a transition from a frame (or a sub-frame) encoded in the TCX-
LPD mode (also designated as "wLPT" mode) to a frame encoded in the frequency-
domain mode is illustrated. At reference numeral 440, a transition between a frame
encoded in the frequency-domain mode and a sub-frame encoded in the ACELP mode is
shown. At reference numeral 450, a transition between sub-frames encoded in the ACELP
mode is shown. At reference numeral 460, a transition from a sub-frame encoded in the
TCX-LPD mode to a sub-frame encoded in the ACELP mode is shown. At reference
numeral 470, a transition from a frame encoded in the frequency-domain mode to a sub-
frame encoded in the TCX-LPD mode is shown. At reference numeral 480, a transition
between a sub-frame encoded in the ACELP mode and a sub-frame encoded in the TCX-
LPD mode is shown. At reference numeral 490, a transition between sub-frames encoded
in the TCX-LPD mode is shown.

Interestingly, the transition from the TCX-LPD mode to the frequency-domain mode,
which is shown at reference numeral 430, is somewhat inefficient or even very
inefficient due to the fact that a part of the information transmitted to the decoder is
discarded. Similarly, transitions between the ACELP mode and the TCX-LPD mode,
which are shown at reference numerals 460 and 480, are implemented inefficiently due to
the fact that a part of the information transmitted to the decoder is discarded.
3.3 Audio Signal Decoder 360 according to Fig. 3b
In the following, the audio signal decoder 360, according to an embodiment of the
invention will be described.
The audio signal decoder 360 comprises a bit multiplexer or bitstream parser 362, which is
configured to receive a bitstream representation 361 of an audio content and to provide, on
the basis thereof, information elements to different branches of the audio signal decoder
360.
The audio signal decoder 360 comprises a frequency-domain branch 370 which is configured
to receive an encoded scale factor information 372 and an encoded spectral information 374
from the bit multiplexer 362 and to provide, on the basis thereof, a time-domain
representation 376 of a frame encoded in the frequency-domain mode. The audio signal
decoder 360 also comprises a TCX-LPD path 380 which is configured to receive an
encoded spectral representation 382 and encoded linear-prediction-coding filter
coefficients 384 and to provide, on the basis thereof, a time-domain representation 386 of
an audio frame or audio sub-frame encoded in the TCX-LPD mode.
The audio signal decoder 360 comprises an ACELP path 390 which is configured to
receive an encoded ACELP excitation 392 and encoded linear-prediction-coding filter
coefficients 394 and to provide, on the basis thereof, a time-domain representation 396 of
an audio sub-frame encoded in the ACELP mode.
The audio signal decoder 360 also comprises a transition windowing 398, which is
configured to apply an appropriate transition windowing to the time-domain
representations 376, 386, 396 of the frames and sub-frames encoded in the different modes,
to derive a contiguous audio signal.

It should be noted here that the frequency-domain branch 370 may be identical in its
general structure and functionality to the frequency-domain branch 320, even though there
may be different or additional aliasing-cancellation mechanisms in the frequency-domain
branch 370. Moreover, the ACELP branch 390 may be identical to the ACELP branch 340
in its general structure and functionality, such that the above description also applies.
However, the TCX-LPD branch 380 differs from the TCX-LPD branch 330 in that the
noise-shaping is performed before the inverse-modified-discrete-cosine-transform in the
TCX-LPD branch 380. Also, the TCX-LPD branch 380 comprises additional aliasing
cancellation functionalities.
The TCX-LPD branch 380 comprises an arithmetic decoder 380a which is configured to
receive an encoded spectral representation 382 and to provide, on the basis thereof, a
decoded spectral representation 380b. The TCX-LPD branch 380 also comprises an inverse
quantizer 380c configured to receive the decoded spectral representation 380b and to
provide, on the basis thereof, an inversely quantized spectral representation 380d. The
TCX-LPD branch 380 also comprises a scaling and/or frequency-domain noise-shaping
380e which is configured to receive the inversely quantized spectral representation 380d
and a spectral shaping information 380f and to provide, on the basis thereof, a spectrally
shaped spectral representation 380g to an inverse modified-discrete-cosine-transform 380h,
which provides the time-domain representation 386 on the basis of the spectrally shaped
spectral representation 380g. The TCX-LPD branch 380 also comprises a linear-
prediction-coefficient-to-frequency-domain transformer 380i which is configured to
provide the spectral scaling information 380f on the basis of the linear-prediction-coding
filter coefficients 384.
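One way to picture the linear-prediction-coefficient-to-frequency-domain transformer
380i and the shaping 380e is the following Python sketch. It assumes, purely for
illustration, that the shaping gains are the inverse magnitude response of the LPC
polynomial A(z) sampled at the centre frequencies of the transform bins; the function
names, the sampling grid and the absence of any perceptual weighting are assumptions and
are not taken from the text.

```python
import numpy as np

def lpc_to_gains(lpc, num_bins):
    """Sample 1/|A(e^{jw})| at bin centres, A(z) = 1 + sum_k lpc[k-1] * z^-k."""
    a = np.concatenate(([1.0], np.asarray(lpc, dtype=float)))
    # Assumed grid: centre frequency of each of the num_bins transform bins.
    w = np.pi * (np.arange(num_bins) + 0.5) / num_bins
    k = np.arange(len(a))
    # Evaluate A(e^{jw}) as a small DFT-like matrix product.
    response = np.exp(-1j * np.outer(w, k)) @ a
    return 1.0 / np.abs(response)

def shape_spectrum(dequantized, lpc):
    """Apply the LPC-derived gains to the inversely quantized coefficients."""
    dequantized = np.asarray(dequantized, dtype=float)
    return dequantized * lpc_to_gains(lpc, len(dequantized))
```

Because the gains are applied to the coefficients themselves, the output of the
subsequent inverse transform needs no further time-domain filtering, which is what keeps
it compatible with the output of the frequency-domain branch.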
Regarding the functionality of the audio signal decoder 360 it can be said that the
frequency-domain branch 370 and the TCX-LPD branch 380 are very similar in that each
of them comprises a processing chain having an arithmetic decoding, an inverse
quantization, a spectrum scaling and an inverse modified-discrete-cosine-transform in the
same processing order. Accordingly, the output signals 376, 386 of the frequency-domain
branch 370 and of the TCX-LPD branch 380 are very similar in that they may both be
unfiltered (with the exception of a transition windowing) output signals of the inverse
modified-discrete-cosine-transforms. Accordingly, the time-domain signals 376, 386 are
very well-suited for an overlap-and-add operation, wherein a time-domain aliasing-
cancellation is achieved by the overlap-and-add operation. Thus, transitions between an
audio frame encoded in the frequency-domain mode and an audio frame or audio sub-
frame encoded in the TCX-LPD mode can be efficiently performed by a simple overlap-
and-add operation without requiring any additional aliasing-cancellation information and
without discarding any information. Thus, a minimum amount of side information is
sufficient.
Moreover, it should be noted that the scaling of the inversely quantized spectral
representation, which is performed in the frequency-domain path 370 in dependence on a
scale factor information, effectively brings along a noise-shaping of the quantization noise
introduced by the encoder-sided quantization and the decoder-sided inverse quantization
320c, which noise-shaping is well-adapted to general audio signals such as, for example,
music signals. In contrast, the scaling and/or frequency-domain noise-shaping 380e, which
is performed in dependence on the linear-prediction-coding filter coefficients, effectively
brings along a noise-shaping of a quantization noise caused by an encoder-sided
quantization and the decoder-sided inverse quantization 380c, which is well-adapted to
speech-like audio signals. Accordingly, the functionality of the frequency-domain branch
370 and of the TCX-LPD branch 380 merely differs in that different noise-shaping is
applied in the frequency-domain, such that a coding efficiency (or audio quality) is
particularly good for general audio signals when using the frequency-domain branch 370,
and such that a coding efficiency or audio quality is particularly high for speech-like audio
signals when using the TCX-LPD branch 380.
It should be noted that the TCX-LPD branch 380 preferably comprises additional aliasing-
cancellation mechanisms for transitions between audio frames or audio sub-frames
encoded in the TCX-LPD mode and in the ACELP mode. Details will be described below.
3.4 Transition Windowing according to Fig. 5
Fig. 5 shows a graphic representation of an example of an envisioned windowing scheme,
which may be applied in the audio signal decoder 360 or in any other audio signal encoders
and decoders according to the present invention. Fig. 5 represents a windowing at possible
transitions between frames or sub-frames encoded in different ones of the modes. Abscissas 502a
to 502i describe a time in terms of audio samples and ordinates 504a to 504i describe
windows or sub-frames for providing a time-domain representation of an audio content.
A graphical representation at reference numeral 510 shows a transition between subsequent
frames encoded in the frequency-domain mode. As can be seen, the time-domain samples
provided for a right half of a first frame (for example, by an inverse modified discrete
cosine transform (MDCT) 320g) are windowed by a right half 512 of a window, which
may, for example, be of window type "AAC Long" or of window type "AAC Stop".
Similarly, the time-domain samples provided for a left half of a subsequent second frame
(for example, by the inverse MDCT 320g) may be windowed using a left half 514 of a window,
which may, for example, be of window type "AAC Long" or "AAC Start". The right half
512 may, for example, comprise a comparatively long right sided transition slope and the
left half 514 of the subsequent window may comprise a comparatively long left sided
transition slope. A windowed version of the time-domain representation of the first audio
frame (windowed using the right window half 512) and a windowed version of the time-
domain representation of the subsequent second audio frame (windowed using the left
window half 514) may be overlapped and added. Accordingly, aliasing, which arises from
the MDCT, may be efficiently cancelled.
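The cancellation of MDCT aliasing by overlap-and-add can be demonstrated numerically.
The following Python sketch is a generic illustration and not the codec's own
transform: the small block size M = 8, the sine window, the 2/M normalization and the
random test signal are assumptions chosen so that the window halves satisfy the usual
power-complementarity condition.

```python
import numpy as np

def mdct_basis(M):
    """Cosine basis of shape (2M, M) shared by the forward and inverse transform."""
    n = np.arange(2 * M)
    k = np.arange(M)
    return np.cos(np.pi / M * np.outer(n + 0.5 + M / 2, k + 0.5))

def mdct(frame, window):
    """Window 2M time-domain samples and map them to M spectral coefficients."""
    M = len(frame) // 2
    return (window * frame) @ mdct_basis(M)

def imdct(coeffs, window):
    """Map M coefficients back to 2M windowed, still aliased, time-domain samples."""
    M = len(coeffs)
    return window * (mdct_basis(M) @ coeffs) * (2.0 / M)

M = 8
window = np.sin(np.pi * (np.arange(2 * M) + 0.5) / (2 * M))  # left half rises, right half falls
x = np.random.default_rng(0).standard_normal(3 * M)

first = imdct(mdct(x[:2 * M], window), window)   # first frame, samples 0 .. 2M-1
second = imdct(mdct(x[M:], window), window)      # second frame, samples M .. 3M-1

# Right half of the first frame plus left half of the second frame.
overlap = first[M:] + second[:M]
aliasing_cancelled = np.allclose(overlap, x[M:2 * M])
single_frame_is_aliased = not np.allclose(first[M:], x[M:2 * M])
```

Each frame on its own contains time-domain aliasing in the overlap region; only the sum
of the right window half of the first frame and the left window half of the second
frame reproduces the input samples.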
A graphical representation at reference numeral 520 shows a transition from a sub-frame
encoded in the ACELP mode to a frame encoded in the frequency-domain mode. A
forward-aliasing-cancellation may be applied to reduce aliasing artifacts at such a
transition.
A graphical representation at reference numeral 530 shows a transition from a sub-frame
encoded in the TCX-LPD mode to a frame encoded in the frequency-domain mode. As can
be seen, a window 532 is applied to the time-domain samples provided by the inverse
MDCT 380h of the TCX-LPD path, which window 532 may, for example, be of window
type "TCX256", "TCX512", or "TCX1024". The window 532 may comprise a right-
sided transition slope 533 of length 128 time-domain samples. A window 534 is applied to
time-domain samples provided by the inverse MDCT of the frequency-domain path 370 for the
subsequent audio frame encoded in the frequency-domain mode. The window 534 may, for
example, be of window type "Stop Start" or "AAC Stop", and may comprise a left-sided
transition slope 535 having a length of, for example, 128 time-domain samples. The time-
domain samples of the TCX-LPD mode sub-frame which are windowed by the right-sided
transition slope 533 are overlapped and added with the time-domain samples of the
subsequent audio frame encoded in the frequency-domain mode which are windowed by
the left-sided transition slope 535. The transition slopes 533 and 535 are matched, such that
an aliasing-cancellation is obtained at the transition from the TCX-LPD-mode-encoded
sub-frame to the subsequent frequency-domain-mode-encoded frame. The aliasing-
cancellation is made possible by the execution of the scaling/frequency-domain noise-
shaping 380e before the execution of the inverse MDCT 380h. In other words, the aliasing-
cancellation is caused by the fact that both the inverse MDCT 320g of the frequency-
domain path 370 and the inverse MDCT 380h of the TCX-LPD path 380 are fed with
spectral coefficients to which the noise-shaping has already been applied (for example, in
the form of the scaling factor-dependent scaling and the LPC filter coefficient dependent
scaling).
A graphical representation at reference numeral 540 shows a transition from an audio
frame encoded in the frequency-domain mode to a sub-frame encoded in the ACELP
mode. As can be seen, a forward aliasing-cancellation (FAC) is applied in order to reduce,
or even eliminate, aliasing artifacts at said transition.
A graphical representation at reference numeral 550 shows a transition from an audio sub-
frame encoded in the ACELP mode to another audio sub-frame encoded in the ACELP
mode. No specific aliasing-cancellation processing is required here in some embodiments.
A graphical representation at reference numeral 560 shows a transition from a sub-frame
encoded in the TCX-LPD mode (also designated as wLPT mode) to an audio sub-frame
encoded in the ACELP mode. As can be seen, time-domain samples provided by the
inverse MDCT 380h of the TCX-LPD branch 380 are windowed using a window 562, which may,
for example, be of window type "TCX256", "TCX512" or "TCX1024". Window 562
comprises a comparatively short right-sided transition slope 563. Time-domain samples
provided for the subsequent audio sub-frame encoded in the ACELP mode comprise a
partial temporal overlap with audio samples provided for the preceding TCX-LPD-mode-
encoded audio sub-frame which are windowed by the right-sided transition slope 563 of
the window 562. Time-domain audio samples provided for the audio sub-frame encoded in
the ACELP mode are illustrated by a block at reference numeral 564.
As can be seen, a forward aliasing-cancellation signal 566 is added at the transition from
the audio frame encoded in the TCX-LPD mode to the audio frame encoded in the ACELP
mode in order to reduce or even eliminate aliasing artifacts. Details regarding the provision
of the aliasing-cancellation signal 566 will be described below.
A graphical representation at reference numeral 570 shows a transition from a frame
encoded in the frequency-domain mode to a subsequent frame encoded in the TCX-LPD
mode. Time-domain samples provided by the inverse MDCT 320g of the frequency-
domain branch 370 may be windowed by a window 572 having a comparatively short
right-sided transition slope 573, for example, by a window of type "Stop Start" or a
window of type "AAC Start". A time-domain representation provided by the inverse
MDCT 380h of the TCX-LPD branch 380 for the subsequent audio sub-frame encoded in
the TCX-LPD mode may be windowed by a window 574 comprising a comparatively short
left-sided transition slope 575, which window 574 may, for example, be of window type
"TCX256", "TCX512", or "TCX1024". Time-domain samples windowed by the right-
sided transition slope 573 and time-domain samples windowed by the left-sided transition
slope 575 are overlapped and added by the transition windowing 398, such that aliasing
artifacts are reduced, or even eliminated. Accordingly, no additional side information is
required for performing a transition from an audio frame encoded in the frequency-domain
mode to an audio sub-frame encoded in the TCX-LPD mode.
A graphical representation at reference numeral 580 shows a transition from an audio
frame encoded in the ACELP mode to an audio frame encoded in the TCX-LPD mode
(also designated as wLPT mode). A temporal region for which time-domain samples are
provided by the ACELP branch is designated with 582. A window 584 is applied to time-
domain samples provided by the inverse MDCT 380h of the TCX-LPD branch 380.
Window 584, which may be of type "TCX256", "TCX512", or "TCX1024", may
comprise a comparatively short left-sided transition slope 585. The left-sided transition
slope 585 of the window 584 partially overlaps with the time-domain samples provided by
the ACELP branch, which are represented by the block 582. In addition, an aliasing-
cancellation signal 586 is provided to reduce, or even eliminate, aliasing artifacts which
occur at the transition from the audio sub-frame encoded in the ACELP mode to the audio
sub-frame encoded in the TCX-LPD mode. Details regarding the provision of the aliasing-
cancellation signal 586 will be discussed below.
A schematic representation at reference numeral 590 shows a transition from an audio sub-
frame encoded in the TCX-LPD mode to another audio sub-frame encoded in the TCX-
LPD mode. Time-domain samples of a first audio sub-frame encoded in the TCX-LPD
mode are windowed using a window 592, which may, for example, be of type "TCX256",
"TCX512", or "TCX1024", and which may comprise a comparatively short right-sided
transition slope 593. Time-domain audio samples of a second audio sub-frame encoded in
the TCX-LPD mode, which are provided by the inverse MDCT 380h of the TCX-LPD
branch 380, are windowed, for example, using a window 594 which may be of the window
type "TCX256", "TCX512", or "TCX1024" and which may comprise a comparatively
short left-sided transition slope 595. Time-domain samples windowed using the right-sided
transition slope 593 and time-domain samples windowed using the left-sided transition
slope 595 are overlapped and added by the transition windowing 398. Accordingly,
aliasing, which is caused by the (inverse) MDCT 380h, is reduced, or even eliminated.
4. Overview of all Window Types

In the following, an overview of all window types will be provided. For this purpose,
reference is made to Fig. 6, which shows a graphical representation of the different
window types and their characteristics. In the table of Fig. 6, a column 610 describes a left-
sided overlap length, which may be equal to a length of a left-sided transition slope. The
column 612 describes a transform length, i.e. a number of spectral coefficients used to
generate the time-domain representation which is windowed by the respective window.
The column 614 describes a right-sided overlap length, which may be equal to a length of a
right-sided transition slope. A column 616 describes a name of the window type. The
column 618 shows a graphical representation of the respective window.
A first row 630 shows the characteristics of a window of type "AAC Short". A second row
632 shows the characteristics of a window of type "TCX256". A third row 634 shows the
characteristics of a window of type "TCX512". A fourth row 636 shows the characteristics
of windows of types "TCX1024" and "Stop Start". A fifth row 638 shows the
characteristics of a window of type "AAC Long". A sixth row 640 shows the
characteristics of a window of type "AAC Start", and a seventh row 642 shows the
characteristics of a window of type "AAC Stop".
Notably, the transition slopes of the windows of types "TCX256", "TCX512", and
"TCX1024" are adapted to the right-sided transition slope of the window of type "AAC
Start" and to the left-sided transition slope of the window of type "AAC Stop", in order to
allow for a time-domain aliasing-cancellation by overlapping and adding time-domain
representations windowed using different types of windows. In a preferred embodiment,
the left-sided window slopes (transition slopes) of all of the window types having identical
left-sided overlap lengths may be identical, and the right-sided transition slopes of all
window types having identical right-sided overlap lengths may be identical. Also,
left-sided transition slopes and right-sided transition slopes having identical overlap lengths
may be adapted to allow for an aliasing-cancellation, fulfilling the conditions for the
MDCT aliasing-cancellation.
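The condition on matching transition slopes can be illustrated with a minimal sketch. It assumes sine-shaped slopes and an overlap length of 128 samples, neither of which is specified in the text above; the function names are hypothetical. Two slopes of identical overlap length cancel the MDCT time-domain aliasing if their squares add up to one at every sample of the overlap and if one slope is the time-reversed version of the other.

```python
import math

def sine_slope(length):
    """Rising (left-sided) slope; the sine shape is an assumption for illustration."""
    return [math.sin(math.pi * (n + 0.5) / (2 * length)) for n in range(length)]

def satisfies_tdac(left_slope, right_slope, tol=1e-12):
    """Check power complementarity and time-reversal symmetry over the overlap."""
    if len(left_slope) != len(right_slope):
        return False  # different overlap lengths cannot be overlapped and added
    power_ok = all(abs(l * l + r * r - 1.0) < tol
                   for l, r in zip(left_slope, right_slope))
    mirror_ok = all(abs(l - r) < tol
                    for l, r in zip(left_slope, reversed(right_slope)))
    return power_ok and mirror_ok

overlap = 128  # hypothetical overlap length
left = sine_slope(overlap)          # e.g. left-sided slope of a following window
right = list(reversed(left))        # e.g. right-sided slope of a preceding window
tdac_ok = satisfies_tdac(left, right)
mismatched_ok = satisfies_tdac(sine_slope(64), right)
```

With these assumptions, slopes of equal overlap length pass the check, whereas a slope of a different overlap length does not, which is why windows with identical overlap lengths share identical slopes.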
5. Allowed Window Sequences
In the following, allowed window sequences will be described, taking reference to Fig. 7,
which shows a table representation of such allowed window sequences. As can be seen
from the table of Fig. 7, an audio frame encoded in the frequency-domain mode, the
time-domain samples of which are windowed using a window of type "AAC Stop", may be
followed by an audio frame encoded in the frequency-domain mode, the time-domain
samples of which are windowed using a window of type "AAC Long" or a window of
type "AAC Start".
An audio frame encoded in the frequency-domain mode, the time-domain samples of
which are windowed using a window of type "AAC Long", may be followed by an audio
frame encoded in the frequency-domain mode, the time-domain samples of which are
windowed using a window of type "AAC Long" or "AAC Start".
Audio frames encoded in the frequency-domain mode, the time-domain samples of which
are windowed using a window of type "AAC Start", using eight windows of type "AAC
Short" or using a window of type "AAC StopStart", may be followed by an audio frame
encoded in the frequency-domain mode, the time-domain samples of which are windowed
using eight windows of type "AAC Short", using a window of type "AAC Stop" or using
a window of type "AAC StopStart". Alternatively, audio frames encoded in the
frequency-domain mode, the time-domain samples of which are windowed using a window of type
"AAC Start", using eight windows of type "AAC Short" or using a window of type
"AAC StopStart", may be followed by an audio frame or sub-frame encoded in the
TCX-LPD mode (also designated as LPD-TCX) or by an audio frame or audio sub-frame
encoded in the ACELP mode (also designated as LPD ACELP).
An audio frame or audio sub-frame encoded in the TCX-LPD mode may be followed by
audio frames encoded in the frequency-domain mode, the time-domain samples of which
are windowed using eight "AAC Short" windows, using an "AAC Stop" window or
using an "AAC StopStart" window, or by an audio frame or audio sub-frame encoded in
the TCX-LPD mode or by an audio frame or audio sub-frame encoded in the ACELP
mode.
An audio frame encoded in the ACELP mode may be followed by audio frames encoded in
the frequency-domain mode, the time-domain samples of which are windowed using eight
"AAC Short" windows, using an "AAC Stop" window or using an "AAC StopStart"
window, or by an audio frame encoded in the TCX-LPD mode or by an audio frame encoded
in the ACELP mode.
For transitions from an audio frame encoded in the ACELP mode towards an audio frame
encoded in the frequency-domain mode or towards an audio frame encoded in the TCX-
LPD mode, a so-called forward-aliasing-cancellation (FAC) is performed. Accordingly, an
aliasing-cancellation synthesis signal is added to the time-domain representation at such a
frame transition, whereby aliasing artifacts are reduced, or even eliminated. Similarly, a
FAC is also performed when switching from a frame or sub-frame encoded in the
frequency-domain mode, or from a frame or sub-frame encoded in the TCX-LPD mode, to
a frame or sub-frame encoded in the ACELP mode.
Details regarding the FAC will be discussed below.
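The allowed sequences and the FAC rule described in this section can be summarized as a lookup table. The following is a sketch only: the labels, the dictionary and the function names are hypothetical, the label "AAC Short x8" stands for a frame windowed using eight windows of type "AAC Short", and the entry "AAC Stop" as a successor of "AAC Start", "AAC Short x8" and "AAC StopStart" is an assumption made by analogy with the successors listed for the TCX-LPD and ACELP modes. The table of Fig. 7 remains the authoritative source.

```python
# Successors in the frequency-domain mode after a start/short/stop-start window.
FD_AFTER_START = {"AAC Short x8", "AAC Stop", "AAC StopStart"}
# Linear-prediction-domain modes.
LPD_MODES = {"TCX-LPD", "ACELP"}

ALLOWED_NEXT = {
    "AAC Stop": {"AAC Long", "AAC Start"},
    "AAC Long": {"AAC Long", "AAC Start"},
    "AAC Start": FD_AFTER_START | LPD_MODES,
    "AAC Short x8": FD_AFTER_START | LPD_MODES,
    "AAC StopStart": FD_AFTER_START | LPD_MODES,
    "TCX-LPD": FD_AFTER_START | LPD_MODES,
    "ACELP": FD_AFTER_START | LPD_MODES,
}

def needs_fac(current, following):
    """FAC is performed when exactly one side of the transition is ACELP."""
    return (current == "ACELP") != (following == "ACELP")

def plan_transitions(sequence):
    """Validate a frame sequence and flag the transitions that use FAC."""
    plan = []
    for current, following in zip(sequence, sequence[1:]):
        if following not in ALLOWED_NEXT[current]:
            raise ValueError(f"transition {current!r} -> {following!r} is not allowed")
        plan.append(needs_fac(current, following))
    return plan

example = ["AAC Long", "AAC Start", "TCX-LPD", "ACELP", "AAC Stop", "AAC Long"]
fac_flags = plan_transitions(example)
```

In this example, only the transitions into and out of the ACELP frame are flagged for FAC; the transition from "AAC Start" to TCX-LPD relies on overlap-and-add alone.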
6. Audio Signal Encoder according to Fig. 8
In the following, a multi-mode audio signal encoder 800 will be described taking reference
to Fig. 8.
The audio signal encoder 800 is configured to receive an input representation 810 of an
audio content and to provide, on the basis thereof, a bitstream 812 representing the audio
content. The audio signal encoder 800 is configured to operate in different modes of
operation, namely a frequency-domain mode, a transform-coded-excitation-linear-
prediction-domain mode and an algebraic-code-excited-linear-prediction-domain mode.
The audio signal encoder 800 comprises an encoding controller 814 which is configured
to select one of the modes for encoding a portion of the audio content in dependence on
characteristics of the input representation 810 of the audio content and/or in dependence on
an achievable encoding efficiency or quality.
The audio signal encoder 800 comprises a frequency-domain branch 820 which is
configured to provide encoded spectral coefficients 822, encoded scale factors 824, and
optionally, encoded aliasing-cancellation coefficients 826, on the basis of the input
representation 810 of the audio content. The audio signal encoder 800 also comprises a
TCX-LPD branch 850 configured to provide encoded spectral coefficients 852, encoded
linear-prediction-domain parameters 854 and encoded aliasing-cancellation coefficients
856, in dependence on the input representation 810 of the audio content. The audio signal
encoder 800 also comprises an ACELP branch 880 which is configured to provide an
encoded ACELP excitation 882 and encoded linear-prediction-domain parameters 884 in
dependence on the input representation 810 of the audio content.
The frequency-domain branch 820 comprises a time-domain-to-frequency-domain
conversion 830 which is configured to receive the input representation 810 of the audio
content, or a pre-processed version thereof, and to provide, on the basis thereof, a
frequency-domain representation 832 of the audio content. The frequency-domain branch
820 also comprises a psychoacoustic analysis 834, which is configured to evaluate
frequency masking effects and/or temporal masking effects of the audio content, and to
provide, on the basis thereof, a scale factor information 836 describing scale factors. The
frequency-domain branch 820 also comprises a spectral processor 838 configured to
receive the frequency-domain representation 832 of the audio content and the scale factor
information 836 and to apply a frequency-dependent and time-dependent scaling to the
spectral coefficients of the frequency-domain representation 832 in dependence on the
scale factor information 836, to obtain a scaled frequency-domain representation 840 of the
audio content. The frequency-domain branch also comprises a quantization/encoding 842
configured to receive the scaled frequency-domain representation 840 and to perform a
quantization and an encoding in order to obtain the encoded spectral coefficients 822 on
the basis of the scaled frequency-domain representation 840. The frequency-domain
branch also comprises a quantization/encoding 844 configured to receive the scale factor
information 836 and to provide, on the basis thereof, an encoded scale factor information
824. Optionally, the frequency-domain branch 820 also comprises an aliasing-cancellation
coefficient calculation 846 which may be configured to provide the aliasing-cancellation
coefficients 826.
The TCX-LPD branch 850 comprises a time-domain-to-frequency-domain conversion 860,
which may be configured to receive the input representation 810 of the audio content, and
to provide on the basis thereof, a frequency-domain representation 861 of the audio
content. The TCX-LPD branch 850 also comprises a linear-prediction-domain-parameter
calculation 862 which is configured to receive the input representation 810 of the audio
content, or a pre-processed version thereof, and to derive one or more linear-prediction-
domain parameters (for example, linear-prediction-coding-filter-coefficients) 863 from the
input representation 810 of the audio content. The TCX-LPD branch 850 also comprises a
linear-prediction-domain-to-spectral domain conversion 864, which is configured to
receive the linear-prediction-domain parameters (for example, the linear-prediction-coding
filter coefficients) and to provide a spectral-domain representation or frequency-domain
representation 865 on the basis thereof. The spectral-domain representation or frequency-
domain representation of the linear-prediction-domain parameters may, for example,
represent a filter response of a filter defined by the linear-prediction-domain parameters in
a frequency-domain or spectral-domain. The TCX-LPD branch 850 also comprises a
spectral processor 866, which is configured to receive the frequency-domain representation
861, or a pre-processed version 861' thereof, and the frequency-domain representation or
spectral-domain representation 865 of the linear-prediction-domain parameters 863. The
spectral processor 866 is configured to perform a spectral shaping of the frequency-domain
representation 861, or of the pre-processed version 861' thereof, wherein the
frequency-domain representation or spectral-domain representation 865 of the
linear-prediction-domain parameters 863 serves to adjust the scaling of the different
spectral coefficients of the frequency-domain representation 861 or of the pre-processed
version 861' thereof. Accordingly, the spectral processor 866 provides a spectrally shaped
version 867 of the frequency-domain representation 861 or of the pre-processed version
861' thereof, in dependence on the linear-prediction-domain parameters 863. The TCX-LPD
branch 850
also comprises a quantization/encoding 868 which is configured to receive the spectrally
shaped frequency-domain representation 867 and to provide, on the basis thereof, encoded
spectral coefficients 852. The TCX-LPD branch 850 also comprises another
quantization/encoding 869, which is configured to receive the linear-prediction-domain
parameters 863 and to provide, on the basis thereof, the encoded linear-prediction-domain
parameters 854.
The TCX-LPD branch 850 further comprises an aliasing-cancellation coefficient provision
which is configured to provide the encoded aliasing-cancellation coefficients 856. The
aliasing cancellation coefficient provision comprises an error computation 870 which is
configured to compute an aliasing error information 871 in dependence on the encoded
spectral coefficients, as well as in dependence on the input representation 810 of the audio
content. The error computation 870 may optionally take into consideration an information
872 regarding additional aliasing-cancellation components, which can be provided by other
mechanisms. The aliasing-cancellation coefficient provision also comprises an analysis
filter computation 873 which is configured to provide an information 873a describing an
error filtering in dependence on the linear-prediction-domain parameters 863. The aliasing-
cancellation coefficient provision also comprises an error analysis filtering 874, which is
configured to receive the aliasing error information 871 and the analysis filter
configuration information 873a, and to apply an error analysis filtering, which is adjusted
in dependence on the analysis filtering information 873a, to the aliasing error information
871, to obtain a filtered aliasing error information 874a. The aliasing-cancellation
coefficient provision also comprises a time-domain-to-frequency-domain conversion 875,
which may take the functionality of a discrete cosine transform of type IV, and which is
configured to receive the filtered aliasing error information 874a and to provide, on the
basis thereof, a frequency-domain representation 875a of the filtered aliasing error
information 874a. The aliasing-cancellation coefficient provision also comprises a
quantization/encoding 876 which is configured to receive the frequency-domain
representation 875a and to provide, on the basis thereof, encoded aliasing-cancellation
coefficients 856, such that the encoded aliasing-cancellation coefficients 856 encode the
frequency-domain representation 875a.
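The chain of error analysis filtering 874, time-domain-to-frequency-domain conversion 875 and quantization 876 can be sketched as follows. This is a simplified illustration under stated assumptions, not the codec's actual processing: the analysis filter is taken to be a plain FIR filter A(z) with zero initial state, the quantizer is a uniform scalar quantizer with a hypothetical step size, and all function names and numeric values are invented for the example.

```python
import math

def analysis_filter(signal, lpc):
    """FIR filter A(z) = 1 + sum_k lpc[k-1] * z^-k with zero initial state (assumption)."""
    out = []
    for n, x in enumerate(signal):
        acc = x
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a * signal[n - k]
        out.append(acc)
    return out

def dct_iv(x):
    """Orthonormal discrete cosine transform of type IV."""
    size = len(x)
    scale = math.sqrt(2.0 / size)
    return [
        scale * sum(x[n] * math.cos(math.pi / size * (n + 0.5) * (k + 0.5))
                    for n in range(size))
        for k in range(size)
    ]

def quantize(values, step):
    """Uniform scalar quantization to integer indices (hypothetical quantizer)."""
    return [round(v / step) for v in values]

aliasing_error = [1.0, 0.0, 0.0, 0.0]  # stands in for the aliasing error information 871
lpc = [-0.5]                           # stands in for filter information 873a
filtered = analysis_filter(aliasing_error, lpc)   # corresponds to 874a
fac_spectrum = dct_iv(filtered)                   # corresponds to 875a
fac_indices = quantize(fac_spectrum, step=0.25)   # input to the encoding of 856
```

Because the DCT-IV is orthonormal in this sketch, the energy of the filtered error is preserved in the frequency-domain representation, so the quantization step directly controls the error introduced into the aliasing-cancellation coefficients.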
The aliasing-cancellation coefficient provision also comprises an optional computation 877
of an ACELP contribution to an aliasing-cancellation. The computation 877 may be
configured to compute or estimate a contribution to an aliasing-cancellation which can be
derived from an audio sub-frame encoded in the ACELP mode which precedes an audio
frame encoded in the TCX-LPD mode. The computation of the ACELP contribution to the
aliasing-cancellation may comprise a computation of a post-ACELP synthesis, a
windowing of the post-ACELP synthesis and a folding of the windowed post-ACELP
synthesis, to obtain the information 872 regarding the additional aliasing-cancellation
components, which may be derived from a preceding audio sub-frame encoded in the
ACELP mode. In addition, or alternatively, the computation 877 may comprise a
computation of a zero-input response of a filter initialized by a decoding of a preceding
audio sub-frame encoded in the ACELP mode and a windowing of said zero-input
response, to obtain the information 872 about the additional aliasing-cancellation
components.
In the following, the ACELP branch 880 will briefly be discussed. The ACELP branch 880
comprises a linear-prediction-domain parameter calculation 890 which is configured to
compute linear-prediction-domain parameters 890a on the basis of the input representation
810 of the audio content. The ACELP branch 880 also comprises an ACELP excitation
computation 892 configured to compute an ACELP excitation information 892 in
dependence on the input representation 810 of the audio content and the linear-prediction-
domain parameters 890a. The ACELP branch 880 also comprises an encoding 894
configured to encode the ACELP excitation information 892, to obtain the encoded
ACELP excitation 882. In addition, the ACELP branch 880 also comprises a
quantization/encoding 896 configured to receive the linear-prediction-domain parameters
890a and to provide, on the basis thereof, the encoded linear-prediction-domain parameters
884.
The audio signal encoder 800 also comprises a bitstream formatter 898 which is configured
to provide the bitstream 812 on the basis of the encoded spectral coefficients 822, the
encoded scale factor information 824, the aliasing-cancellation coefficients 826, the
encoded spectral coefficients 852, the encoded linear-prediction-domain parameters 854,
the encoded aliasing-cancellation coefficients 856, the encoded ACELP excitation 882, and
the encoded linear-prediction-domain parameters 884.
Details regarding the provision of the encoded aliasing-cancellation coefficients 856 will
be described below.
7. Audio Signal Decoder according to Fig. 9

In the following, an audio signal decoder 900 according to Fig. 9 will be described.
The audio signal decoder 900 according to Fig. 9 is similar to the audio signal decoder 200
according to Fig. 2 and also to the audio signal decoder 360 according to Fig. 3b, such that
the above explanations also hold.
The audio signal decoder 900 comprises a bit multiplexer 902 which is configured to
receive a bitstream and to provide information extracted from the bitstream to the
corresponding processing paths.
The audio signal decoder 900 comprises a frequency-domain branch 910, which is
configured to receive encoded spectral coefficients 912 and an encoded scale factor
information 914. The frequency-domain branch 910 is optionally configured to also
receive encoded aliasing-cancellation coefficients, which allow for a so-called forward-
aliasing-cancellation, for example, at a transition between an audio frame encoded in the
frequency-domain mode and an audio frame encoded in the ACELP mode. The frequency-
domain path 910 provides a time-domain representation 918 of the audio content of the
audio frame encoded in the frequency-domain mode.
The audio signal decoder 900 comprises a TCX-LPD branch 930, which is configured to
receive encoded spectral coefficients 932, encoded linear-prediction-domain parameters
934 and encoded aliasing-cancellation coefficients 936, and to provide, on the basis
thereof, a time-domain representation of an audio frame or a sub-frame encoded in the
TCX-LPD mode. The audio signal decoder 900 also comprises an ACELP branch 980,
which is configured to receive an encoded ACELP excitation 982 and encoded linear-
prediction-domain parameters 984, and to provide, on the basis thereof, a time-domain
representation 986 of an audio frame or audio sub-frame encoded in the ACELP mode.
7.1 Frequency Domain Path
In the following, details regarding the frequency domain path 910 will be described. It
should be noted that the frequency-domain path is similar to the frequency-domain path
320 of the audio decoder 300, such that reference is made to the above description. The
frequency-domain branch 910 comprises an arithmetic decoding 920, which receives the
encoded spectral coefficients 912 and provides, on the basis thereof, decoded spectral
coefficients 920a, and an inverse quantization 921 which receives the decoded spectral
coefficients 920a, and provides, on the basis thereof, inversely quantized spectral
coefficients 921a. The frequency-domain branch 910 also comprises a scale factor
decoding 922, which receives the encoded scale factor information 914 and provides, on the
basis thereof, a decoded scale factor information 922a. The frequency-domain branch
comprises a scaling 923 which receives the inversely quantized spectral coefficients 921a
and scales the inversely quantized spectral coefficients in accordance with the scale factors
922a, to obtain scaled spectral coefficients 923a. For example, scale factors 922a may be
provided for a plurality of frequency bands, wherein a plurality of frequency bins of the
spectral coefficients 921a are associated with each frequency band. Accordingly, a
frequency-band-wise scaling of the spectral coefficients 921a may be performed. Thus, a number of
scale factors associated with an audio frame is typically smaller than a number of spectral
coefficients 921a associated with the audio frame. The frequency-domain branch 910 also
comprises an inverse MDCT 924, which is configured to receive the scaled spectral
coefficients 923a and to provide, on the basis thereof, a time-domain representation 924a
of the audio content of the current audio frame. The frequency domain branch 910 also,
optionally, comprises a combining 925, which is configured to combine the time-domain
representation 924a with an aliasing-cancellation synthesis signal 929a, to obtain the time-
domain representation 918. However, in some other embodiments the combining 925 may
be omitted, such that the time-domain representation 924a is provided as the time-domain
representation 918 of the audio content.
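The frequency-band-wise scaling 923 can be sketched as follows. The band boundaries, the gain values and the function name are hypothetical, and the scale factors are treated as linear gains for simplicity; an actual codec may map transmitted scale factor values to gains differently.

```python
def scale_by_band(coeffs, band_offsets, band_gains):
    """Multiply every spectral coefficient by the gain of the band it belongs to.

    band_offsets holds the first bin of each band plus one final end index,
    so it has one more entry than band_gains.
    """
    if len(band_offsets) != len(band_gains) + 1:
        raise ValueError("band_offsets must have one more entry than band_gains")
    scaled = list(coeffs)
    for band, gain in enumerate(band_gains):
        for i in range(band_offsets[band], band_offsets[band + 1]):
            scaled[i] = coeffs[i] * gain
    return scaled

coeffs = [1.0, -2.0, 0.5, 4.0, 1.0, -1.0]  # stands in for coefficients 921a
offsets = [0, 2, 6]                         # two bands: bins 0-1 and bins 2-5
gains = [2.0, 0.5]                          # stands in for scale factors 922a
scaled = scale_by_band(coeffs, offsets, gains)
```

Here six spectral coefficients are scaled using only two scale factors, which illustrates why the number of scale factors per audio frame is typically smaller than the number of spectral coefficients.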
In order to provide the aliasing-cancellation synthesis signal 929a, the frequency-domain
path comprises a decoding 926a, which provides decoded aliasing-cancellation coefficients
926b, on the basis of the encoded aliasing-cancellation coefficients 916, and a scaling 926c
of aliasing-cancellation coefficients, which provides scaled aliasing-cancellation
coefficients 926d on the basis of the decoded aliasing-cancellation coefficients 926b. The
frequency-domain path also comprises an inverse discrete-cosine-transform of type IV
927, which is configured to receive the scaled aliasing-cancellation coefficients 926d, and
to provide, on the basis thereof, an aliasing-cancellation stimulus signal 927a, which is
input into a synthesis filtering 927b. The synthesis filtering 927b is configured to perform a
synthesis filtering operation on the basis of the aliasing-cancellation stimulus signal 927a
and in dependence on synthesis filtering coefficients 927c, which are provided by a
synthesis filter computation 927d, to obtain, as a result of the synthesis filtering, the
aliasing-cancellation synthesis signal 929a. The synthesis filter computation 927d provides the
synthesis filter coefficients 927c in dependence on the linear-prediction-domain
parameters, which may be derived, for example, from linear-prediction-domain parameters
provided in the bitstream for a frame encoded in the TCX-LPD mode, or for a frame
encoded in the ACELP mode (or may be equal to such linear-prediction-domain
parameters).

Accordingly, the synthesis filtering 927b is capable of providing the aliasing-cancellation
synthesis signal 929a, which may be equivalent to the aliasing-cancellation synthesis signal
522 shown in Fig. 5, or to the aliasing-cancellation synthesis signal 542 shown in Fig. 5.
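The decoder-side chain of inverse DCT-IV 927, synthesis filtering 927b and combining 925 can be sketched as follows. This is a simplified illustration under stated assumptions: the synthesis filter is taken to be an all-pole filter 1/A(z) with zero initial state, the DCT-IV is orthonormal (and therefore its own inverse), and the function names, the start position and the numeric values are hypothetical.

```python
import math

def dct_iv(x):
    """Orthonormal DCT-IV; with this scaling the transform is its own inverse."""
    size = len(x)
    scale = math.sqrt(2.0 / size)
    return [
        scale * sum(x[n] * math.cos(math.pi / size * (n + 0.5) * (k + 0.5))
                    for n in range(size))
        for k in range(size)
    ]

def synthesis_filter(stimulus, lpc):
    """All-pole filter 1/A(z), A(z) = 1 + sum_k lpc[k-1] * z^-k, zero initial state."""
    out = []
    for n, x in enumerate(stimulus):
        acc = x
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out

def add_fac(time_signal, fac_coeffs, lpc, start):
    """Add the aliasing-cancellation synthesis signal to the time-domain signal."""
    stimulus = dct_iv(fac_coeffs)                 # corresponds to 927a
    synthesis = synthesis_filter(stimulus, lpc)   # corresponds to 929a
    result = list(time_signal)
    for i, value in enumerate(synthesis):
        result[start + i] += value
    return result

lpc = [-0.5]                                      # stands in for coefficients 927c
fac_coeffs = dct_iv([1.0, -0.5, 0.0, 0.0])        # stands in for coefficients 926d
corrected = add_fac([0.0] * 6, fac_coeffs, lpc, start=1)
```

In this example the stimulus is the output of an analysis filter A(z) driven by a unit impulse, so the synthesis filter 1/A(z) reconstructs that impulse, which is then added at the hypothetical transition position.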
7.2 TCX-LPD Path
In the following, the TCX-LPD path of the audio signal decoder 900 will briefly be
discussed. Further details will be provided below.
The TCX-LPD path 930 comprises a main signal synthesis 940 which is configured to
provide a time-domain representation 940a of the audio content of an audio frame or audio
sub-frame on the basis of the encoded spectral coefficients 932 and the encoded linear-
prediction-domain parameters 934. The TCX-LPD branch 930 also comprises an aliasing-
cancellation processing which will be described below.
The main signal synthesis 940 comprises an arithmetic decoding 941 of spectral
coefficients, wherein the decoded spectral coefficients 941a are obtained on the basis of the
encoded spectral coefficients 932. The main signal synthesis 940 also comprises an inverse
quantization 942, which is configured to provide inversely quantized spectral coefficients
942a on the basis of the decoded spectral coefficients 941a. An optional noise filling 943
may be applied to the inversely quantized spectral coefficients 942a to obtain noise-filled
spectral coefficients. The inversely quantized and noise-filled spectral coefficients 943a
may also be designated with r[i]. The inversely quantized and noise-filled spectral
coefficients 943a, r[i] may be processed by a spectrum de-shaping 944, to obtain spectrum
de-shaped spectral coefficients 944a, which are also sometimes designated with r[i]. A
scaling 945 may be configured as a frequency-domain noise shaping 945. In the
frequency-domain noise-shaping 945, a spectrally shaped set of spectral coefficients 945a is
obtained, which are also designated with rr[i]. In the frequency-domain noise-shaping 945,
contributions of the spectrally de-shaped spectral coefficients 944a onto the spectrally
shaped spectral coefficients 945a are determined by frequency-domain noise-shaping
parameters 945b, which are provided by a frequency-domain noise-shaping parameter
provision which will be discussed in the following. By means of the frequency-domain
noise-shaping 945, spectral coefficients of the spectrally de-shaped set of spectral
coefficients 944a are given a comparatively large weight, if a frequency-domain response
of a linear-prediction filter described by the linear-prediction-domain parameters 934 takes
a comparatively small value for the frequency associated with the respective spectral
coefficient (out of the set 944a of spectral coefficients) under consideration.
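The weighting rule of the frequency-domain noise-shaping 945 can be sketched as follows. This is a strongly simplified illustration and not the codec's actual parameter provision: it assumes that the linear-prediction filter is A(z) = 1 + sum_k a_k z^-k, that its response is evaluated at the bin-centre frequencies of the transform, and that each weight is simply the reciprocal of the response magnitude. The function names and the filter coefficient are hypothetical.

```python
import cmath
import math

def lpc_response_magnitude(lpc, num_bins):
    """|A(e^{jw})| at the bin-centre frequencies w_k = pi * (k + 0.5) / num_bins."""
    magnitudes = []
    for k in range(num_bins):
        w = math.pi * (k + 0.5) / num_bins
        response = 1.0 + sum(a * cmath.exp(-1j * w * m)
                             for m, a in enumerate(lpc, start=1))
        magnitudes.append(abs(response))
    return magnitudes

def noise_shape(coeffs, lpc):
    """Give a large weight where the filter response is small, and vice versa."""
    magnitudes = lpc_response_magnitude(lpc, len(coeffs))
    return [c / m for c, m in zip(coeffs, magnitudes)]

flat_spectrum = [1.0] * 8     # stands in for de-shaped coefficients 944a
lpc = [-0.9]                  # hypothetical first-order filter, small response at low frequencies
shaped = noise_shape(flat_spectrum, lpc)   # stands in for coefficients 945a
```

For this first-order example the filter response is smallest at low frequencies, so the low-frequency coefficients of the flat input receive the largest weights.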

Documents

Orders

Section	Controller	Decision Date

Application Documents

#	Name	Date
1	923-kolnp-2012-(19-04-2012)-SPECIFICATION.pdf	2012-04-19
1	923-KOLNP-2012-FORM-27 [05-08-2024(online)].pdf	2024-08-05
1	923-KOLNP-2012-Response to office action [11-02-2025(online)].pdf	2025-02-11
2	923-kolnp-2012-(19-04-2012)-PCT SEARCH REPORT & OTHERS.pdf	2012-04-19
2	923-KOLNP-2012-FORM-27 [05-08-2024(online)].pdf	2024-08-05
2	923-KOLNP-2012-RELEVANT DOCUMENTS [26-09-2023(online)].pdf	2023-09-26
3	923-kolnp-2012-(19-04-2012)-INTERNATIONAL PUBLICATION.pdf	2012-04-19
3	923-KOLNP-2012-RELEVANT DOCUMENTS [07-09-2023(online)].pdf	2023-09-07
3	923-KOLNP-2012-RELEVANT DOCUMENTS [26-09-2023(online)].pdf	2023-09-26
4	923-KOLNP-2012-RELEVANT DOCUMENTS [29-08-2023(online)].pdf	2023-08-29
4	923-KOLNP-2012-RELEVANT DOCUMENTS [07-09-2023(online)].pdf	2023-09-07
4	923-kolnp-2012-(19-04-2012)-FORM-5.pdf	2012-04-19
5	923-KOLNP-2012-RELEVANT DOCUMENTS [29-08-2023(online)].pdf	2023-08-29
5	923-KOLNP-2012-PROOF OF ALTERATION [24-05-2023(online)].pdf	2023-05-24
5	923-kolnp-2012-(19-04-2012)-FORM-3.pdf	2012-04-19
6	923-KOLNP-2012-PROOF OF ALTERATION [24-05-2023(online)].pdf	2023-05-24
6	923-KOLNP-2012-IntimationOfGrant17-06-2021.pdf	2021-06-17
6	923-kolnp-2012-(19-04-2012)-FORM-2.pdf	2012-04-19
7	923-KOLNP-2012-PatentCertificate17-06-2021.pdf	2021-06-17
7	923-KOLNP-2012-IntimationOfGrant17-06-2021.pdf	2021-06-17
7	923-kolnp-2012-(19-04-2012)-FORM-1.pdf	2012-04-19
8	923-kolnp-2012-(19-04-2012)-DRAWINGS.pdf	2012-04-19
8	923-KOLNP-2012-Information under section 8(2) [16-06-2021(online)].pdf	2021-06-16
8	923-KOLNP-2012-PatentCertificate17-06-2021.pdf	2021-06-17
9	923-kolnp-2012-(19-04-2012)-DESCRIPTION (COMPLETE).pdf	2012-04-19
9	923-KOLNP-2012-Further evidence [15-06-2021(online)].pdf	2021-06-15
9	923-KOLNP-2012-Information under section 8(2) [16-06-2021(online)].pdf	2021-06-16
10	923-kolnp-2012-(19-04-2012)-CORRESPONDENCE.pdf	2012-04-19
10	923-KOLNP-2012-Further evidence [15-06-2021(online)].pdf	2021-06-15
10	923-KOLNP-2012-Information under section 8(2) [15-01-2021(online)].pdf	2021-01-15
11	923-kolnp-2012-(19-04-2012)-CLAIMS.pdf	2012-04-19
11	923-KOLNP-2012-Information under section 8(2) [11-12-2020(online)]-1.pdf	2020-12-11
11	923-KOLNP-2012-Information under section 8(2) [15-01-2021(online)].pdf	2021-01-15
12	923-kolnp-2012-(19-04-2012)-ABSTRACT.pdf	2012-04-19
12	923-KOLNP-2012-Information under section 8(2) [11-12-2020(online)]-1.pdf	2020-12-11
12	923-KOLNP-2012-Information under section 8(2) [11-12-2020(online)].pdf	2020-12-11
13	923-KOLNP-2012-Written submissions and relevant documents [01-09-2020(online)].pdf	2020-09-01
13	923-KOLNP-2012-Information under section 8(2) [11-12-2020(online)].pdf	2020-12-11
13	923-KOLNP-2012-FORM-18.pdf	2012-05-24
14	923-KOLNP-2012-(03-08-2012)-PA.pdf	2012-08-03
14	923-KOLNP-2012-Written submissions and relevant documents [01-09-2020(online)].pdf	2020-09-01
14	923-KOLNP-2012-Written submissions and relevant documents [31-08-2020(online)].pdf	2020-08-31
15	923-KOLNP-2012-(03-08-2012)-CORRESPONDENCE.pdf	2012-08-03
15	923-KOLNP-2012-Correspondence to notify the Controller [10-08-2020(online)].pdf	2020-08-10
15	923-KOLNP-2012-Written submissions and relevant documents [31-08-2020(online)].pdf	2020-08-31
16	923-KOLNP-2012-(03-08-2012)-ASSIGNMENT.pdf	2012-08-03
16	923-KOLNP-2012-Correspondence to notify the Controller [10-08-2020(online)].pdf	2020-08-10
16	923-KOLNP-2012-US(14)-HearingNotice-(HearingDate-18-08-2020).pdf	2020-07-27
17	923-KOLNP-2012-(25-10-2012)-CORRESPONDENCE.pdf	2012-10-25
17	923-KOLNP-2012-Information under section 8(2) [30-06-2020(online)].pdf	2020-06-30
17	923-KOLNP-2012-US(14)-HearingNotice-(HearingDate-18-08-2020).pdf	2020-07-27
18	923-KOLNP-2012-(25-10-2012)-ANNEXURE TO FORM 3.pdf	2012-10-25
18	923-KOLNP-2012-Information under section 8(2) [25-06-2020(online)].pdf	2020-06-25
18	923-KOLNP-2012-Information under section 8(2) [30-06-2020(online)].pdf	2020-06-30
19	923-KOLNP-2012-(14-11-2012)-FORM-5.pdf	2012-11-14
19	923-KOLNP-2012-Information under section 8(2) (MANDATORY) [19-07-2019(online)].pdf	2019-07-19
19	923-KOLNP-2012-Information under section 8(2) [25-06-2020(online)].pdf	2020-06-25
20	923-KOLNP-2012-(14-11-2012)-FORM-13.pdf	2012-11-14
20	923-KOLNP-2012-CLAIMS [04-06-2018(online)].pdf	2018-06-04
20	923-KOLNP-2012-Information under section 8(2) (MANDATORY) [19-07-2019(online)].pdf	2019-07-19
21	923-KOLNP-2012-COMPLETE SPECIFICATION [04-06-2018(online)].pdf	2018-06-04
21	923-KOLNP-2012-CLAIMS [04-06-2018(online)].pdf	2018-06-04
21	923-KOLNP-2012-(14-11-2012)-FORM-1.pdf	2012-11-14
22	923-KOLNP-2012-(14-11-2012)-CORRESPONDENCE.pdf	2012-11-14
22	923-KOLNP-2012-COMPLETE SPECIFICATION [04-06-2018(online)].pdf	2018-06-04
22	923-KOLNP-2012-CORRESPONDENCE [04-06-2018(online)].pdf	2018-06-04
23	923-KOLNP-2012-CORRESPONDENCE [04-06-2018(online)].pdf	2018-06-04
23	923-KOLNP-2012-FER_SER_REPLY [04-06-2018(online)].pdf	2018-06-04
23	Other Patent Document [21-07-2016(online)].pdf	2016-07-21
24	Other Patent Document [14-09-2016(online)].pdf	2016-09-14
24	923-KOLNP-2012-FORM-26 [04-06-2018(online)].pdf	2018-06-04
24	923-KOLNP-2012-FER_SER_REPLY [04-06-2018(online)].pdf	2018-06-04
25	923-KOLNP-2012-FORM-26 [04-06-2018(online)].pdf	2018-06-04
25	923-KOLNP-2012-OTHERS [04-06-2018(online)].pdf	2018-06-04
25	Other Patent Document [25-01-2017(online)].pdf	2017-01-25
26	923-KOLNP-2012-Information under section 8(2) (MANDATORY) [18-07-2017(online)].pdf	2017-07-18
27	923-KOLNP-2012-Information under section 8(2) (MANDATORY) [11-08-2017(online)].pdf	2017-08-11
28	923-KOLNP-2012-FER.pdf	2017-12-06
29	923-KOLNP-2012-Information under section 8(2) (MANDATORY) [21-03-2018(online)].pdf	2018-03-21
30	923-KOLNP-2012-OTHERS [04-06-2018(online)].pdf	2018-06-04
31	923-KOLNP-2012-FORM-26 [04-06-2018(online)].pdf	2018-06-04
32	923-KOLNP-2012-FER_SER_REPLY [04-06-2018(online)].pdf	2018-06-04
33	923-KOLNP-2012-CORRESPONDENCE [04-06-2018(online)].pdf	2018-06-04
34	923-KOLNP-2012-COMPLETE SPECIFICATION [04-06-2018(online)].pdf	2018-06-04
35	923-KOLNP-2012-CLAIMS [04-06-2018(online)].pdf	2018-06-04
36	923-KOLNP-2012-Information under section 8(2) (MANDATORY) [19-07-2019(online)].pdf	2019-07-19
37	923-KOLNP-2012-Information under section 8(2) [25-06-2020(online)].pdf	2020-06-25
38	923-KOLNP-2012-Information under section 8(2) [30-06-2020(online)].pdf	2020-06-30
39	923-KOLNP-2012-US(14)-HearingNotice-(HearingDate-18-08-2020).pdf	2020-07-27
40	923-KOLNP-2012-Correspondence to notify the Controller [10-08-2020(online)].pdf	2020-08-10
41	923-KOLNP-2012-Written submissions and relevant documents [31-08-2020(online)].pdf	2020-08-31
42	923-KOLNP-2012-Written submissions and relevant documents [01-09-2020(online)].pdf	2020-09-01
43	923-KOLNP-2012-Information under section 8(2) [11-12-2020(online)].pdf	2020-12-11
44	923-KOLNP-2012-Information under section 8(2) [11-12-2020(online)]-1.pdf	2020-12-11
45	923-KOLNP-2012-Information under section 8(2) [15-01-2021(online)].pdf	2021-01-15
46	923-KOLNP-2012-Further evidence [15-06-2021(online)].pdf	2021-06-15
47	923-KOLNP-2012-Information under section 8(2) [16-06-2021(online)].pdf	2021-06-16
48	923-KOLNP-2012-PatentCertificate17-06-2021.pdf	2021-06-17
49	923-KOLNP-2012-IntimationOfGrant17-06-2021.pdf	2021-06-17
50	923-KOLNP-2012-PROOF OF ALTERATION [24-05-2023(online)].pdf	2023-05-24
51	923-KOLNP-2012-RELEVANT DOCUMENTS [29-08-2023(online)].pdf	2023-08-29
52	923-KOLNP-2012-RELEVANT DOCUMENTS [07-09-2023(online)].pdf	2023-09-07
53	923-KOLNP-2012-RELEVANT DOCUMENTS [26-09-2023(online)].pdf	2023-09-26
54	923-KOLNP-2012-FORM-27 [05-08-2024(online)].pdf	2024-08-05
55	923-KOLNP-2012-Response to office action [11-02-2025(online)].pdf	2025-02-11
56	923-KOLNP-2012-PROOF OF ALTERATION [30-07-2025(online)].pdf	2025-07-30
57	923-KOLNP-2012-POWER OF AUTHORITY [30-07-2025(online)].pdf	2025-07-30
58	923-KOLNP-2012-FORM-16 [30-07-2025(online)].pdf	2025-07-30
59	923-KOLNP-2012-ASSIGNMENT WITH VERIFIED COPY [30-07-2025(online)].pdf	2025-07-30

Search Strategy

1	search(79)_27-09-2017.pdf