Audio Encoder And Decoder Having A Flexible Configuration

Audio Encoder And Decoder Having A Flexible Configuration Functionality

Abstract: An audio decoder for decoding an encoded audio signal (10), the encoded audio signal (10) comprising a first channel element (52a) and a second channel element (52b) in a payload section (52) of a data stream and first decoder configuration data (50c) for the first channel element (52a) and second decoder configuration data (50d) for the second channel element (52b) in a configuration section (50) of the data stream, comprises: a data stream reader (12) for reading the configuration data for each channel element in the configuration section and for reading the payload data for each channel element in the payload section; a configurable decoder (16) for decoding the plurality of channel elements; and a configuration controller (14) for configuring the configurable decoder (16) so that the configurable decoder (16) is configured in accordance with the first decoder configuration data when decoding the first channel element and in accordance with the second decoder configuration data when decoding the second channel element.

Patent Information

Application #

Filing Date

23 September 2013

Publication Number

01/2014

Publication Type

INA

Invention Field

COMMUNICATION

Status

Parent Application

Patent Number

Legal Status

Grant Date

2020-05-18

Renewal Date

Applicants

FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.

Hansastraße 27c, 80686 München, GERMANY

DOLBY INTERNATIONAL AB

Apollo Building, 3E Herikerbergweg 1-35, 1101 CN Amsterdam Zuid-Oost, NETHERLANDS

KONINKLIJKE PHILIPS N.V.

High Tech Campus 5, 5656 AE Eindhoven, NETHERLANDS

Inventors

1. NEUENDORF, Max

Theatergasse 17, 90402 Nürnberg, GERMANY

2. MULTRUS, Markus

Etzlaubweg 7, 90469 Nürnberg, GERMANY

3. DÖHLA, Stefan

Hartmannstr. 47a, 91052 Erlangen, GERMANY

4. PURNHAGEN, Heiko

Gjuteribacken 17, 17265 Sundbyberg, SWEDEN

5. DE BONT, Frans

De Hasselt 7, 5561 CC Riethoven, NETHERLANDS

Specification

Audio Encoder and Decoder having a Flexible Configuration Functionality
Specification
The present invention relates to audio coding and particularly to high quality and low bitrate
coding such as known from the so-called USAC coding (USAC = Unified Speech and
Audio Coding).
The USAC coder is defined in ISO/IEC CD 23003-3. This standard, named "Information
technology - MPEG audio technologies - Part 3: Unified speech and audio coding", describes
in detail the functional blocks of a reference model of a call for proposals on unified
speech and audio coding.
Figs. 10a and 10b illustrate encoder and decoder block diagrams. The block diagrams of
the USAC encoder and decoder reflect the structure of MPEG-D USAC coding. The general
structure can be described like this: First there is a common pre/post-processing stage
consisting of an MPEG Surround (MPEGS) functional unit to handle stereo or multi-channel
processing and an enhanced SBR (eSBR) unit which handles the parametric representation
of the higher audio frequencies in the input signal. Then there are two branches, one
consisting of a modified Advanced Audio Coding (AAC) tool path and the other consisting of
a linear prediction coding (LP or LPC domain) based path, which in turn features either a
frequency domain representation or a time domain representation of the LPC residual. All
transmitted spectra for both AAC and LPC are represented in the MDCT domain following
quantization and arithmetic coding. The time domain representation uses an ACELP
excitation coding scheme.
The basic structure of the MPEG-D USAC is shown in Figure 10a and Figure 10b. The
data flow in this diagram is from left to right, top to bottom. The functions of the decoder
are to find the description of the quantized audio spectra or time domain representation in
the bitstream payload and decode the quantized values and other reconstruction
information.
In case of transmitted spectral information the decoder shall reconstruct the quantized
spectra, process the reconstructed spectra through whatever tools are active in the bitstream
payload in order to arrive at the actual signal spectra as described by the input bitstream
payload, and finally convert the frequency domain spectra to the time domain. Following
the initial reconstruction and scaling of the spectrum, there are optional
tools that modify one or more of the spectra in order to provide more efficient coding.
In case of transmitted time domain signal representation, the decoder shall reconstruct the
quantized time signal, process the reconstructed time signal through whatever tools are
active in the bitstream payload in order to arrive at the actual time domain signal as
described by the input bitstream payload.
For each of the optional tools that operate on the signal data, the option to "pass through" is
retained, and in all cases where the processing is omitted, the spectra or time samples at its
input are passed directly through the tool without modification.
In places where the bitstream changes its signal representation from time domain to
frequency domain representation or from LP domain to non-LP domain or vice versa, the
decoder shall facilitate the transition from one domain to the other by means of an
appropriate transition overlap-add windowing.
eSBR and MPEGS processing is applied in the same manner to both coding paths after
transition handling.
The input to the bitstream payload demultiplexer tool is the MPEG-D USAC bitstream
payload. The demultiplexer separates the bitstream payload into the parts for each tool, and
provides each of the tools with the bitstream payload information related to that tool.
The outputs from the bitstream payload demultiplexer tool are:
• Depending on the core coding type in the current frame, either:
  o the quantized and noiselessly coded spectra, represented by
    o scale factor information
    o arithmetically coded spectral lines
  o or: linear prediction (LP) parameters together with an excitation signal, represented
    by either:
    o quantized and arithmetically coded spectral lines (transform coded excitation,
      TCX) or
    o ACELP coded time domain excitation
• The spectral noise filling information (optional)
• The M/S decision information (optional)
• The temporal noise shaping (TNS) information (optional)
• The filterbank control information
• The time unwarping (TW) control information (optional)
• The enhanced spectral bandwidth replication (eSBR) control information (optional)
• The MPEG Surround (MPEGS) control information
The scale factor noiseless decoding tool takes information from the bitstream payload
demultiplexer, parses that information, and decodes the Huffman and DPCM coded scale
factors.
The input to the scale factor noiseless decoding tool is:
• The scale factor information for the noiselessly coded spectra
The output of the scale factor noiseless decoding tool is:
• The decoded integer representation of the scale factors
The spectral noiseless decoding tool takes information from the bitstream payload
demultiplexer, parses that information, decodes the arithmetically coded data, and reconstructs
the quantized spectra. The input to this noiseless decoding tool is:
• The noiselessly coded spectra
The output of this noiseless decoding tool is:
• The quantized values of the spectra
The inverse quantizer tool takes the quantized values for the spectra, and converts the
integer values to the non-scaled, reconstructed spectra. This quantizer is a companding quantizer,
whose companding factor depends on the chosen core coding mode.
The input to the Inverse Quantizer tool is:
• The quantized values for the spectra
The output of the inverse quantizer tool is:
• The un-scaled, inversely quantized spectra
The noise filling tool is used to fill spectral gaps in the decoded spectra, which occur when
spectral values are quantized to zero, e.g. due to a strong restriction on bit demand in the
encoder. The use of the noise filling tool is optional.
The inputs to the noise filling tool are:
• The un-scaled, inversely quantized spectra
• Noise filling parameters
• The decoded integer representation of the scale factors
The outputs of the noise filling tool are:
• The un-scaled, inversely quantized spectral values for spectral lines which were
previously quantized to zero.
• Modified integer representation of the scale factors
The rescaling tool converts the integer representation of the scale factors to the actual
values, and multiplies the un-scaled inversely quantized spectra by the relevant scale factors.
The inputs to the rescaling tool are:
• The decoded integer representation of the scale factors
• The un-scaled, inversely quantized spectra
The output from the rescaling tool is:
• The scaled, inversely quantized spectra
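The inverse quantization and rescaling steps described above can be sketched as follows.
This is a minimal illustration only. The text states merely that the companding factor
depends on the core coding mode; the exponent 4/3, the scale factor offset of 100, the
gain law 2^(0.25 * (sf - offset)) and all function names are assumptions borrowed from
AAC-style coding and are not taken from this document.

```python
import math

SF_OFFSET = 100  # assumed AAC-style scale factor offset


def inverse_quantize(q, exponent=4.0 / 3.0):
    """Expand one quantized integer to an un-scaled spectral value (power-law companding)."""
    return math.copysign(abs(q) ** exponent, q)


def rescale(unscaled, scale_factor):
    """Multiply un-scaled spectral values by the gain derived from an integer scale factor."""
    gain = 2.0 ** (0.25 * (scale_factor - SF_OFFSET))
    return [x * gain for x in unscaled]


quantized = [0, 1, -8, 27]
unscaled = [inverse_quantize(q) for q in quantized]
scaled = rescale(unscaled, scale_factor=104)  # 104 - 100 = 4, so the gain is 2.0
```

In an actual decoder one scale factor applies per scale factor band rather than to the
whole spectrum; the single value here keeps the sketch short.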
For an overview of the M/S tool, please refer to ISO/IEC 14496-3:2009, 4.1.1.2.
For an overview of the temporal noise shaping (TNS) tool, please refer to ISO/IEC
14496-3:2009, 4.1.1.2.
The filterbank / block switching tool applies the inverse of the frequency mapping that was
carried out in the encoder. An inverse modified discrete cosine transform (IMDCT) is used
for the filterbank tool. The IMDCT can be configured to support 120, 128, 240, 256, 480,
512, 960 or 1024 spectral coefficients.
The inputs to the filterbank tool are:
• The (inversely quantized) spectra
• The filterbank control information
The output(s) from the filterbank tool is (are):
• The time domain reconstructed audio signal(s).
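As an illustration of the inverse transform used by the filterbank tool, the following
sketch evaluates an IMDCT directly from its defining sum. The normalization 2/N and the
phase offset n0 = (N/2 + 1)/2 follow the form commonly given for AAC and are assumptions
here; windowing, block switching and overlap-add are omitted, and a practical decoder
would use a fast algorithm instead of this O(N^2) loop.

```python
import math


def imdct(spectrum):
    """Naive IMDCT: N/2 spectral coefficients in, N time-aliased samples out."""
    half = len(spectrum)
    n_total = 2 * half
    n0 = (half + 1) / 2.0  # assumed phase offset
    out = []
    for n in range(n_total):
        acc = 0.0
        for k, coeff in enumerate(spectrum):
            acc += coeff * math.cos(2.0 * math.pi / n_total * (n + n0) * (k + 0.5))
        out.append(2.0 / n_total * acc)
    return out


# 128 coefficients (one of the configurable sizes listed above) give 256 samples.
samples = imdct([1.0] + [0.0] * 127)
```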
The time-warped filterbank / block switching tool replaces the normal filterbank / block
switching tool when the time warping mode is enabled. The filterbank is the same
(IMDCT) as for the normal filterbank; additionally, the windowed time domain samples are
mapped from the warped time domain to the linear time domain by time-varying
resampling.
The inputs to the time-warped filterbank tools are:
• The inversely quantized spectra
• The filterbank control information
• The time-warping control information
The output(s) from the filterbank tool is (are):
• The linear time domain reconstructed audio signal(s).
The enhanced SBR (eSBR) tool regenerates the highband of the audio signal. It is based on
replication of the sequences of harmonics, truncated during encoding. It adjusts the spectral
envelope of the generated highband and applies inverse filtering, and adds noise and sinusoidal
components in order to recreate the spectral characteristics of the original signal.
The inputs to the eSBR tool are:
• The quantized envelope data
• Misc. control data
• a time domain signal from the frequency domain core decoder or the ACELP/TCX
core decoder
The output of the eSBR tool is either:
• a time domain signal or
• a QMF-domain representation of a signal, e.g. if the MPEG Surround tool is used.
The MPEG Surround (MPEGS) tool produces multiple signals from one or more input
signals by applying a sophisticated upmix procedure to the input signal(s) controlled by
appropriate spatial parameters. In the USAC context MPEGS is used for coding a
multi-channel signal, by transmitting parametric side information alongside a transmitted downmixed
signal.
The input to the MPEGS tool is:
• a downmixed time domain signal or
• a QMF-domain representation of a downmixed signal from the eSBR tool
The output of the MPEGS tool is:
• a multi-channel time domain signal
The Signal Classifier tool analyses the original input signal and generates from it control
information which triggers the selection of the different coding modes. The analysis of the
input signal is implementation dependent and will try to choose the optimal core coding
mode for a given input signal frame. The output of the signal classifier can (optionally)
also be used to influence the behavior of other tools, for example MPEG Surround,
enhanced SBR, time-warped filterbank and others.
The inputs to the Signal Classifier tool are:
• the original unmodified input signal
• additional implementation dependent parameters
The output of the Signal Classifier tool is:
• a control signal to control the selection of the core codec (non-LP filtered
frequency domain coding, LP filtered frequency domain or LP filtered time domain
coding)
The ACELP tool provides a way to efficiently represent a time domain excitation signal by
combining a long term predictor (adaptive codeword) with a pulse-like sequence
(innovation codeword). The reconstructed excitation is sent through an LP synthesis filter to form
a time domain signal.
The inputs to the ACELP tool are:
• adaptive and innovation codebook indices
• adaptive and innovation codes gain values
• other control data
• inversely quantized and interpolated LPC filter coefficients
The output of the ACELP tool is:
• The time domain reconstructed audio signal
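The ACELP reconstruction described above, i.e. a gain-weighted sum of adaptive and
innovation codewords passed through an LP synthesis filter, can be sketched as follows.
The function names, the sign convention A(z) = 1 + sum(a_i * z^-i) and the example values
are assumptions made for illustration; codebook lookup, gain decoding and LPC
interpolation are outside the scope of this sketch.

```python
def build_excitation(adaptive, innovation, gain_pitch, gain_code):
    """Combine the adaptive (long term) and innovation (pulse-like) codewords."""
    return [gain_pitch * a + gain_code * c for a, c in zip(adaptive, innovation)]


def lp_synthesis(excitation, lpc, history=None):
    """All-pole filter 1/A(z) with A(z) = 1 + sum(lpc[i] * z^-(i+1))."""
    order = len(lpc)
    memory = list(history) if history else [0.0] * order  # most recent output first
    out = []
    for e in excitation:
        s = e - sum(a * m for a, m in zip(lpc, memory))
        out.append(s)
        memory = ([s] + memory)[:order]
    return out


excitation = build_excitation([1.0, 2.0], [0.5, 0.5], gain_pitch=0.5, gain_code=2.0)
impulse_response = lp_synthesis([1.0, 0.0, 0.0], lpc=[-0.5])
```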
The MDCT based TCX decoding tool is used to turn the weighted LP residual
representation from an MDCT-domain back into a time domain signal and outputs a time domain
signal including weighted LP synthesis filtering. The IMDCT can be configured to support
256, 512, or 1024 spectral coefficients.
The inputs to the TCX tool are:
• The (inversely quantized) MDCT spectra
• inversely quantized and interpolated LPC filter coefficients
The output of the TCX tool is:
• The time domain reconstructed audio signal
The technology disclosed in ISO/IEC CD 23003-3, which is incorporated herein by reference,
allows the definition of channel elements which are, for example, single channel
elements only containing payload for a single channel or channel pair elements comprising
payload for two channels or LFE (Low-Frequency Enhancement) channel elements
comprising payload for an LFE channel.
A five-channel multi-channel audio signal can, for example, be represented by a single
channel element comprising the center channel, a first channel pair element comprising the
left channel and the right channel, and a second channel pair element comprising the left
surround channel (Ls) and the right surround channel (Rs). These different channel
elements, which together represent the multi-channel audio signal, are fed into a decoder and
are processed using the same decoder configuration. In accordance with the prior art, the
decoder configuration sent in the USAC specific config element was applied by the
decoder to all channel elements. Hence, configuration settings could not be selected
in an optimum way for an individual channel element, but had to be set for all channel
elements simultaneously. On the other hand, however, it has been found that the channel
elements describing even a straightforward five-channel multi-channel signal are very
different from each other. The center channel, being the single channel element, has
significantly different characteristics from the channel pair elements describing the
left/right channels and the left surround/right surround channels. Additionally, the
characteristics of the two channel pair elements also differ significantly, because the
surround channels comprise information which is very different from the information
comprised in the left and right channels.
Selecting configuration data for all channel elements together made it necessary to
compromise, so that a configuration had to be selected which is non-optimum for all
channel elements, but which represents a compromise between all channel elements.
Alternatively, the configuration was selected to be optimum for one channel element, but
this inevitably led to the situation that the configuration was non-optimum for the other
channel elements. Either way, the result is an increased bitrate for the channel elements
having the non-optimum configuration, or alternatively or additionally a reduced
audio quality for these channel elements.
It is therefore the object of the present invention to provide an improved audio
coding/decoding concept.
This object is achieved by an audio decoder in accordance with claim 1, a method of audio
decoding in accordance with claim 14, an audio encoder in accordance with claim 15, a
method of audio encoding in accordance with claim 16, a computer program in accordance
with claim 17 and an encoded audio signal in accordance with claim 18.
The present invention is based on the finding that an improved audio encoding/decoding
concept is obtained when the decoder configuration data for each individual channel
element is transmitted. In accordance with the present invention, the encoded audio signal
therefore comprises a first channel element and a second channel element in a payload section
of a data stream and first decoder configuration data for the first channel element and
second decoder configuration data for the second channel element in a configuration
section of the data stream. Hence, the payload section of the data stream, where the payload
data for the channel elements is located, is separated from the configuration section of the
data stream, where the configuration data for the channel elements is located. It is preferred
that the configuration section is a contiguous portion of a serial bitstream, where all bits
belonging to this configuration section or contiguous portion of the bitstream are
configuration data. Preferably, the configuration section is followed by the payload section
of the data stream, where the payload for the channel elements is located. The inventive audio
decoder comprises a data stream reader for reading the configuration data for each channel
element in the configuration section and for reading the payload data for each channel
element in the payload section. Furthermore, the audio decoder comprises a configurable
decoder for decoding the plurality of channel elements and a configuration controller for
configuring the configurable decoder so that the configurable decoder is configured in
accordance with the first decoder configuration data when decoding the first channel element
and in accordance with the second decoder configuration data when decoding the second
channel element.
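The interplay of data stream reader, configuration controller and configurable decoder
described above can be sketched as follows. The class names, the configuration fields and
the in-memory stream layout are hypothetical and chosen only for illustration; the decode
step is a placeholder that reports which configuration was active rather than producing
audio.

```python
from dataclasses import dataclass


@dataclass
class ElementConfig:              # hypothetical per-element decoder configuration
    element_type: str             # "SCE", "CPE" or "LFE"
    noise_filling: bool = False
    time_warping: bool = False


class ConfigurableDecoder:
    def __init__(self):
        self.active = None

    def configure(self, cfg):
        self.active = cfg

    def decode(self, payload):
        # Placeholder: report the configuration used instead of decoding audio.
        return (self.active.element_type, self.active.noise_filling, len(payload))


def decode_stream(stream):
    """stream: {"config": [ElementConfig, ...], "payload": [[bytes, ...], ...]}"""
    configs = stream["config"]                  # reader: configuration section
    decoder = ConfigurableDecoder()
    decoded = []
    for frame in stream["payload"]:             # reader: payload section
        for cfg, element in zip(configs, frame):
            decoder.configure(cfg)              # configuration controller
            decoded.append(decoder.decode(element))
    return decoded


# Center channel, left/right pair and surround pair, each with its own settings.
stream = {
    "config": [
        ElementConfig("SCE", noise_filling=True),
        ElementConfig("CPE"),
        ElementConfig("CPE", time_warping=True),
    ],
    "payload": [[b"c", b"lr", b"lsrs"]],
}
decoded = decode_stream(stream)
```

The point of the sketch is that the configuration is read once and then re-applied per
channel element before each decode call, so no element inherits the settings of another.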
Thus, it is made sure that for each channel element the optimum configuration can be
selected. This makes it possible to account optimally for the different characteristics of
the different channel elements.
An audio encoder in accordance with the present invention is arranged for encoding a
multi-channel audio signal having, for example, at least two, three or preferably more than
three channels. The audio encoder comprises a configuration processor for generating first
configuration data for a first channel element and second configuration data for a second
channel element and a configurable encoder for encoding the multi-channel audio signal to
obtain a first channel element and a second channel element using the first and the second
configuration data, respectively. Furthermore, the audio encoder comprises a data stream
generator for generating a data stream representing the encoded audio signal, the data
stream having a configuration section having the first and the second configuration data
and a payload section comprising the first channel element and the second channel
element.
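A data stream generator of the kind described above might be sketched as follows, writing
a contiguous configuration section followed by the payload section. The byte layout (a
count byte, one configuration byte per channel element, and a one-byte length prefix per
payload element) is a hypothetical simplification and not the USAC bitstream syntax.

```python
def generate_stream(configs, frames):
    """Serialize the configuration section first, then the payload frames.

    configs: one hypothetical configuration byte per channel element.
    frames:  list of frames, each a list of bytes objects (one per channel element).
    """
    out = bytearray([len(configs)])
    out += bytes(configs)                     # contiguous configuration section
    for frame in frames:
        if len(frame) != len(configs):
            raise ValueError("each frame needs one payload per configured element")
        for element in frame:
            out.append(len(element))          # hypothetical 1-byte length prefix
            out += element
    return bytes(out)


encoded = generate_stream([0x01, 0x02], [[b"\xaa", b"\xbb\xcc"]])
```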
Now, the encoder as well as the decoder are in a position to determine individual and
preferably optimum configuration data for each channel element.
This makes sure that the configurable decoder for each channel element is configured in
such a way that for each channel element the optimum with respect to audio quality and
bitrate can be obtained and compromises do not have to be made anymore.
Subsequently, preferred embodiments of the present invention are described with respect to
the accompanying drawings, in which:
Fig. 1 is a block diagram of a decoder;
Fig. 2 is a block diagram of an encoder;
Figs. 3a and 3b represent a table outlining channel configurations for different speaker
setups;
Figs. 4a and 4b identify and graphically illustrate different speaker setups;
Figs. 5a to 5d illustrate different aspects of the encoded audio signal having a
configuration section and the payload section;
Fig. 6a illustrates the syntax of the UsacConfig element;
Fig. 6b illustrates the syntax of the UsacChannelConfig element;
Fig. 6c illustrates the syntax of the UsacDecoderConfig;
Fig. 6d illustrates the syntax of UsacSingleChannelElementConfig;
Fig. 6e illustrates the syntax of UsacChannelPairElementConfig;
Fig. 6f illustrates the syntax of UsacLfeElementConfig;
Fig. 6g illustrates the syntax of UsacCoreConfig;
Fig. 6h illustrates the syntax of SbrConfig;
Fig. 6i illustrates the syntax of SbrDfltHeader;
Fig. 6j illustrates the syntax of Mps212Config;
Fig. 6k illustrates the syntax of UsacExtElementConfig;
Fig. 6l illustrates the syntax of UsacConfigExtension;
Fig. 6m illustrates the syntax of escapedValue;
Fig. 7 illustrates different alternatives for identifying and configuring different
encoder/decoder tools for a channel element individually;
Fig. 8 illustrates a preferred embodiment of a decoder implementation having decoder
instances operating in parallel for generating a 5.1 multi-channel audio signal;
Fig. 9 illustrates a preferred implementation of the decoder of Fig. 1 in flowchart
form;
Fig. 10a illustrates the block diagram of the USAC encoder; and
Fig. 10b illustrates the block diagram of the USAC decoder.
High level information about the contained audio content, like sampling rate and exact
channel configuration, is present in the audio bitstream. This makes the bitstream more
self-contained and makes transport of the configuration and payload easier when embedded in
transport schemes which may have no means to explicitly transmit this information.
The configuration structure contains a combined frame length and SBR sampling rate ratio
index (coreSbrFrameLengthIndex). This guarantees efficient transmission of both values
and makes sure that non-meaningful combinations of frame length and SBR ratio cannot
be signaled. The latter simplifies the implementation of a decoder.
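The benefit of a combined index can be illustrated with a lookup table in which only
meaningful pairs of core frame length and SBR ratio have an entry. The table entries below
are assumptions given for illustration and should be checked against the standard; the
function name is hypothetical.

```python
from fractions import Fraction

# index -> (core coder frame length, SBR sampling rate ratio or None if no SBR)
# Entries are assumed for illustration only.
CORE_SBR_TABLE = {
    0: (768, None),
    1: (1024, None),
    2: (768, Fraction(8, 3)),
    3: (1024, Fraction(2, 1)),
    4: (1024, Fraction(4, 1)),
}


def frame_setup(index):
    """Return (core frame length, SBR ratio, output frame length) for one index."""
    if index not in CORE_SBR_TABLE:
        raise ValueError("reserved coreSbrFrameLengthIndex")
    core_length, ratio = CORE_SBR_TABLE[index]
    output_length = core_length if ratio is None else int(core_length * ratio)
    return core_length, ratio, output_length
```

Because a combination such as 768 samples with a 4:1 ratio has no table entry, it cannot
be signaled at all, which is what spares the decoder from handling it.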
The configuration can be extended by means of a dedicated configuration extension
mechanism. This will prevent bulky and inefficient transmission of configuration
extensions as known from the MPEG-4 AudioSpecificConfig().
The configuration allows free signaling of loudspeaker positions associated with each
transmitted audio channel. Commonly used channel to loudspeaker mappings can be
efficiently signaled by means of a channelConfigurationIndex.
Configuration of each channel element is contained in a separate structure such that each
channel element can be configured independently.
SBR configuration data (the "SBR header") is split into an SbrInfo() and an SbrHeader().
For the SbrHeader() a default version is defined (SbrDfltHeader()), which can be efficiently
referenced in the bitstream. This reduces the bit demand in places where
re-transmission of SBR configuration data is needed.
More commonly applied configuration changes to SBR can be efficiently signaled with the
help of the SbrInfo() syntax element.
The configuration for the parametric bandwidth extension (SBR) and the parametric stereo
coding tools (MPS212, aka. MPEG Surround 2-1-2) is tightly integrated into the USAC
configuration structure. This reflects much better the way that both technologies are
actually employed in the standard.
The syntax features an extension mechanism which allows transmission of existing and
future extensions to the codec.
The extensions may be placed (i.e. interleaved) with the channel elements in any order.
This allows for extensions which need to be read before or after the particular channel
element to which the extension is to be applied.
A default length can be defined for a syntax extension, which makes transmission of
constant length extensions very efficient, because the length of the extension payload does not
need to be transmitted every time.
The common case of signaling a value with the help of an escape mechanism to extend the
range of values if needed was modularized into a dedicated genuine syntax element (escapedValue())
which is flexible enough to cover all desired escape value constellations and
bit field extensions.
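The escape mechanism can be sketched as follows: a short field is read first, and only if
it holds its maximum value is a further, longer field read and added. The three-stage form
with parameters nBits1, nBits2 and nBits3 is an assumption about the escapedValue() syntax
of Fig. 6m, and the BitReader helper is hypothetical.

```python
class BitReader:
    """Minimal MSB-first bit reader over a bytes object (illustrative helper)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def read(self, n_bits):
        value = 0
        for _ in range(n_bits):
            byte = self.data[self.pos // 8]
            value = (value << 1) | ((byte >> (7 - self.pos % 8)) & 1)
            self.pos += 1
        return value


def escaped_value(reader, n_bits1, n_bits2, n_bits3):
    """Read a value whose range is extended in up to two escape stages."""
    value = reader.read(n_bits1)
    if value == (1 << n_bits1) - 1:           # all ones: first escape
        extra = reader.read(n_bits2)
        value += extra
        if extra == (1 << n_bits2) - 1:       # all ones again: second escape
            value += reader.read(n_bits3)
    return value
```

With field widths of 4, 8 and 16 bits, for example, small values cost only 4 bits, while
larger values cost 12 or 28 bits.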
Bitstream Configuration
UsacConfig() (Fig. 6a)
The UsacConfig() was extended to contain information about the contained audio content
as well as everything needed for the complete decoder set-up. The top level information
about the audio (sampling rate, channel configuration, output frame length) is gathered at
the beginning for easy access from higher (application) layers.
channelConfigurationIndex, UsacChannelConfig() (Fig. 6b)
These elements give information about the contained bitstream elements and their mapping
to loudspeakers. The channelConfigurationIndex allows for an easy and convenient way of
signaling one out of a range of predefined mono, stereo or multi-channel configurations
which were considered practically relevant.
For more elaborate configurations which are not covered by the channelConfigurationIndex,
the UsacChannelConfig() allows for a free assignment of elements to loudspeaker
positions out of a list of 32 speaker positions, which cover all currently known speaker positions
in all known speaker set-ups for home or cinema sound reproduction.
This list of speaker positions is a superset of the list featured in the MPEG Surround
standard (see Table 1 and Figure 1 in ISO/IEC 23003-1). Four additional speaker positions
have been added to be able to cover the lately introduced 22.2 speaker set-up (see Figs. 3a,
3b, 4a and 4b).
UsacDecoderConfig() (Fig. 6c)
This element is at the heart of the decoder configuration and as such it contains all further
information required by the decoder to interpret the bitstream.
In particular the structure of the bitstream is defined here by explicitly stating the number
of elements and their order in the bitstream.
A loop over all elements then allows for configuration of all elements of all types (single,
pair, lfe, extension).
UsacConfigExtension() (Fig. 6l)
In order to account for future extensions, the configuration features a powerful mechanism
to extend the configuration for configuration extensions for USAC that do not yet exist.
UsacSingleChannelElementConfig() (Fig. 6d)
This element configuration contains all information needed for configuring the decoder to
decode one single channel. This is essentially the core coder related information and if
SBR is used the SBR related information.
UsacChannelPairElementConfig() (Fig. 6e)
In analogy to the above, this element configuration contains all information needed for
configuring the decoder to decode one channel pair. In addition to the above mentioned core
config and SBR configuration, this includes stereo-specific configurations like the exact
kind of stereo coding applied (with or without MPS212, residual etc.). Note that this element
covers all kinds of stereo coding options available in USAC.
UsacLfeElementConfig() (Fig. 6f)
The LFE element configuration does not contain configuration data as an LFE element has
a static configuration.
UsacExtElementConfig() (Fig. 6k)
This element configuration can be used for configuring any kind of existing or future
extensions to the codec. Each extension element type has its own dedicated ID value. A
length field is included in order to be able to conveniently skip over configuration extensions
unknown to the decoder. The optional definition of a default payload length further
increases the coding efficiency of extension payloads present in the actual bitstream.
Extensions which are already envisioned to be combined with USAC include: MPEG
Surround, SAOC, and some sort of FIL element as known from MPEG-4 AAC.
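The role of the length field can be sketched as follows: a decoder that does not recognize
an extension type still knows how many bytes to skip. The byte-aligned record layout
(type, length, body) and the ID assignment are hypothetical simplifications of the actual
bit-level syntax.

```python
KNOWN_EXT_TYPES = {0: "fill"}   # hypothetical ID assignment


def read_ext_configs(data):
    """Parse records of [type: 1 byte][length: 1 byte][body: length bytes]."""
    pos, parsed = 0, []
    while pos < len(data):
        ext_type, length = data[pos], data[pos + 1]
        body = data[pos + 2 : pos + 2 + length]
        if ext_type in KNOWN_EXT_TYPES:
            parsed.append((KNOWN_EXT_TYPES[ext_type], bytes(body)))
        # Unknown types are skipped using the length field alone.
        pos += 2 + length
    return parsed


# A known extension, an unknown one (type 9) and another known one without body.
configs = read_ext_configs(bytes([0, 1, 0xAA, 9, 3, 1, 2, 3, 0, 0]))
```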
UsacCoreConfig() (Fig. 6g)
This element contains configuration data that has impact on the core coder set-up.
Currently these are switches for the time warping tool and the noise filling tool.
SbrConfig() (Fig. 6h)
In order to reduce the bit overhead produced by the frequent re-transmission of the
sbr_header(), default values for the elements of the sbr_header() that are typically kept
constant are now carried in the configuration element SbrDfltHeader(). Furthermore, static
SBR configuration elements are also carried in SbrConfig(). These static bits include flags
for enabling or disabling particular features of the enhanced SBR, like harmonic transposition or
inter-TES.
SbrDfltHeader() (Fig. 6i)
This carries elements of the sbr_header() that are typically kept constant. Elements
affecting things like amplitude resolution, crossover band, spectrum preflattening are now
carried in SbrInfo(), which allows them to be efficiently changed on the fly.
Mps212Config() (Fig. 6j)
Similar to the above SBR configuration, all set-up parameters for the MPEG Surround 2-1-2
tools are assembled in this configuration. All elements from SpatialSpecificConfig() that
are irrelevant or redundant in this context were removed.
Bitstream Payload
UsacFrame()
This is the outermost wrapper around the USAC bitstream payload and represents a USAC
access unit. It contains a loop over all contained channel elements and extension elements
as signaled in the config part. This makes the bitstream format much more flexible in terms
of what it can contain and is future proof for any future extension.
UsacSingleChannelElement()
This element contains all data to decode a mono stream. The content is split in a core coder
related part and an eSBR related part. The latter is now much more closely connected to
the core, which reflects also much better the order in which the data is needed by the
decoder.
UsacChannelPairElement()
This element covers the data for all possible ways to encode a stereo pair. In particular, all
flavors of unified stereo coding are covered, ranging from legacy M/S based coding to
fully parametric stereo coding with the help of MPEG Surround 2-1-2. stereoConfigIndex
indicates which flavor is actually used. Appropriate eSBR data and MPEG Surround 2-1-2
data is sent in this element.
UsacLfeElement()
The former lfe_channel_element() is renamed only in order to follow a consistent naming
scheme.
UsacExtElement()
The extension element was carefully designed to be maximally flexible but at
the same time maximally efficient even for extensions which have a small payload (or
frequently none at all). The extension payload length is signaled so that decoders unaware
of the extension can skip over it. User-defined extensions can be signaled by means of a
reserved range of extension types. Extensions can be placed freely in the order of elements.
A range of extension elements has already been considered, including a mechanism to write
fill bytes.
UsacCoreCoderData()
This new element summarizes all information affecting the core coders and hence also
contains fd_channel_stream()'s and lpd_channel_stream()'s.
StereoCoreToolInfo()
In order to ease the readability of the syntax, all stereo related information was captured in
this element. It deals with the numerous dependencies of bits in the stereo coding modes.
UsacSbrData()
CRC functionality and legacy description elements of scalable audio coding were removed
from what used to be the sbr_extension_data() element. In order to reduce the overhead
caused by frequent re-transmission of SBR info and header data, the presence of these can
be explicitly signaled.
SbrInfo()
This element carries SBR configuration data that is frequently modified on the fly. This
includes elements controlling things like amplitude resolution, crossover band, spectrum
preflattening, which previously required the transmission of a complete sbr_header() (see
6.3 in [N11660], "Efficiency").
SbrHeader()
In order to maintain the capability of SBR to change values in the sbr_header() on the fly,
it is now possible to carry an SbrHeader() inside the UsacSbrData() in case other values
than those sent in SbrDfltHeader() should be used. The bs_header_extra mechanism was
maintained in order to keep overhead as low as possible for the most common cases.
sbr_data0
Again, remnants of SBR scalable coding were removed because they are not applicable in
the USAC context. Depending on the number of channels the sbr_data() contains one
sbr_single_channel_elementO or one sbr_channel_pair_element().
usacSampIingFrequencyIndex
This table is a superset of the table used in MPEG-4 to signal the sampling frequency of
the audio codec. The table was further extended to also cover the sampling rates that are
currently used in the USAC operating modes. Some multiples of the sampling frequencies
were also added.
channelConfigurationIndex
This table is a superset of the table used in MPEG-4 to signal the channelConfiguration. It
was further extended to allow signaling of commonly used and envisioned future loudspeaker
setups. The index into this table is signaled with 5 bits to allow for future extensions.
usacElementType
Only 4 element types exist, one for each of the four basic bitstream elements:
UsacSingleChannelElement(), UsacChannelPairElement(), UsacLfeElement(), UsacExtElement().
These elements provide the necessary top level structure while maintaining all
needed flexibility.
usacExtElementType
Inside of UsacExtElement(), this element allows a plethora of extensions to be signaled. In order
to be future proof, the bit field was chosen large enough to allow for all conceivable extensions.
Out of the currently known extensions, a few are already proposed to be considered:
fill element, MPEG Surround, and SAOC.
usacConfigExtType
Should it at some point be necessary to extend the configuration, then this can be handled
by means of the UsacConfigExtension(), which would then allow a type to be assigned to each
new configuration. Currently the only type which can be signaled is a fill mechanism for
the configuration.
coreSbrFrameLengthIndex
This table shall signal multiple configuration aspects of the decoder. In particular these are
the output frame length, the SBR ratio and the resulting core coder frame length (ccfl). At
the same time it indicates the number of QMF analysis and synthesis bands used in SBR.
stereoConfigIndex
This table determines the inner structure of a UsacChannelPairElement(). It indicates the
use of a mono or stereo core, use of MPS212, whether stereo SBR is applied, and whether
residual coding is applied in MPS212.
By moving large parts of the eSBR header fields to a default header which can be referenced
by means of a default header flag, the bit demand for sending eSBR control data was
greatly reduced. Former sbr_header() bit fields that were considered most likely to change
in a real world system were moved to the SbrInfo() element instead, which now consists
of only 4 elements covering a maximum of 8 bits. Compared to the sbr_header(), which
consists of at least 18 bits, this is a saving of 10 bits.
It is more difficult to assess the impact of this change on the overall bitrate because it depends
heavily on the rate of transmission of eSBR control data in SbrInfo(). However, already
for the common use case where the SBR crossover is altered in a bitstream, the bit saving
can be as high as 22 bits per occurrence when sending an SbrInfo() instead of a fully
transmitted sbr_header().
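The 10-bit figure quoted above can be checked with a small sketch. The widths of bs_amp_res, bs_xover_band and bs_sbr_preprocessing are those listed in the SbrInfo() syntax table; the 2-bit width of bs_pvc_mode is an assumption inferred from the stated 8-bit maximum, and the constant names are illustrative only.

```python
# Field widths in bits. bs_pvc_mode = 2 is an assumption (8-bit maximum minus
# the 6 bits of the other three fields); it is only present when bs_pvc is set.
SBR_INFO_BITS = {
    "bs_amp_res": 1,
    "bs_xover_band": 4,
    "bs_sbr_preprocessing": 1,
    "bs_pvc_mode": 2,
}

# Minimum size of a fully transmitted sbr_header(), as quoted in the text.
SBR_HEADER_MIN_BITS = 18


def sbr_info_max_bits():
    """Largest possible SbrInfo() size in bits."""
    return sum(SBR_INFO_BITS.values())


def saving_per_update():
    """Bits saved when an SbrInfo() replaces a minimal sbr_header()."""
    return SBR_HEADER_MIN_BITS - sbr_info_max_bits()
```

The 22-bit figure for a crossover change depends on sbr_header() fields that are not listed in this section, so it is not reproduced here.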
The output of the USAC decoder can be further processed by MPEG Surround (MPS)
(ISO/IEC 23003-1) or SAOC (ISO/IEC 23003-2). If the SBR tool in USAC is active, a
USAC decoder can typically be efficiently combined with a subsequent MPS/SAOC decoder
by connecting them in the QMF domain in the same way as it is described for HE-AAC
in ISO/IEC 23003-1 4.4. If a connection in the QMF domain is not possible, they
need to be connected in the time domain.
If MPS/SAOC side information is embedded into a USAC bitstream by means of the
usacExtElement mechanism (with usacExtElementType being ID_EXT_ELE_MPEGS or
ID_EXT_ELE_SAOC), the time-alignment between the USAC data and the MPS/SAOC
data assumes the most efficient connection between the USAC decoder and the
MPS/SAOC decoder. If the SBR tool in USAC is active and if MPS/SAOC employs a 64
band QMF domain representation (see ISO/IEC 23003-1 6.6.3), the most efficient connection
is in the QMF domain. Otherwise, the most efficient connection is in the time domain.
This corresponds to the time-alignment for the combination of HE-AAC and MPS as defined
in ISO/IEC 23003-1 4.4, 4.5, and 7.2.1.
The additional delay introduced by adding MPS decoding after USAC decoding is given
by ISO/IEC 23003-1 4.5 and depends on whether HQ MPS or LP MPS is used, and
whether MPS is connected to USAC in the QMF domain or in the time domain.
ISO/IEC 23003-1 4.4 clarifies the interface between USAC and MPEG Systems. Every
access unit delivered to the audio decoder from the systems interface shall result in a corresponding
composition unit delivered from the audio decoder to the systems interface, i.e.,
the compositor. This shall include start-up and shut-down conditions, i.e., when the access
unit is the first or the last in a finite sequence of access units.
For an audio composition unit, ISO/IEC 14496-1 7.1.3.5 Composition Time Stamp (CTS)
specifies that the composition time applies to the n-th audio sample within the composition
unit. For USAC, the value of n is always 1. Note that this applies to the output of the
USAC decoder itself. In the case that a USAC decoder is, for example, being combined
with an MPS decoder, this needs to be taken into account for the composition units delivered at
the output of the MPS decoder.
Features of USAC bitstream payload syntax
Table - Syntax of UsacSingleChannelElement()
Syntax No. of bits Mnemonic
UsacSingleChannelElement(indepFlag)
{
    UsacCoreCoderData(1, indepFlag);
    if (sbrRatioIndex > 0) {
        UsacSbrData(1, indepFlag);
    }
}
Table - Syntax of UsacExtElement()
Syntax No. of bits Mnemonic
UsacExtElement(indepFlag)
{
usacExtElementUseDefaultLength;
if (usacExtElementUseDefaultLength) {
usacExtElementPayloadLength = usacExtElementDefaultLength;
} else {
usacExtElementPayloadLength = escapedValue(8,16,0);
}
if (usacExtElementPayloadLength>0) {
if (usacExtElementPayloadFrag) {
usacExtElementStart;
usacExtElementStop;
} else {
usacExtElementStart = 1;
usacExtElementStop = 1;
for (i=0; i0) | |
(last_lpd_mode>0 && mod[k]==0) ) {
fac_data(0, ccfl/8);
}
}
if (mod[k] == 0) {
acelp_coding(acelp_core_mode);
last_lpd_mode=0;
k += 1;
}
else {
tcx_coding( lg(mod[k]), first_tcx_flag, indepFlag);
last_lpd_mode=mod[k];
k += ( 1 << (mod[k]-1) );
first_tcx_flag=FALSE;
}
}
lpc_data(first_lpd_flag);
if (core_mode_last==0 && fac_data_present==1) {
short_fac_flag;
fac_length = short_fac_flag ? ccfl/16 : ccfl/8;
fac_data(1, fac_length);
Features of enhanced SBR payload syntax
Table - Syntax of UsacSbrData()
Syntax No. of bits Mnemonic
UsacSbrData(harmonicSBR, numberSbrChannels, indepFlag)
{
    if (indepFlag) {
        sbrInfoPresent = 1;
        sbrHeaderPresent = 1;
    } else {
        sbrInfoPresent; uimsbf
        if (sbrInfoPresent) {
            sbrHeaderPresent; uimsbf
        } else {
            sbrHeaderPresent = 0;
        }
    }
    if (sbrInfoPresent) {
        SbrInfo();
    }
    if (sbrHeaderPresent) {
        sbrUseDfltHeader; uimsbf
        if (sbrUseDfltHeader) {
            /* copy all SbrDfltHeader() elements
               dflt_xxx_yyy to bs_xxx_yyy */
        } else {
            SbrHeader();
        }
    }
    sbr_data(harmonicSBR, bs_amp_res, numberSbrChannels, indepFlag);
}
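The presence logic of UsacSbrData() can be sketched as follows. This is a minimal illustration, not a decoder: read_bit is a hypothetical callable returning the next bit of the payload, and the two presence flags are assumed to be single-bit fields because their widths are not legible in the table.

```python
def sbr_presence_flags(indep_flag, read_bit):
    """Return (sbrInfoPresent, sbrHeaderPresent) for one UsacSbrData().

    indep_flag -- value of indepFlag for the current frame
    read_bit   -- hypothetical callable returning the next bit (0 or 1)
    """
    if indep_flag:
        # Independently decodable frames always carry both structures.
        return 1, 1
    info_present = read_bit()
    # sbrHeaderPresent is only transmitted when sbrInfoPresent is set.
    header_present = read_bit() if info_present else 0
    return info_present, header_present
```

For example, with bits = iter([1, 0]) the call sbr_presence_flags(0, lambda: next(bits)) yields (1, 0): an SbrInfo() follows, but no header update.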
Table - Syntax of SbrInfo()
Syntax No. of bits Mnemonic
SbrInfo()
{
    bs_amp_res; 1
    bs_xover_band; 4 uimsbf
    bs_sbr_preprocessing; 1 uimsbf
    if (bs_pvc) {
        bs_pvc_mode; uimsbf
    }
}
Table - Syntax of SbrHeader
Syntax No. of bits Mnemonic
SbrHeader()
{
bs_start_freq; 4 uimsbf
bs_stop_freq; 4 uimsbf
bs_header_extra1; 1 uimsbf
bs_header_extra2; 1 uimsbf
if (bs_header_extra1 == 1) {
Table - Syntax of sbr_data()
Syntax No. of bits Mnemonic
sbr_data(harmonicSBR, bs_amp_res, numberSbrChannels, indepFlag)
{
    switch (numberSbrChannels) {
    case 1:
        sbr_single_channel_element(harmonicSBR, bs_amp_res, indepFlag);
        break;
    case 2:
        sbr_channel_pair_element(harmonicSBR, bs_amp_res, indepFlag);
        break;
    }
}
Table - Syntax of sbr_envelope()
Syntax No. of bits Mnemonic
sbr_envelope(ch, bs_coupling, bs_amp_res)
{
if (bs_coupling) {
if (ch)
if (bs_amp_res) {
t_huff = t_huffman_env_bal_3_0dB;
f_huff = f_huffman_env_bal_3_0dB;
} else {
t_huff = t_huffman_env_bal_1_5dB;
f_huff = f_huffman_env_bal_1_5dB;
}
}
} else {
if (bs_amp_res) {
t_huff = t_huffman_env_3_0dB;
f_huff = f_huffman_env_3_0dB;
} else {
t_huff = t_huffman_env_1_5dB;
f_huff = f_huffman_env_1_5dB;
Note 2: sbr_huff_dec() is defined in ISO/IEC 14496-3:2009, 4.A.6.1.
Table - Syntax of FramingInfo()
Syntax No. of bits Mnemonic
FramingInfo()
{
if (bsHighRateMode) {
bsFramingType; 1 uimsbf
bsNumParamSets; 3 uimsbf
} else {
bsFramingType = 0;
bsNumParamSets = 1;
}
numParamSets = bsNumParamSets + 1;
nBitsParamSlot = ceil(log2(numSlots));
if (bsFramingType) {
for (ps=0; ps 0 the index unambiguously defines the
number of channels, channel elements and associated loudspeaker
mapping according to Table Y. The names of the
loudspeaker positions, the used abbreviations and the general
position of the available loudspeakers can be deduced from
Figs. 3a, 3b and Figs. 4a and 4b.
bsOutputChannelPos This index describes loudspeaker positions which are associated
to a given channel according to Fig. 4a. Fig. 4b indicates
the loudspeaker position in the 3D environment of the listener.
In order to ease the understanding of loudspeaker positions,
Fig. 4a also contains loudspeaker positions according to
IEC 100/1706/CDV which are listed here for information to
the interested reader.
Table - Values of coreCoderFrameLength, sbrRatio, outputFrameLength and numSlots depending
on coreSbrFrameLengthIndex
usacConfigExtensionPresent Indicates the presence of extensions to the configuration
numOutChannels If the value of channelConfigurationIndex indicates that none
of the pre-defined channel configurations is used, then this
element determines the number of audio channels for which a
specific loudspeaker position shall be associated.
numElements This field contains the number of elements that will follow in
the loop over element types in the UsacDecoderConfig().
usacElementType[elemIdx] defines the USAC channel element type of the element at
position elemIdx in the bitstream. Four element types exist,
one for each of the four basic bitstream elements:
UsacSingleChannelElement(), UsacChannelPairElement(),
UsacLfeElement(), UsacExtElement(). These elements provide
the necessary top level structure while maintaining all
needed flexibility. The meaning of usacElementType is defined
in Table A.
Table A - Value of usacElementType
stereoConfigIndex This element determines the inner structure of a UsacChannelPairElement().
It indicates the use of a mono or stereo
core, use of MPS212, whether stereo SBR is applied, and
whether residual coding is applied in MPS212 according to
Table ZZ. This element also defines the values of the helper
elements bsStereoSbr and bsResidualCoding.
Table ZZ - Values of stereoConfigIndex and its meaning and implicit assignment of bsStereoSbr and
bsResidualCoding
tw_mdct This flag signals the usage of the time-warped MDCT in this
stream.
noiseFilling This flag signals the usage of the noise filling of spectral
holes in the FD core coder.
harmonicSBR This flag signals the usage of the harmonic patching for the
SBR.
bs_interTes This flag signals the usage of the inter-TES tool in SBR.
dflt_start_freq This is the default value for the bitstream element
bs_start_freq, which is applied in case the flag sbrUseDfltHeader
indicates that default values for the SbrHeader()
elements shall be assumed.
dflt_stop_freq This is the default value for the bitstream element
bs_stop_freq, which is applied in case the flag sbrUseDfltHeader
indicates that default values for the SbrHeader()
elements shall be assumed.
dflt_header_extra1 This is the default value for the bitstream element
bs_header_extra1, which is applied in case the flag sbrUseDfltHeader
indicates that default values for the SbrHeader()
elements shall be assumed.
dflt_header_extra2 This is the default value for the bitstream element
bs_header_extra2, which is applied in case the flag sbrUseDfltHeader
indicates that default values for the SbrHeader()
elements shall be assumed.
dflt_freq_scale This is the default value for the bitstream element
bs_freq_scale, which is applied in case the flag sbrUseDfltHeader
indicates that default values for the SbrHeader()
elements shall be assumed.
dflt_alter_scale This is the default value for the bitstream element
bs_alter_scale, which is applied in case the flag sbrUseDfltHeader
indicates that default values for the SbrHeader()
elements shall be assumed.
dflt_noise_bands This is the default value for the bitstream element
bs_noise_bands, which is applied in case the flag sbrUseDfltHeader
indicates that default values for the SbrHeader()
elements shall be assumed.
dflt_limiter_bands This is the default value for the bitstream element
bs_limiter_bands, which is applied in case the flag sbrUseDfltHeader
indicates that default values for the SbrHeader()
elements shall be assumed.
dflt_limiter_gains This is the default value for the bitstream element
bs_limiter_gains, which is applied in case the flag sbrUseDfltHeader
indicates that default values for the SbrHeader()
elements shall be assumed.
dflt_interpol_freq This is the default value for the bitstream element
bs_interpol_freq, which is applied in case the flag sbrUseDfltHeader
indicates that default values for the SbrHeader()
elements shall be assumed.
dflt_smoothing_mode This is the default value for the bitstream element
bs_smoothing_mode, which is applied in case the flag
sbrUseDfltHeader indicates that default values for the
SbrHeader() elements shall be assumed.
usacExtElementType This element signals bitstream extension types. The
meaning of usacExtElementType is defined in Table B.
Table B - Value of usacExtElementType
usacExtElementConfigLength signals the length of the extension configuration in
bytes (octets).
usacExtElementDefaultLengthPresent This flag signals whether a
usacExtElementDefaultLength is conveyed in the UsacExtElementConfig().
usacExtElementDefaultLength signals the default length of the extension element in
bytes. Only if the extension element in a given access unit
deviates from this value, an additional length needs to be
transmitted in the bitstream. If this element is not explicitly
transmitted (usacExtElementDefaultLengthPresent==0) then
the value of usacExtElementDefaultLength shall be set to
zero.
usacExtElementPayloadFrag This flag indicates whether the payload of this extension
element may be fragmented and sent as several segments
in consecutive USAC frames.
numConfigExtensions If extensions to the configuration are present in the UsacConfig(),
this value indicates the number of signaled configuration
extensions.
confExtIdx Index to the configuration extensions.
usacConfigExtType This element signals configuration extension types.
The meaning of usacConfigExtType is defined in Table D.
usacConfigExtLength signals the length of the configuration extension in bytes (octets).
bsPseudoLr This flag signals that an inverse mid/side rotation should be
applied to the core signal prior to Mps212 processing.
Table - bsPseudoLr
bsStereoSbr This flag signals the usage of the stereo SBR in combination
with MPEG Surround decoding.
Table - bsStereoSbr
bsResidualCoding indicates whether residual coding is applied according to the
Table below. The value of bsResidualCoding is defined by
stereoConfigIndex (see X).
Table - bsResidualCoding
bsResidualCoding Meaning
0 no residual coding, core coder is mono
1 residual coding, core coder is stereo
sbrRatioIndex indicates the ratio between the core sampling rate and the
sampling rate after eSBR processing. At the same time it indicates
the number of QMF analysis and synthesis bands used
in SBR according to the Table below.
Table - Definition of sbrRatioIndex
elemIdx Index to the elements present in the UsacDecoderConfig()
and the UsacFrame().
UsacConfig()
The UsacConfig() contains information about output sampling frequency and channel configuration.
This information shall be identical to the information signaled outside of this
element, e.g. in an MPEG-4 AudioSpecificConfig().
Usac Output Sampling Frequency
If the sampling rate is not one of the rates listed in the right column in Table 1, the sampling
frequency dependent tables (code tables, scale factor band tables etc.) must be deduced
in order for the bitstream payload to be parsed. Since a given sampling frequency is
associated with only one sampling frequency table, and since maximum flexibility is desired
in the range of possible sampling frequencies, the following table shall be used to
associate an implied sampling frequency with the desired sampling frequency dependent
tables.
Table 1 - Sampling frequency mapping
Frequency range (in Hz) Use tables for sampling frequency (in Hz)
f >= 92017 96000
92017 > f >= 75132 88200
75132 > f >= 55426 64000
55426 > f >= 46009 48000
46009 > f >= 37566 44100
37566 > f >= 27713 32000
27713 > f >= 23004 24000
23004 > f >= 18783 22050
18783 > f >= 13856 16000
13856 > f >= 11502 12000
11502 > f >= 9391 11025
9391 > f 8000
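The lookup defined by Table 1 can be sketched as a simple threshold search. The function and constant names below are illustrative; the thresholds and table frequencies are those of Table 1, with the garbled lower bound in the 12000 Hz row read as 11502.

```python
# (lower bound in Hz, sampling frequency whose tables are used), from Table 1,
# ordered from the highest range to the lowest.
_FS_RANGES = [
    (92017, 96000), (75132, 88200), (55426, 64000), (46009, 48000),
    (37566, 44100), (27713, 32000), (23004, 24000), (18783, 22050),
    (13856, 16000), (11502, 12000), (9391, 11025),
]


def table_sampling_frequency(f_hz):
    """Return the sampling frequency whose dependent tables apply to f_hz."""
    for lower_bound, table_fs in _FS_RANGES:
        if f_hz >= lower_bound:
            return table_fs
    # Everything below 9391 Hz uses the 8000 Hz tables.
    return 8000
```

For instance, a stream running at 47000 Hz falls into the range 55426 > f >= 46009 and therefore uses the 48000 Hz tables.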
UsacChannelConfig
The channel configuration table covers most common loudspeaker positions. For further
flexibility, channels can be mapped to an overall selection of 32 loudspeaker positions
found in modern loudspeaker setups in various applications (see Figs. 3a, 3b).
For each channel contained in the bitstream, the UsacChannelConfig() specifies the associated
loudspeaker position to which this particular channel shall be mapped. The loudspeaker
positions which are indexed by bsOutputChannelPos are listed in Fig. 4a. In case
of multiple channel elements, the index i of bsOutputChannelPos[i] indicates the position in
which the channel appears in the bitstream. Figure Y gives an overview of the loudspeaker
positions in relation to the listener.
More precisely, the channels are numbered in the sequence in which they appear in the bitstream,
starting with 0 (zero). In the trivial case of a UsacSingleChannelElement() or
UsacLfeElement(), the channel number is assigned to that channel and the channel count is
increased by one. In case of a UsacChannelPairElement(), the first channel in that element
(with index ch==0) is numbered first, whereas the second channel in that same element
(with index ch==1) receives the next higher number and the channel count is increased by
two.
It follows that numOutChannels shall be equal to or smaller than the accumulated sum of
all channels contained in the bitstream. The accumulated sum of all channels is equivalent
to the number of all UsacSingleChannelElement()s plus the number of all
UsacLfeElement()s plus two times the number of all UsacChannelPairElement()s.
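The numbering rule and the resulting bound on numOutChannels can be sketched as follows. The short labels "SCE", "CPE" and "LFE" follow the abbreviations used for the elements in this document; "EXT" is an illustrative label for UsacExtElement(), which carries no audio channel, and the function names are hypothetical.

```python
# Number of audio channels contributed by each element type.
CHANNELS_PER_ELEMENT = {"SCE": 1, "CPE": 2, "LFE": 1, "EXT": 0}


def number_channels(element_types):
    """Assign channel numbers in bitstream order.

    Returns (numbering, total), where numbering[i] lists the channel numbers
    of element i and total is the accumulated sum of all channels.
    """
    numbering = []
    count = 0
    for element_type in element_types:
        n = CHANNELS_PER_ELEMENT[element_type]
        # For a CPE, ch==0 gets the lower number and ch==1 the next one.
        numbering.append(list(range(count, count + n)))
        count += n
    return numbering, count


def num_out_channels_is_valid(num_out_channels, element_types):
    """numOutChannels shall not exceed the accumulated channel sum."""
    _, total = number_channels(element_types)
    return num_out_channels <= total
```

With the element sequence SCE, CPE, EXT, LFE the channels are numbered [0], [1, 2], [] and [3], so numOutChannels may be at most 4.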
All entries in the array bsOutputChannelPos shall be mutually distinct in order to avoid
double assignment of loudspeaker positions in the bitstream.
In the special case that channelConfigurationIndex is 0 and numOutChannels is smaller
than the accumulated sum of all channels contained in the bitstream, the handling of
the non-assigned channels is outside of the scope of this specification. Information about
this can e.g. be conveyed by appropriate means in higher application layers or by specifically
designed (private) extension payloads.
UsacDecoderConfig()
The UsacDecoderConfig() contains all further information required by the decoder to interpret
the bitstream. Firstly, the value of sbrRatioIndex determines the ratio between core
coder frame length (ccfl) and the output frame length. Following the sbrRatioIndex is a
loop over all channel elements in the present bitstream. For each iteration, the type of element
is signaled in usacElementType[], immediately followed by its corresponding configuration
structure. The order in which the various elements are present in the
UsacDecoderConfig() shall be identical to the order of the corresponding payload in the
UsacFrame().
Each instance of an element can be configured independently. When reading each channel
element in UsacFrame(), for each element the corresponding configuration of that instance,
i.e. with the same elemIdx, shall be used.
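The pairing of configuration and payload by position can be sketched as follows. This is a minimal illustration under stated assumptions: ElementConfig, decode_frame and the decode_element callback are hypothetical names, and the payloads are assumed to have been separated already.

```python
from dataclasses import dataclass, field


@dataclass
class ElementConfig:
    """Per-element configuration as read from the configuration section."""
    element_type: str                 # e.g. "SCE", "CPE", "LFE", "EXT"
    settings: dict = field(default_factory=dict)


def decode_frame(configs, payloads, decode_element):
    """Decode one frame, using for each payload the config with the same index.

    configs        -- list of ElementConfig, in configuration order
    payloads       -- list of element payloads, in frame order
    decode_element -- hypothetical callable (config, payload) -> decoded output
    """
    if len(configs) != len(payloads):
        raise ValueError("frame must carry one payload per configured element")
    outputs = []
    for elem_idx, payload in enumerate(payloads):
        # The element at position elem_idx is decoded with its own config,
        # so two elements of the same type may be configured differently.
        outputs.append(decode_element(configs[elem_idx], payload))
    return outputs
```

Because the configuration is looked up per position rather than per element type, two channel elements of the same type can, for example, use noise filling in one instance and not in the other.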
UsacSingleChannelElementConfig()
The UsacSingleChannelElementConfig() contains all information needed for configuring
the decoder to decode one single channel. SBR configuration data is only transmitted if
SBR is actually employed.
UsacChannelPairElementConfig()
The UsacChannelPairElementConfig() contains core coder related configuration data as
well as SBR configuration data depending on the use of SBR. The exact type of stereo coding
algorithm is indicated by the stereoConfigIndex. In USAC a channel pair can be encoded
in various ways. These are:
1. Stereo core coder pair using traditional joint stereo coding techniques, extended by
the possibility of complex prediction in the MDCT domain.
2. Mono core coder channel in combination with MPEG Surround based MPS212 for
fully parametric stereo coding. Mono SBR processing is applied on the core signal.
3. Stereo core coder pair in combination with MPEG Surround based MPS212, where
the first core coder channel carries a downmix signal and the second channel carries
a residual signal. The residual may be band limited to realize partial residual
coding. Mono SBR processing is applied only on the downmix signal before
MPS212 processing.
4. Stereo core coder pair in combination with MPEG Surround based MPS212, where
the first core coder channel carries a downmix signal and the second channel carries
a residual signal. The residual may be band limited to realize partial residual
coding. Stereo SBR is applied on the reconstructed stereo signal after MPS212
processing.
Options 3 and 4 can be further combined with a pseudo LR channel rotation after the core
decoder.
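The four options can be summarized as a small lookup table. The sketch is keyed by the option numbers of the list above, not by stereoConfigIndex values, since the numeric assignment is given by Table ZZ. The names are illustrative, and the SBR channel count for option 1 is an assumption, as the list does not state it.

```python
from typing import NamedTuple


class StereoMode(NamedTuple):
    nr_core_coder_channels: int   # 1 = mono core, 2 = stereo core
    uses_mps212: bool
    residual_coding: bool
    nr_sbr_channels: int          # channels on which SBR is applied


STEREO_OPTIONS = {
    # Option 1: plain stereo core. nr_sbr_channels=2 is an assumption.
    1: StereoMode(2, False, False, 2),
    # Option 2: mono core + MPS212, mono SBR on the core signal.
    2: StereoMode(1, True, False, 1),
    # Option 3: downmix + residual, mono SBR on the downmix before MPS212.
    3: StereoMode(2, True, True, 1),
    # Option 4: downmix + residual, stereo SBR after MPS212.
    4: StereoMode(2, True, True, 2),
}


def may_use_pseudo_lr(option):
    """Pseudo LR rotation is described only for options 3 and 4."""
    return option in (3, 4)
```

Seen this way, options 3 and 4 differ only in where SBR sits relative to the MPS212 upmix, which is why they share the core and residual settings.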
UsacLfeElementConfig()
Since the use of the time warped MDCT and noise filling is not allowed for LFE channels,
there is no need to transmit the usual core coder flags for these tools. They shall be set to
zero instead.
Also, the use of SBR is neither allowed nor meaningful in an LFE context. Thus, SBR configuration
data is not transmitted.
UsacCoreConfig()
The UsacCoreConfig() only contains flags to enable or disable the use of the time warped
MDCT and spectral noise filling on a global bitstream level. If tw_mdct is set to zero, time
warping shall not be applied. If noiseFilling is set to zero, the spectral noise filling shall not
be applied.
SbrConfig()
The SbrConfig() bitstream element serves the purpose of signaling the exact eSBR setup
parameters. On one hand the SbrConfig() signals the general employment of eSBR tools.
On the other hand it contains a default version of the SbrHeader(), the SbrDfltHeader(). The
values of this default header shall be assumed if no differing SbrHeader() is transmitted
in the bitstream. The background of this mechanism is that typically only one set of
SbrHeader() values is applied in one bitstream. The transmission of the SbrDfltHeader()
then allows this default set of values to be referred to very efficiently by using only one bit in the
bitstream. The possibility to vary the values of the SbrHeader() on the fly is still retained by
allowing the in-band transmission of a new SbrHeader() in the bitstream itself.
SbrDfltHeader()
The SbrDfltHeader() is what may be called the basic SbrHeader() template and should contain
the values for the predominantly used eSBR configuration. In the bitstream this configuration
can be referred to by setting the sbrUseDfltHeader flag. The structure of the
SbrDfltHeader() is identical to that of SbrHeader(). In order to be able to distinguish between
the values of the SbrDfltHeader() and SbrHeader(), the bit fields in the
SbrDfltHeader() are prefixed with "dflt_" instead of "bs_". If the use of the
SbrDfltHeader() is indicated, then the SbrHeader() bit fields shall assume the values of the
corresponding SbrDfltHeader(), i.e.
bs_start_freq = dflt_start_freq;
bs_stop_freq = dflt_stop_freq;
etc.
(continue for all elements in SbrHeader(), like:
bs_xxx_yyy = dflt_xxx_yyy;)
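The copy rule amounts to a prefix substitution over all header fields. The following sketch assumes, for illustration only, that both headers are held as dictionaries keyed by field name; the function names are hypothetical.

```python
def apply_default_header(dflt_header):
    """Map every dflt_xxx_yyy entry to the matching bs_xxx_yyy entry."""
    prefix = "dflt_"
    return {
        "bs_" + name[len(prefix):]: value
        for name, value in dflt_header.items()
        if name.startswith(prefix)
    }


def resolve_sbr_header(use_dflt_header, dflt_header, inband_header=None):
    """Return the active bs_* values for the current frame.

    use_dflt_header -- value of sbrUseDfltHeader
    dflt_header     -- SbrDfltHeader() values from the configuration
    inband_header   -- SbrHeader() values transmitted in-band, if any
    """
    if use_dflt_header:
        return apply_default_header(dflt_header)
    if inband_header is None:
        raise ValueError("an in-band SbrHeader() is needed when sbrUseDfltHeader is 0")
    return dict(inband_header)
```

A single flag bit thus selects the complete default set, while an in-band header can still override it for individual frames.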
Mps212Config()
The Mps212Config() resembles the SpatialSpecificConfig() of MPEG Surround and was in
large parts deduced from that. It is however reduced in extent to contain only information
relevant for mono to stereo upmixing in the USAC context. Consequently MPS212 configures
only one OTT box.
UsacExtElementConfig()
The UsacExtElementConfig() is a general container for configuration data of extension
elements for USAC. Each USAC extension has a unique type identifier, usacExtElementType,
which is defined in Fig. 6k. For each UsacExtElementConfig() the length of the contained
extension configuration is transmitted in the variable usacExtElementConfigLength
and allows decoders to safely skip over extension elements whose usacExtElementType is
unknown.
For USAC extensions which typically have a constant payload length, the
UsacExtElementConfig() allows the transmission of a usacExtElementDefaultLength. Defining a default
payload length in the configuration allows a highly efficient signaling of the
usacExtElementPayloadLength inside the UsacExtElement(), where bit consumption needs
to be kept low.
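How the default length keeps the per-frame cost at a single bit can be sketched as follows. BitReader is a hypothetical MSB-first helper, and escaped_value is an assumption: the definition of escapedValue() is not given in this section, so the usual escape-coding pattern (an all-ones field signals that a further field is added) is used here.

```python
class BitReader:
    """Hypothetical MSB-first bit reader over a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, n_bits):
        value = 0
        for _ in range(n_bits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def escaped_value(reader, n_bits1, n_bits2, n_bits3):
    """Assumed escape coding: an all-ones field means 'add the next field'."""
    value = reader.read(n_bits1)
    if value == (1 << n_bits1) - 1:
        value_add = reader.read(n_bits2)
        value += value_add
        if n_bits3 > 0 and value_add == (1 << n_bits2) - 1:
            value += reader.read(n_bits3)
    return value


def read_payload_length(reader, default_length):
    """Resolve usacExtElementPayloadLength for one UsacExtElement()."""
    use_default_length = reader.read(1)
    if use_default_length:
        return default_length  # costs one bit in the frame
    return escaped_value(reader, 8, 16, 0)
```

Under these assumptions an extension with a constant payload size spends one bit per frame on length signaling, and an explicit length costs 9 bits for values below 255.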
In case of USAC extensions where a larger amount of data is accumulated and transmitted
not on a per frame basis but only every second frame or even more rarely, this data may be
transmitted in fragments or segments spread over several USAC frames. This can be helpful
in order to keep the bit reservoir more equalized. The use of this mechanism is signaled
by the flag usacExtElementPayloadFrag. The fragmentation mechanism is further explained
in the description of the usacExtElement in 6.2.X.
UsacConfigExtension()
The UsacConfigExtension() is a general container for extensions of the UsacConfig(). It
provides a convenient way to amend or extend the information exchanged at the time of
the decoder initialization or set-up. The presence of config extensions is indicated by
usacConfigExtensionPresent. If config extensions are present (usacConfigExtensionPresent==1),
the exact number of these extensions follows in the bit field numConfigExtensions.
Each configuration extension has a unique type identifier, usacConfigExtType. For
each UsacConfigExtension() the length of the contained configuration extension is transmitted
in the variable usacConfigExtLength and allows the configuration bitstream parser to
safely skip over configuration extensions whose usacConfigExtType is unknown.
Top level payloads for the audio object type USAC
Terms and definitions
UsacFrame() This block of data contains audio data for a time period of
one USAC frame, related information and other data. As signaled
in UsacDecoderConfig(), the UsacFrame() contains
numElements elements. These elements can contain audio
data for one or two channels, audio data for low frequency
enhancement or extension payload.
UsacSingleChannelElement() Abbreviation SCE. Syntactic element of the bitstream
containing coded data for a single audio channel. A
single_channel_element() basically consists of the
UsacCoreCoderData(), containing data for either FD or LPD core
coder. In case SBR is active, the UsacSingleChannelElement()
also contains SBR data.
UsacChannelPairElement() Abbreviation CPE. Syntactic element of the bitstream payload
containing data for a pair of channels. The channel pair
can be achieved either by transmitting two discrete channels
or by one discrete channel and related Mps212 payload. This
is signaled by means of the stereoConfigIndex. The UsacChannelPairElement()
further contains SBR data in case SBR
is active.
UsacLfeElement() Abbreviation LFE. Syntactic element that contains a low
sampling frequency enhancement channel. LFEs are always
encoded using the fd_channel_stream() element.
UsacExtElement() Syntactic element that contains extension payload. The length
of an extension element is either signaled as a default length
in the configuration (UsacExtElementConfig()) or signaled
in the UsacExtElement() itself. If present, the extension payload
is of type usacExtElementType, as signaled in the configuration.
usacIndependencyFlag indicates if the current UsacFrame() can be decoded entirely
without the knowledge of information from previous frames
according to the Table below.
Table - Meaning of usacIndependencyFlag
NOTE: Please refer to X.Y for recommendations on the use
of the usacIndependencyFlag.
usacExtElementUseDefaultLength indicates whether the length of the extension element
corresponds to usacExtElementDefaultLength, which was defined
in the UsacExtElementConfig().
usacExtElementPayloadLength shall contain the length of the extension element in
bytes. This value should only be explicitly transmitted in the
bitstream if the length of the extension element in the present
access unit deviates from the default value,
usacExtElementDefaultLength.
usacExtElementStart Indicates if the present usacExtElementSegmentData begins a
data block.
usacExtElementStop Indicates if the present usacExtElementSegmentData ends a
data block.
usacExtElementSegmentData The concatenation of all usacExtElementSegmentData
from UsacExtElement() of consecutive USAC frames, starting
from the UsacExtElement() with usacExtElementStart==1
up to and including the UsacExtElement() with
usacExtElementStop==1, forms one data block. In case a
complete data block is contained in one UsacExtElement(),
usacExtElementStart and usacExtElementStop shall both be
set to 1. The data blocks are interpreted as a byte aligned extension
payload depending on usacExtElementType according
to the following Table:
Table - Interpretation of data blocks for USAC extension payload decoding
fill_byte Octet of bits which may be used to pad the bitstream with bits
that carry no information. The exact bit pattern used for
fill_byte should be '10100101'.
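The concatenation rule for usacExtElementSegmentData can be sketched as follows. The function name and the tuple layout are illustrative. Dropping a segment that arrives without a preceding start flag is an assumption of this sketch, since the text does not describe that case.

```python
def reassemble_data_blocks(segments):
    """Concatenate segment data from consecutive frames into data blocks.

    segments -- iterable of (start, stop, data) tuples, one per
                UsacExtElement(), where start and stop are the values of
                usacExtElementStart and usacExtElementStop
    Returns the list of completed data blocks as bytes objects.
    """
    blocks = []
    current = None
    for start, stop, data in segments:
        if start:
            current = bytearray()      # a new data block begins here
        if current is None:
            continue                   # assumption: orphan segments are dropped
        current += data
        if stop:
            blocks.append(bytes(current))
            current = None             # wait for the next start flag
    return blocks
```

An unfragmented payload is simply the case in which start and stop are both 1 in the same element, which yields one block per frame.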
Helper Elements
nrCoreCoderChannels In the context of a channel pair element this variable indicates
the number of core coder channels which form the basis
for stereo coding. Depending on the value of stereoConfigIndex
this value shall be 1 or 2.
nrSbrChannels In the context of a channel pair element this variable indicates
the number of channels on which SBR processing is
applied. Depending on the value of stereoConfigIndex this
value shall be 1 or 2.
Subsidiary payloads for USAC
Terms and Definitions
UsacCoreCoderData() This block of data contains the core-coder audio data. The
payload element contains data for one or two core-coder
channels, for either FD or LPD mode. The specific mode is
signaled per channel at the beginning of the element.
StereoCoreToolInfo() All stereo related information is captured in this element. It
deals with the numerous dependencies of bit fields in the
stereo coding modes.
Helper Elements
commonCoreMode In a CPE this flag indicates if both encoded core coder channels
use the same mode.
Mps212Data() This block of data contains payload for the Mps212 stereo
module. The presence of this data is dependent on the
stereoConfigIndex.
common_window indicates if channel 0 and channel 1 of a CPE use identical
window parameters.
common_tw indicates if channel 0 and channel 1 of a CPE use identical
parameters for the time warped MDCT.
Decoding of UsacFrame()
One UsacFrame() forms one access unit of the USAC bitstream. Each UsacFrame() decodes
into 768, 1024, 2048 or 4096 output samples according to the outputFrameLength determined
from a Table.
The first bit in the UsacFrame() is the usacIndependencyFlag, which determines if a given
frame can be decoded without any knowledge of the previous frame. If the usacIndependencyFlag
is set to 0, then dependencies to the previous frame may be present in the payload
of the current frame.
The UsacFrame() is further made up of one or more syntactic elements which shall appear
in the bitstream in the same order as their corresponding configuration elements in the
UsacDecoderConfig(). The position of each element in the series of all elements is indexed
by elemIdx. For each element the corresponding configuration, as transmitted in the
UsacDecoderConfig(), of that instance, i.e. with the same elemIdx, shall be used.
These syntactic elements are of one of four types, which are listed in a Table. The type of
each of these elements is determined by usacElementType. There may be multiple elements
of the same type. Elements occurring at the same position elemIdx in different
frames shall belong to the same stream.
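The pairing of payload elements with their configuration by elemIdx can be sketched as follows. This is only a minimal illustration, not the normative parsing procedure: the class ElementConfig, the function decode_frame and the callback decode_element are hypothetical names introduced here, and the payloads are assumed to have already been split into one byte string per element.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence

# Illustrative labels for the four element types mentioned in the text.
SCE, CPE, LFE, EXT = "SCE", "CPE", "LFE", "EXT"


@dataclass
class ElementConfig:
    """Hypothetical container for one entry of the decoder configuration."""
    element_type: str  # plays the role of usacElementType
    settings: dict     # per-element tool settings (assumed layout)


def decode_frame(
    configs: Sequence[ElementConfig],
    payloads: Sequence[bytes],
    decode_element: Callable[[ElementConfig, bytes], Any],
) -> List[Any]:
    """Decode one frame; the payload at position elemIdx uses configs[elemIdx]."""
    if len(payloads) != len(configs):
        raise ValueError("frame must carry one payload element per configuration element")
    # zip() pairs entries by position, which is exactly the elemIdx rule.
    return [decode_element(cfg, data) for cfg, data in zip(configs, payloads)]
```

Because the association is purely positional, no per-frame identifier has to be transmitted to tell the decoder which configuration applies to which element.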
Table - Examples of simple possible bitstream payloads
If these bitstream payloads are to be transmitted over a constant rate channel then they
might include an extension payload element with a usacExtElementType of
ID_EXT_ELE_FILL to adjust the instantaneous bitrate. In this case an example of a coded
stereo signal is:
Table - Examples of simple stereo bitstream
with extension payload for writing fill bits.
Decoding of UsacSingleChannelElement()
The simple structure of the UsacSingleChannelElement() is made up of one instance of a
UsacCoreCoderData() element with nrCoreCoderChannels set to 1. Depending on the
sbrRatioIndex of this element a UsacSbrData() element follows with nrSbrChannels set to
1 as well.
Decoding of UsacExtElement()
UsacExtElement() structures in a bitstream can be decoded or skipped by a USAC decoder.
Every extension is identified by a usacExtElementType, conveyed in the UsacExtElement()'s
associated UsacExtElementConfig(). For each usacExtElementType a specific
decoder can be present.
If a decoder for the extension is available to the USAC decoder then the payload of the
extension is forwarded to the extension decoder immediately after the UsacExtElement()
has been parsed by the USAC decoder.
If no decoder for the extension is available to the USAC decoder, a minimum of structure
is provided within the bitstream, so that the extension can be ignored by the USAC decoder.
The length of an extension element is either specified by a default length in octets, which
can be signaled within the corresponding UsacExtElementConfig() and which can be overruled
in the UsacExtElement(), or by an explicitly provided length information in the
UsacExtElement(), which is either one or three octets long, using the syntactic element
escapedValue().
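A length field that occupies either one or three octets can be realized with an escape mechanism. The following sketch assumes, purely for illustration, that an all-ones first octet signals that two further octets follow and that their value is added to the first; the exact rule of escapedValue() is not spelled out in this section. BitReader, read_escaped_length and extension_length are hypothetical names.

```python
class BitReader:
    """Minimal MSB-first bit reader over a byte string (illustrative helper)."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def read_escaped_length(reader: BitReader) -> int:
    """Read a one-octet length; on the escape value, add a two-octet extension.

    Assumption: 0xFF in the first octet acts as the escape code.
    """
    value = reader.read(8)
    if value == 0xFF:
        value += reader.read(16)
    return value


def extension_length(reader: BitReader, default_length: int, use_default: bool) -> int:
    """Return the payload length in octets, either the default or an explicit one."""
    return default_length if use_default else read_escaped_length(reader)
```

With this scheme, short extension payloads cost a single octet of length information, and only long payloads pay for the three-octet form.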
Extension payloads that span one or more UsacFrame()s can be fragmented and their payload
distributed among several UsacFrame()s. In this case the usacExtElementPayloadFrag
flag is set to 1 and a decoder must collect all fragments from the UsacFrame() with
usacExtElementStart set to 1 up to and including the UsacFrame() with usacExtElementStop
set to 1. When usacExtElementStop is set to 1 then the extension is considered to be
complete and is passed to the extension decoder.
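The collection of fragments between the start and stop flags can be sketched as a small state machine. FragmentCollector and its push method are hypothetical names; the choice to silently drop fragments that arrive before any start flag is an assumption of this sketch, since the text leaves integrity handling to other means.

```python
from typing import Optional


class FragmentCollector:
    """Collects extension payload fragments across frames (illustrative sketch)."""

    def __init__(self) -> None:
        self._buffer: Optional[bytearray] = None  # None: no payload in progress

    def push(self, start: bool, stop: bool, fragment: bytes) -> Optional[bytes]:
        """Feed one fragment with its start/stop flags.

        Returns the complete payload when the stop flag is seen, otherwise None.
        """
        if start:
            self._buffer = bytearray()  # a start flag opens a new payload
        if self._buffer is None:
            return None  # no start seen yet: fragment cannot be assembled
        self._buffer.extend(fragment)
        if stop:
            payload = bytes(self._buffer)
            self._buffer = None
            return payload  # ready to be handed to the extension decoder
        return None
```

A fragment carrying both flags at once is treated as a complete, unfragmented payload.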
Note that integrity protection for a fragmented extension payload is not provided by this
specification and other means should be used to ensure completeness of extension payloads.
Note that all extension payload data is assumed to be byte-aligned.
Each UsacExtElement() shall obey the requirements resulting from the use of the usacIndependencyFlag.
Put more explicitly, if the usacIndependencyFlag is set (==1) the
UsacExtElement() shall be decodable without knowledge of the previous frame (and the
extension payload that may be contained in it).
Decoding Process
The stereoConfigIndex, which is transmitted in the UsacChannelPairElementConfig(), determines
the exact type of stereo coding which is applied in the given CPE. Depending on
this type of stereo coding either one or two core coder channels are actually transmitted in
the bitstream and the variable nrCoreCoderChannels needs to be set accordingly. The syntax
element UsacCoreCoderData() then provides the data for one or two core coder channels.
Similarly, there may be data available for one or two channels depending on the type of
stereo coding and the use of eSBR (i.e. if sbrRatioIndex>0). The value of nrSbrChannels
needs to be set accordingly and the syntax element UsacSbrData() provides the eSBR data
for one or two channels.
Finally, Mps212Data() is transmitted depending on the value of stereoConfigIndex.
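The dependency of the channel counts on stereoConfigIndex can be illustrated with a small lookup. The concrete mapping below is an assumption recalled from the published USAC standard and is not stated in this section, which only requires each value to be 1 or 2; it should be checked against the normative table before use. The function name cpe_channel_counts and the convention of returning 0 SBR channels when eSBR is not in use are hypothetical.

```python
from typing import Tuple


def cpe_channel_counts(stereo_config_index: int, sbr_ratio_index: int) -> Tuple[int, int, bool]:
    """Return (nrCoreCoderChannels, nrSbrChannels, Mps212Data present) for a CPE.

    Assumed mapping (verify against the standard):
      index 0: two core channels, no Mps212Data
      index 1: one core channel plus Mps212Data
      index 2: two core channels plus Mps212Data, SBR on one channel
      index 3: two core channels plus Mps212Data, SBR on two channels
    """
    if stereo_config_index not in (0, 1, 2, 3):
        raise ValueError("stereoConfigIndex must be in the range 0..3")
    nr_core_coder_channels = 1 if stereo_config_index == 1 else 2
    if sbr_ratio_index > 0:
        nr_sbr_channels = 2 if stereo_config_index in (0, 3) else 1
    else:
        nr_sbr_channels = 0  # sketch convention: no eSBR data at all
    mps212_present = stereo_config_index > 0
    return nr_core_coder_channels, nr_sbr_channels, mps212_present
```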
Low frequency enhancement (LFE) channel element, UsacLfeElement()
General
In order to maintain a regular structure in the decoder, the UsacLfeElement() is defined as
a standard fd_channel_stream(0,0,0,0,x) element, i.e. it is equal to a UsacCoreCoderData()
using the frequency domain coder. Thus, decoding can be done using the standard procedure
for decoding a UsacCoreCoderData() element.
In order to accommodate a more bitrate and hardware efficient implementation of the LFE
decoder, however, several restrictions apply to the options used for the encoding of this
element:
• The window_sequence field is always set to 0 (ONLY_LONG_SEQUENCE)
• Only the lowest 24 spectral coefficients of any LFE may be non-zero
• No Temporal Noise Shaping is used, i.e. tns_data_present is set to 0
• Time warping is not active
• No noise filling is applied
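The restrictions listed above lend themselves to a simple conformance check. The sketch below assumes a hypothetical LfeFrame record holding already-parsed fields; only window_sequence and tns_data_present are names taken from the text, the remaining field names are introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence

ONLY_LONG_SEQUENCE = 0
MAX_LFE_COEFFICIENTS = 24  # only the lowest 24 spectral coefficients may be non-zero


@dataclass
class LfeFrame:
    """Hypothetical view of the parsed fields of one LFE element."""
    window_sequence: int
    spectrum: Sequence[int]
    tns_data_present: int
    time_warping_active: bool
    noise_filling_active: bool


def lfe_violations(frame: LfeFrame) -> List[str]:
    """Return a list of violated LFE restrictions (empty if the frame conforms)."""
    problems = []
    if frame.window_sequence != ONLY_LONG_SEQUENCE:
        problems.append("window_sequence must be ONLY_LONG_SEQUENCE")
    if any(c != 0 for c in frame.spectrum[MAX_LFE_COEFFICIENTS:]):
        problems.append("non-zero coefficient above the lowest 24")
    if frame.tns_data_present != 0:
        problems.append("TNS must not be used")
    if frame.time_warping_active:
        problems.append("time warping must not be active")
    if frame.noise_filling_active:
        problems.append("noise filling must not be applied")
    return problems
```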
UsacCoreCoderData()
The UsacCoreCoderData() contains all information for decoding one or two core coder
channels.
The order of decoding is:
• get the core_mode[] for each channel
• in case of two core coded channels (nrChannels==2), parse the
StereoCoreToolInfo() and determine all stereo related parameters
• depending on the signaled core_modes transmit an lpd_channel_stream() or an
fd_channel_stream() for each channel
As can be seen from the above list, the decoding of one core coder channel (nrChannels==1)
results in obtaining the core_mode bit followed by one lpd_channel_stream() or
fd_channel_stream(), depending on the core_mode.
In the two core coder channel case, some signaling redundancies between channels can be
exploited, in particular if the core_mode of both channels is 0. See 6.2.X (Decoding of
StereoCoreToolInfo()) for details.
StereoCoreToolInfo()
The StereoCoreToolInfo() allows efficient coding of parameters whose values may be
shared across core coder channels of a CPE in case both channels are coded in FD mode
(core_mode[0,1]==0). In particular the following data elements are shared when the appropriate
flag in the bitstream is set to 1.
common_xxx flag is set to 1        channels 0 and 1 share the following elements:
common_window                      ics_info()
common_window && common_max_sfb    max_sfb
common_tw                          tw_data()
common_tns                         tns_data()
If the appropriate flag is not set then the data elements are transmitted individually for each
core coder channel either in StereoCoreToolInfo() (max_sfb, max_sfb1) or in the
fd_channel_stream() which follows the StereoCoreToolInfo() in the UsacCoreCoderData()
element.
In case of common_window==1 the StereoCoreToolInfo() also contains the information
about M/S stereo coding and complex prediction data in the MDCT domain (see 7.7.2).
UsacSbrData() This block of data contains payload for the SBR bandwidth
extension for one or two channels. The presence of this data
is dependent on the sbrRatioIndex.
SbrInfo() This element contains SBR control parameters which do not
require a decoder reset when changed.
SbrHeader() This element contains SBR header data with SBR configuration
parameters, that typically do not change over the duration
of a bitstream.
SBR payload for USAC
In USAC the SBR payload is transmitted in UsacSbrData(), which is an integral part of
each single channel element or channel pair element. UsacSbrData() immediately follows
UsacCoreCoderData(). There is no SBR payload for LFE channels.
numSlots The number of time slots in an Mps212Data frame.
Fig. 1 illustrates an audio decoder for decoding an encoded audio signal provided at an
input 10. On the input line 10, there is provided the encoded audio signal which is, for example,
a data stream or, even more exemplarily, a serial data stream. The encoded audio
signal comprises a first channel element and a second channel element in the payload section
of the data stream and first decoder configuration data for the first channel element
and second decoder configuration data for the second channel element in a configuration
section of the data stream. Typically, the first decoder configuration data will be different
from the second decoder configuration data, since the first channel element will also typically
be different from the second channel element.
The data stream or encoded audio signal is input into a data stream reader 12 for reading
the configuration data for each channel element and forwarding same to a configuration
controller 14 via a connection line 13. Furthermore, the data stream reader is arranged for
reading the payload data for each channel element in the payload section and this payload
data comprising the first channel element and the second channel element is provided to a
configurable decoder 16 via a connection line 15. The configurable decoder 16 is arranged
for decoding the plurality of channel elements in order to output data for the individual
channel elements as indicated at output lines 18a, 18b. Particularly, the configurable decoder
16 is configured in accordance with the first decoder configuration data when decoding
the first channel element and in accordance with the second configuration data when
decoding the second channel element. This is indicated by the connection lines 17a, 17b,
where connection line 17a transports the first decoder configuration data from the configuration
controller 14 to the configurable decoder and connection line 17b transports the second
decoder configuration data from the configuration controller to the configurable decoder.
The configuration controller can be implemented in any way that makes the
configurable decoder operate in accordance with the decoder configuration signaled in
the corresponding decoder configuration data or on the corresponding line 17a, 17b.
Hence, the configuration controller 14 can be implemented as an interface between the data
stream reader 12, which actually gets the configuration data from the data stream, and the
configurable decoder 16, which is configured by the actually read configuration data.
Fig. 2 illustrates a corresponding audio encoder for encoding a multi-channel input audio
signal provided at an input 20. The input 20 is illustrated as comprising three different lines
20a, 20b, 20c, where line 20a carries, for example, a center channel audio signal, line 20b
carries a left channel audio signal and line 20c carries a right channel audio signal. All
three channel signals are input into a configuration processor 22 and a configurable encoder
24. The configuration processor is adapted for generating first configuration data on
line 21a and second configuration data on line 21b for a first channel element, for example
comprising only the center channel so that the first channel element is a single channel
element, and for a second channel element which is, for example, a channel pair element
carrying the left channel and the right channel. The configurable encoder 24 is adapted for
encoding the multi-channel audio signal 20 to obtain the first channel element 23a and the
second channel element 23b using the first configuration data 21a and the second configuration
data 21b. The audio encoder additionally comprises a data stream generator 26
which receives, at input lines 25a and 25b, the first configuration data and the second configuration
data and which receives, additionally, the first channel element 23a and the second
channel element 23b. The data stream generator 26 is adapted for generating a data
stream 27 representing an encoded audio signal, the data stream having a configuration
section having the first and the second configuration data and a payload section comprising
the first channel element and the second channel element.
In this context, it is outlined that the first configuration data and the second configuration
data can be identical to the first decoder configuration data or the second decoder configuration
data or can be different. In the latter case, the configuration controller 14 is configured
to transform the configuration data in the data stream, when the configuration data is
encoder-directed data, into corresponding decoder-directed data by applying, for example,
unique functions or lookup tables. However, it is preferred that the configuration
data written into the data stream is already decoder configuration data so that the configurable
encoder 24 or the configuration processor 22 have, for example, a functionality for
deriving encoder configuration data from calculated decoder configuration data or for calculating
or determining decoder configuration data from calculated encoder configuration
data, again by applying unique functions or lookup tables or other pre-knowledge.
Fig. 5a shows a general illustration of the encoded audio signal input into the data
stream reader 12 of Fig. 1 or output by the data stream generator 26 of Fig. 2. The data
stream comprises a configuration section 50 and a payload section 52. Fig. 5b illustrates a
more detailed implementation of the configuration section 50 in Fig. 5a. The data stream
illustrated in Fig. 5b, which is typically a serial data stream carrying one bit after the other,
comprises, at its first portion 50a, general configuration data relating to higher layers of the
transport structure such as an MPEG-4 file format. Alternatively or additionally, the configuration
data 50a, which may be there or may not be there, comprises additional general
configuration data included in the UsacChannelConfig illustrated at 50b.
Generally, the configuration data 50a can also comprise the data from UsacConfig illustrated
in Fig. 6a, and item 50b comprises the elements implemented and illustrated in the
UsacChannelConfig of Fig. 6b. Particularly, the same configuration for all channel elements
may, for example, comprise the output channel indication illustrated and described
in the context of Figs. 3a, 3b and Figs. 4a, 4b.
Then, the configuration section 50 of the bitstream is followed by the UsacDecoderConfig
element which is, in this example, formed by a first configuration data 50c, a second configuration
data 50d and a third configuration data 50e. The first configuration data 50c is
for the first channel element, the second configuration data 50d is for the second channel
element, and the third configuration data 50e is for the third channel element.
Particularly, as outlined in Fig. 5b, each configuration data for the channel element comprises
an identifier element type idx which is, with respect to its syntax, used in Fig. 6c.
Then, the element type index idx which has two bits is followed by the bits describing the
channel element configuration data found in Fig. 6c and further explained in Fig. 6d for the
single channel element, Fig. 6e for the channel pair element, Fig. 6f for the LFE element
and Fig. 6k for the extension element which are all channel elements that can typically be
included in the USAC bitstream.
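Reading the two-bit element type index that opens each configuration block can be sketched as follows. The numeric assignment of the four types (0 to 3) is an assumption made for this illustration and should be checked against the syntax of Fig. 6c; UsacElementType, read_bits and read_element_type are hypothetical names.

```python
from enum import IntEnum
from typing import Tuple


class UsacElementType(IntEnum):
    """Assumed numbering of the four element types (verify against Fig. 6c)."""
    SCE = 0  # single channel element
    CPE = 1  # channel pair element
    LFE = 2  # low frequency enhancement element
    EXT = 3  # extension element


def read_bits(data: bytes, bit_pos: int, n: int) -> int:
    """Read n bits MSB-first starting at bit_pos."""
    value = 0
    for i in range(bit_pos, bit_pos + n):
        value = (value << 1) | ((data[i // 8] >> (7 - i % 8)) & 1)
    return value


def read_element_type(data: bytes, bit_pos: int) -> Tuple[UsacElementType, int]:
    """Return the element type and the bit position of the configuration bits that follow."""
    return UsacElementType(read_bits(data, bit_pos, 2)), bit_pos + 2
```

The returned bit position is where the type-specific configuration data (single channel, channel pair, LFE or extension) would be parsed next.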
Fig. 5c illustrates a USAC frame comprised in the payload section 52 of a bitstream illustrated
in Fig. 5a. When the configuration section in Fig. 5b forms the configuration section
50 of Fig. 5a, i.e., when the payload section comprises three channel elements, then the
payload section 52 will be implemented as outlined in Fig. 5c, i.e., that the payload data for
the first channel element 52a is followed by the payload data for the second channel element
indicated by 52b which is followed by the payload data 52c for the third channel
element. Hence, in accordance with the present invention, the configuration section and the
payload section are organized in such a way that the configuration data is in the same order
with respect to the channel elements as the payload data with respect to the channel elements
in the payload section. Hence, when the order in the UsacDecoderConfig element is
configuration data for the first channel element, configuration data for the second channel
element, configuration data for the third channel element, then the order in the payload
section is the same, i.e., there is the payload data for the first channel element, then follows
the payload data for the second channel element and then follows the payload data for the
third channel element in a serial data or bit stream.
This parallel structure in the configuration section and the payload section is advantageous
due to the fact that it allows an easy organization with extremely low overhead signaling
regarding which configuration data belongs to which channel element. In the prior art, no
ordering was required since the individual configuration data for channel elements did
not exist. However, in accordance with the present invention individual configuration data
for individual channel elements is introduced in order to make sure that the optimum configuration
data for each channel element can be selected.
Typically, a USAC frame comprises data for 20 to 40 milliseconds worth of time. When a
longer data stream is considered, as illustrated in Fig. 5d, then there is a configuration section
60a followed by payload sections or frames 62a, 62b, 62c, 62e, then a configuration
section 62d is, again, included in the bitstream.
The order of configuration data in the configuration section is, as discussed with respect to
Figs. 5b and 5c, the same as the order of the channel element payload data in each of the
frames 62a to 62e. Therefore, also the order of the payload data for the individual channel
elements is exactly the same in each frame 62a to 62e.
Generally, when the encoded signal is a single file stored on a hard disk, for example, then
a single configuration section 50 is sufficient at the beginning of the whole audio track,
such as a track of 10 or 20 minutes. Then, the single configuration section is
followed by a high number of individual frames and the configuration is valid for each
frame and the order of the channel element data (configuration or payload) is also the same
in each frame and in the configuration section.
However, when the encoded audio signal is a stream of data, it is necessary to introduce
configuration sections between individual frames in order to provide access points so that a
decoder can start decoding even when an earlier configuration section has already been
transmitted and has not been received by the decoder since the decoder was not yet
switched on to receive the actual data stream. The number n of frames between different
configuration sections, however, is arbitrarily selectable but when one would like to
achieve an access point each second, then the number of frames between two configuration
sections will be between 25 and 50.
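The range of 25 to 50 frames follows directly from the frame duration of 20 to 40 milliseconds stated earlier: one second divided by 40 ms gives 25 frames and one second divided by 20 ms gives 50 frames. A minimal sketch of this arithmetic, with frames_between_config_sections as a hypothetical helper name:

```python
def frames_between_config_sections(access_interval_ms: float, frame_duration_ms: float) -> int:
    """Largest number of frames that fits between two configuration sections
    while still offering an access point at least once per access interval."""
    if access_interval_ms <= 0 or frame_duration_ms <= 0:
        raise ValueError("durations must be positive")
    return max(1, int(access_interval_ms // frame_duration_ms))


# One access point per second with 40 ms and 20 ms frames respectively.
longest_frames = frames_between_config_sections(1000, 40)
shortest_frames = frames_between_config_sections(1000, 20)
```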
Subsequently, Fig. 7 illustrates a straightforward example for encoding and decoding a 5.1
multi-channel signal.
Preferably, four channel elements are used, where the first channel element is a single
channel element comprising the center channel, the second channel element is a channel
pair element CPE1 comprising the left channel and the right channel and the third channel
element is a second channel pair element CPE2 comprising the left surround channel and
the right surround channel. Finally, the fourth channel element is an LFE channel element.
In an embodiment, for example, the configuration data for the single channel element
would be such that the noise filling tool is on while, for example, for the second channel pair
element comprising the surround channels, the noise filling tool is off and the parametric
stereo coding procedure is applied. This is a lower quality but low bitrate stereo coding
procedure, and the quality loss may not be problematic due to the
fact that the channel pair element has the surround channels.
On the other hand, the left and right channels comprise a significant amount of information
and, therefore, a high quality stereo coding procedure is signaled by the MPS212 configuration.
The M/S stereo coding is advantageous in that it provides a high quality but is problematic
in that the bitrate is quite high. Therefore, M/S stereo coding is preferable for the
CPE1 but is not preferable for the CPE2. Furthermore, depending on the implementation,
the noise filling feature can be switched on or off and is preferably switched on due to the
fact that a high emphasis is placed on a good and high quality representation of the left
and right channels as well as of the center channel, where the noise filling is on as well.
However, when the core bandwidth of the channel element C is, for example, quite low
and the number of successive lines quantized to zero in the center channel is also low, then
it can also be useful to switch off noise filling for the center channel single channel element
due to the fact that the noise filling does not provide additional quality gains and the
bits required for transmitting the side information for the noise filling tool can then be
saved in view of no or only a minor quality increase.
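The per-element settings of this 5.1 example can be written down as a small data structure. The sketch is illustrative only: ChannelElementConfig, its field names and the string labels for the stereo mode are hypothetical, and the LFE entry with noise filling off follows the LFE restrictions given earlier in the description.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ChannelElementConfig:
    """Hypothetical per-element configuration record."""
    name: str
    element_type: str          # "SCE", "CPE" or "LFE"
    channels: Tuple[str, ...]  # loudspeaker channels carried by the element
    noise_filling: bool
    stereo_mode: str           # "none", "ms" or "parametric" (labels assumed)


# Order matters: payload elements appear in the same order in every frame.
LAYOUT_5_1 = (
    ChannelElementConfig("SCE", "SCE", ("C",), True, "none"),
    ChannelElementConfig("CPE1", "CPE", ("L", "R"), True, "ms"),
    ChannelElementConfig("CPE2", "CPE", ("Ls", "Rs"), False, "parametric"),
    ChannelElementConfig("LFE", "LFE", ("LFE",), False, "none"),
)

total_channels = sum(len(e.channels) for e in LAYOUT_5_1)
noise_filled = [e.name for e in LAYOUT_5_1 if e.noise_filling]
```

Switching noise filling off for the center channel, as discussed for the low-bandwidth case, would only change the first entry and leave the other elements untouched.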
Generally, the tools signaled in the configuration section for a channel element are the
tools mentioned in, for example, Fig. 6d, 6e, 6f, 6g, 6h, 6i, 6j and additionally comprise the
elements for the extension element configuration in Figs. 6k, 6l and 6m. As outlined in Fig.
6e, the MPS212 configuration can be different for each channel element.
MPEG Surround uses a compact parametric representation of the human auditory cues for
spatial perception to allow for a bit-rate efficient representation of a multi-channel signal.
In addition to CLD and ICC parameters, IPD parameters can be transmitted. The OPD parameters
are estimated with given CLD and IPD parameters for efficient representation of
phase information. IPD and OPD parameters are used to synthesize the phase difference to
further improve the stereo image.
In addition to the parametric mode, residual coding can be employed with the residual having
a limited or full bandwidth. In this procedure, two output signals are generated by mixing
a mono input signal and a residual signal using the CLD, ICC and IPD parameters.
Additionally, all the parameters mentioned in Fig. 6j can be individually selected for each
channel element. The individual parameters are, for example, explained in detail in
ISO/IEC CD 23003-3 dated September 24, 2010 which has been incorporated herein by
reference.
Additionally, as outlined in Figs. 6f and 6g, core features such as the time warping feature
and the noise filling feature can be switched on or off for each channel element individually.
The time warping tool described under the term "time-warped filter bank and block
switching" in the above referenced document replaces the standard filter bank and block
switching. In addition to the IMDCT, the tool contains a time-domain to time-domain
mapping from an arbitrarily spaced grid to the normal linearly spaced time grid and a corresponding
adaption of the window shapes.
Additionally, as outlined in Fig. 7, the noise filling tool can be switched on or off for each
channel element individually. In low bitrate coding, noise filling can be used for two purposes.
Coarse quantization of spectral values in low bitrate audio coding might lead to very
sparse spectra after inverse quantization, as many spectral lines might have been quantized
to zero. The sparsely populated spectra will result in the decoded signal sounding sharp or
unstable (birdies). By replacing the zero lines with "small" values in the decoder it is
possible to mask or reduce these very obvious artifacts without adding obvious new noise
artifacts.
If there are noise-like signal parts in the original spectrum, a perceptually equivalent representation
of these noisy signal parts can be reproduced in the decoder based on only a little
parametric information, such as the energy of the noisy signal part. The parametric information
can be transmitted with few bits compared to the number of bits needed to transmit the
coded waveform. Specifically, the data elements needed to transmit are the noise-offset
element, which is an additional offset to modify the scale factor of bands quantized to zero,
and the noise-level, which is an integer representing the quantization noise to be added for
every spectral line quantized to zero.
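The basic idea of replacing zero-quantized lines with small values can be sketched as follows. This is a deliberately simplified illustration and not the normative noise filling procedure: the mapping from the transmitted noise-level and noise-offset to an amplitude is not reproduced, so the amplitude is passed in directly, and the random sign pattern with a fixed seed is an assumption of the sketch.

```python
import random
from typing import List, Sequence


def fill_zero_lines(spectrum: Sequence[float], noise_amplitude: float, seed: int = 0) -> List[float]:
    """Replace every zero-quantized spectral line by +/- noise_amplitude.

    Non-zero lines are passed through unchanged. The seed makes the sketch
    reproducible; a real decoder defines its own noise generator.
    """
    rng = random.Random(seed)
    filled = []
    for value in spectrum:
        if value == 0:
            sign = rng.choice((-1.0, 1.0))
            filled.append(sign * noise_amplitude)
        else:
            filled.append(value)
    return filled
```

Since only an amplitude-like parameter is needed per band, the side information stays small compared to coding the noisy lines as a waveform.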
As outlined in Fig. 7 and Fig. 6f and 6g, this feature can be switched on and off for each
channel element individually.
Additionally, there are SBR features which can now be signaled for each channel element
individually.
As outlined in Fig. 6h, these SBR elements comprise the switching on/off of different tools
in SBR. The first tool to be switched on or off for each channel element individually is
harmonic SBR. When harmonic SBR is switched on, the harmonic SBR patching is performed
while, when harmonic SBR is switched off, a patching with consecutive lines as
known from MPEG-4 (high efficiency) is used.
Furthermore, the PVC or "predictive vector coding" decoding process can be applied. In
order to improve the subjective quality of the eSBR tool, in particular for speech content at
low bitrates, predictive vector coding (PVC) is added to the eSBR tool. Generally, for a
speech signal, there is a relatively high correlation between the spectral envelopes of low
frequency bands and high frequency bands. In the PVC scheme this is exploited by the
prediction of the spectral envelopes in high frequency bands from the spectral envelopes in
low frequency bands, where the coefficient matrices for the prediction are coded by means
of vector quantization. The HF envelope adjuster is modified to process the envelopes generated
by the PVC decoder.
The PVC tool can therefore be particularly useful for the single channel element where
there is, for example, speech in the center channel, while the PVC tool is not useful, for
example, for the surround channels of CPE2 or the left and right channels of CPE1.
Furthermore, the inter time envelope shaping feature (inter-Tes) can be switched on or off
for each channel element individually. The inter-subband-sample temporal envelope shaping
(inter-Tes) processes the QMF subband samples subsequent to the envelope adjuster.
This module shapes the temporal envelope of the higher frequency bandwidth with a finer temporal
granularity than that of the envelope adjuster. By applying a gain factor to each QMF
subband sample in an SBR envelope, inter-Tes shapes the temporal envelope among the
QMF subband samples. Inter-Tes consists of three modules, i.e., lower frequency inter-subband-sample
temporal envelope calculator, inter-subband-sample temporal envelope
adjuster and inter-subband-sample temporal envelope shaper. Due to the fact that this tool
requires additional bits, there will be channel elements where this additional bit consumption
is not justified in view of the quality gain and others where this additional bit consumption is
justified in view of the quality gain. Therefore, in accordance with the present invention, a
channel-element wise activation/deactivation of this tool is used.
Furthermore, Fig. 6i illustrates the syntax of the SBR default header and all SBR parameters
in the SBR default header mentioned in Fig. 6i can be selected differently for each channel
element. This, for example, relates to the start frequency or stop frequency actually setting
the cross-over frequency, i.e., the frequency at which the reconstruction of the signal
changes away from mode into parametric mode. Other features such as the frequency resolution
and the noise band resolution etc., are also available for setting for each individual
channel element selectively.
Hence, as outlined in Fig. 7, it is preferred to individually set configuration data for stereo
features, for core coder features and for SBR features. Individual setting of elements not
only refers to the SBR parameters in the SBR default header as illustrated in Fig. 6i but
also applies to all parameters in SbrConfig as outlined in Fig. 6h.
Subsequently, reference is made to Fig. 8 for illustrating an implementation of the decoder
of Fig. 1.
In particular, the functionalities of the data stream reader 12 and the configuration controller
14 are similar to those discussed in the context of Fig. 1. However, the configurable decoder
16 is now implemented, for example, as individual decoder instances where each decoder
instance has an input for configuration data C provided by the configuration controller 14
and an input for data D for receiving the corresponding channel element data from the
data stream reader 12.
In particular, the functionality of Fig. 8 is such that, for each individual channel element, an
individual decoder instance is provided. Hence, the first decoder instance is configured by
the first configuration data as, for example, a single channel element for the center channel.
Furthermore, the second decoder instance is configured in accordance with the second decoder
configuration data for the left and right channels of a channel pair element. Furthermore,
the third decoder instance 16c is configured for a further channel pair element comprising
the left surround channel and the right surround channel. Finally, the fourth decoder
instance is configured for the LFE channel. Hence, the first decoder instance provides,
as an output, a single channel C. The second and third decoder instances 16b, 16c,
however, each provide two output channels, i.e., left and right on the one hand and left
surround and right surround on the other hand. Finally, the fourth decoder instance 16d
provides, as an output, the LFE channel. All these six channels of the multi-channel signal
are forwarded to an output interface 19 by the decoder instances and are then finally sent
out for storage, for example, or for replay in a 5.1 loudspeaker setup, for example. It is
clear that different decoder instances and a different number of decoder instances are required
when the loudspeaker setup is a different loudspeaker setup.
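The arrangement of one decoder instance per channel element, each with a configuration input C and a data input D, can be sketched as below. DecoderInstance and output_interface are hypothetical names, and the decode method is a placeholder that merely routes the element data to its channel labels instead of performing any actual audio decoding.

```python
from typing import Dict, List, Sequence


class DecoderInstance:
    """One decoder instance, configured once via its input C (illustrative)."""

    def __init__(self, config: dict, channels: Sequence[str]) -> None:
        self.config = config            # input C from the configuration controller
        self.channels = list(channels)  # output channels this instance produces

    def decode(self, data: bytes) -> Dict[str, bytes]:
        # Placeholder: a real instance would run core decoding, SBR and
        # stereo processing according to self.config.
        return {channel: data for channel in self.channels}


def output_interface(instances: List[DecoderInstance], element_data: Sequence[bytes]) -> Dict[str, bytes]:
    """Feed each instance its input D and merge all output channels."""
    outputs: Dict[str, bytes] = {}
    for instance, data in zip(instances, element_data):
        outputs.update(instance.decode(data))
    return outputs


instances_5_1 = [
    DecoderInstance({"type": "SCE"}, ["C"]),
    DecoderInstance({"type": "CPE"}, ["L", "R"]),
    DecoderInstance({"type": "CPE"}, ["Ls", "Rs"]),
    DecoderInstance({"type": "LFE"}, ["LFE"]),
]
```

A different loudspeaker setup would simply use a different list of instances, which mirrors the remark that the number and kind of decoder instances depend on the setup.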
Fig. 9 illustrates a preferred implementation of the method for decoding an
encoded audio signal in accordance with an embodiment of the present invention.
In step 90, the data stream reader 12 starts reading the configuration section 50 of Fig. 5a.
Then, based on the channel element identification in the corresponding configuration data
block 50c, the channel element is identified as indicated in step 92. In step 94 the configuration
data for this identified channel element is read and used for actually configuring the
decoder or for storing to be used later for configuring the decoder when the channel element
is later processed. This is outlined in step 94.
In step 96, the next channel element is identified using the element type identifier of the
second configuration data in portion 50d of Fig. 5b. This is indicated in step 96 of Fig. 9.
Then, in step 98, the configuration data is read and either used to configure the actual
decoder or decoder instance or, alternatively, stored
for the time when the payload for this channel element is to be decoded.
Then, in step 100, the procedure loops over the whole configuration data, i.e., the identification of
the channel element and the reading of the configuration data for the channel element is
continued until all configuration data is read.
Then, in steps 102, 104, 106 the payload data for each channel element is read and is
finally decoded in step 108 using the configuration data C, where the payload data is indicated
by D. The result of step 108 is the data output by, for example, blocks 16a to
16d which can then, for example, be directly sent out to loudspeakers or which are to be
synchronized, amplified, further processed or digital/analog converted to be finally sent to
the corresponding loudspeakers.
Although some aspects have been described in the context of an apparatus, it is clear that
these aspects also represent a description of the corresponding method, where a block or
device corresponds to a method step or a feature of a method step. Analogously, aspects
described in the context of a method step also represent a description of a corresponding
block or item or feature of a corresponding apparatus.
Depending on certain implementation requirements, embodiments of the invention can be
implemented in hardware or in software. The implementation can be performed using a
digital storage medium, for example a floppy disk, a DVD, a CD, a ROM, a PROM, an
EPROM, an EEPROM or a FLASH memory, having electronically readable control signals
stored thereon, which cooperate (or are capable of cooperating) with a programmable
computer system such that the respective method is performed.
Some embodiments according to the invention comprise a non-transitory data carrier having
electronically readable control signals, which are capable of cooperating with a programmable
computer system, such that one of the methods described herein is performed.
The encoded audio signal can be transmitted via a wireline or wireless transmission medium
or can be stored on a machine readable carrier or on a non-transitory storage medium.
Generally, embodiments of the present invention can be implemented as a computer program
product with a program code, the program code being operative for performing one
of the methods when the computer program product runs on a computer. The program code
may for example be stored on a machine readable carrier.
Other embodiments comprise the computer program for performing one of the methods
described herein, stored on a machine readable carrier.
In other words, an embodiment of the inventive method is, therefore, a computer program
having a program code for performing one of the methods described herein, when the
computer program runs on a computer.
A further embodiment of the inventive methods is, therefore, a data carrier (or a digital
storage medium, or a computer-readable medium) comprising, recorded thereon, the computer
program for performing one of the methods described herein.
A further embodiment of the inventive method is, therefore, a data stream or a sequence of
signals representing the computer program for performing one of the methods described
herein. The data stream or the sequence of signals may for example be configured to be
transferred via a data communication connection, for example via the Internet.
A further embodiment comprises a processing means, for example a computer, or a programmable
logic device, configured to or adapted to perform one of the methods described
herein.
A further embodiment comprises a computer having installed thereon the computer program
for performing one of the methods described herein.
In some embodiments, a programmable logic device (for example a field programmable
gate array) may be used to perform some or all of the functionalities of the methods described
herein. In some embodiments, a field programmable gate array may cooperate with
a microprocessor in order to perform one of the methods described herein. Generally, the
methods are preferably performed by any hardware apparatus.
The above described embodiments are merely illustrative for the principles of the present
invention. It is understood that modifications and variations of the arrangements and the
details described herein will be apparent to others skilled in the art. It is the intent, therefore,
to be limited only by the scope of the impending patent claims and not by the specific
details presented by way of description and explanation of the embodiments herein.
Claims
1. Audio decoder for decoding an encoded audio signal (10), the encoded audio signal
(10) comprising a first channel element (52a) and a second channel element (52b)
in a payload section (52) of a data stream and first decoder configuration data (50c)
for the first channel element (52a) and second decoder configuration data (50d) for
the second channel element (52b) in a configuration section (50) of the data stream,
comprising:
a data stream reader (12) for reading the configuration data for each channel element
in the configuration section and for reading the payload data for each channel
element in the payload section;
a configurable decoder (16) for decoding the plurality of channel elements; and
a configuration controller (14) for configuring the configurable decoder (16) so that
the configurable decoder (16) is configured in accordance with the first decoder
configuration data when decoding the first channel element and in accordance with
the second decoder configuration data when decoding the second channel element.
2. Audio decoder in accordance with claim 1,
wherein the first channel element is a single channel element comprising payload
data for a first output channel, and
wherein the second channel element is a channel pair element comprising payload
data for a second output channel and a third output channel,
wherein the configurable decoder (16) is arranged for generating a single output
channel, when decoding the first channel element and for generating two output
channels when decoding the second channel element, and
wherein the audio decoder is configured for outputting (19) the first output channel,
the second output channel and the third output channel for a simultaneous output
via three different audio output channels.
3. Audio decoder of claim 1 or 2,
wherein the first channel is a center channel and wherein the second channel and
the third channel are a left channel and a right channel or a left surround channel
and a right surround channel.
4. Audio decoder in accordance with claim 1,
wherein the first channel element is a first channel pair element comprising data for
a first and a second output channel and wherein the second channel element is a
second channel pair element comprising payload data for a third output channel and
a fourth output channel,
wherein the configurable decoder (16) is configured for generating the first and the
second output channel when decoding the first channel element and for generating the
third output channel and the fourth output channel when decoding the second channel
element, and
wherein the audio decoder is configured for outputting (19) the first output channel,
the second output channel, the third output channel and the fourth output channel
for a simultaneous output via four different audio output channels.
5. Audio decoder in accordance with claim 4,
wherein the first channel is a left channel, the second channel is a right channel, the
third channel is a left surround channel and the fourth channel is a right surround
channel.
6. Audio decoder in accordance with one of the preceding claims,
wherein the encoded audio signal additionally comprises, in the configuration section
of the data stream, a general configuration section (50a, 50b) having information
for the first channel element and the second channel element and wherein the
configuration controller (14) is arranged to configure the configurable decoder (16)
for the first and the second channel element with the configuration information
from the general configuration section (50a, 50b).
7. Audio decoder in accordance with one of the preceding claims,
wherein the first configuration section (50c) is different from the second configuration
section (50d), and
wherein the configuration controller is arranged to configure the configurable decoder
(16) for decoding the second channel element different from a configuration
used when decoding the first channel element.
8. Audio decoder in accordance with one of the preceding claims,
wherein the first decoder configuration data (50c) and the second decoder configuration
data (50d) comprise information on a stereo decoding tool, a core decoding
tool or an SBR decoding tool, and
wherein the configurable decoder (16) comprises the SBR decoding tool, the core
decoding tool and the stereo decoding tool.
9. Audio decoder in accordance with one of the preceding claims,
wherein the payload section (52) comprises a sequence of frames, each frame comprising
the first channel element and the second channel element and
wherein the first decoder configuration data for the first channel element and the
second decoder configuration data for the second channel element is associated to
the sequence of frames (62a to 62e),
wherein the configuration controller (14) is configured to configure the configurable
decoder (16) for each of the frames of the sequence of frames so that the first
channel element in each frame is decoded using the first decoder configuration data
and the second channel element in each frame is decoded using the second decoder
configuration data.
10. Audio decoder in accordance with one of the preceding claims,
wherein the data stream is a serial data stream and the configuration section (50)
comprises decoder configuration data for a plurality of channel elements in order,
and
wherein the payload section (52) comprises payload data for the plurality of channel
elements in the same order.
11. Audio decoder in accordance with one of the preceding claims,
wherein the configuration section (50) comprises a first channel element identification
followed by the first decoder configuration data and a second channel element
identification followed by the second decoder configuration data, wherein the data
stream reader (12) is arranged to loop over all elements (92, 94, 96, 98) by sequentially
passing the first channel element identification (92) and subsequently reading
the first decoder configuration data (94) for the channel element and subsequently
passing the second channel element identification (96) and subsequently reading the
second decoder configuration data (98).
12. Audio decoder in accordance with one of the preceding claims,
wherein the configurable decoder (16) comprises a plurality of parallel decoder instances
(16a, 16b, 16c, 16d),
wherein the configuration controller (14) is arranged to configure a first decoder instance
(16a) using the first decoder configuration data, and to configure the second
decoder instance (16b) using the second decoder configuration data, and
wherein the data stream reader (12) is arranged for forwarding payload data for the
first channel element to the first decoder instance (16a) and to forward payload data
for the second channel element to the second decoder instance (16b).
13. Audio decoder in accordance with claim 12,
wherein the payload section comprises a sequence of payload frames (62a to 62e),
and
wherein the data stream reader (12) is configured to forward the data for each channel
element from the currently processed frame only to the corresponding decoder
instance configured by the configuration data for this channel element.
14. Method of decoding an encoded audio signal (10), the encoded audio signal (10)
comprising a first channel element (52a) and a second channel element (52b) in a
payload section (52) of a data stream and first decoder configuration data (50c) for
the first channel element (52a) and second decoder configuration data (50d) for the
second channel element (52b) in a configuration section (50) of the data stream,
comprising:
reading the configuration data for each channel element in the configuration section
and reading the payload data for each channel element in the payload section;
decoding the plurality of channel elements by a configurable decoder (16); and
configuring the configurable decoder (16) so that the configurable decoder (16) is
configured in accordance with the first decoder configuration data when decoding
the first channel element and in accordance with the second decoder configuration
data when decoding the second channel element.
15. Audio encoder for encoding a multi-channel audio signal (20), comprising:
a configuration processor (22) for generating first configuration data (25b) for a
first channel element (23a) and second configuration data (25a) for a second channel
element (23b);
a configurable encoder (24) for encoding the multi-channel audio signal (20) to obtain
the first channel element (23a) and the second channel element (23b) using the
first configuration data (25b) and the second configuration data (25a); and
a data stream generator (26) for generating a data stream representing an encoded
audio signal (27), the data stream (27) having a configuration section (50) having
the first configuration data (50c) and the second configuration data (50d) and a payload
section (52) comprising the first channel element (52a) and the second channel
element (52b).
16. Method of encoding a multi-channel audio signal (20), comprising:
generating first configuration data (25b) for a first channel element (23a) and second
configuration data (25a) for a second channel element (23b);
encoding the multi-channel audio signal (20) by a configurable encoder (24) to obtain
the first channel element (23a) and the second channel element (23b) using the
first configuration data (25b) and the second configuration data (25a); and
generating a data stream (27) representing an encoded audio signal (27), the data
stream (27) having a configuration section (50) having the first configuration data
(50c) and the second configuration data (50d) and a payload section (52) comprising
the first channel element (52a) and the second channel element (52b).
17. Computer program for performing, when running on a computer, the method of
claim 14 or claim 16.
18. Encoded audio signal (27) comprising:
a configuration section (50) having first decoder configuration data (50c) for a first
channel element (52a) and second decoder configuration data (50d) for a second
channel element (52b), a channel element being an encoded representation of a single
channel or two channels of a multichannel audio signal; and
a payload section (52) comprising payload data for the first channel element (52a)
and the second channel element (52b).

Documents

Application Documents

#	Name	Date
1	2803-KOLNP-2013-(23-09-2013)-PCT SEARCH REPORT & OTHERS.pdf	2013-09-23
2	2803-KOLNP-2013-(23-09-2013)-GPA.pdf	2013-09-23
3	2803-KOLNP-2013-(23-09-2013)-FORM-5.pdf	2013-09-23
4	2803-KOLNP-2013-(23-09-2013)-FORM-3.pdf	2013-09-23
5	2803-KOLNP-2013-(23-09-2013)-FORM-2.pdf	2013-09-23
6	2803-KOLNP-2013-(23-09-2013)-FORM-1.pdf	2013-09-23
7	2803-KOLNP-2013-(23-09-2013)-CORRESPONDENCE.pdf	2013-09-23
8	2803-KOLNP-2013.pdf	2013-10-03
9	2803-KOLNP-2013-(05-12-2013)-PA.pdf	2013-12-05
10	2803-KOLNP-2013-(05-12-2013)-CORRESPONDENCE.pdf	2013-12-05
11	2803-KOLNP-2013-(05-12-2013)-ASSIGNMENT.pdf	2013-12-05
12	2803-KOLNP-2013-FORM-18.pdf	2014-01-02
13	2803-KOLNP-2013-(26-04-2016)-OTHERS.pdf	2016-04-26
14	2803-KOLNP-2013-(26-04-2016)-CORRESPONDENCE.pdf	2016-04-26
15	Other Patent Document [23-06-2016(online)].pdf	2016-06-23
16	Other Patent Document [19-09-2016(online)].pdf	2016-09-19
17	Other Patent Document [12-12-2016(online)].pdf	2016-12-12
18	Other Patent Document [04-04-2017(online)].pdf	2017-04-04
19	Information under section 8(2) [29-06-2017(online)].pdf	2017-06-29
20	2803-KOLNP-2013-Information under section 8(2) (MANDATORY) [19-07-2017(online)].pdf	2017-07-19
21	2803-KOLNP-2013-Information under section 8(2) (MANDATORY) [11-08-2017(online)].pdf	2017-08-11
22	2803-KOLNP-2013-Information under section 8(2) (MANDATORY) [15-11-2017(online)].pdf	2017-11-15
23	2803-KOLNP-2013-Information under section 8(2) (MANDATORY) [14-03-2018(online)].pdf	2018-03-14
24	2803-KOLNP-2013-Information under section 8(2) (MANDATORY) [04-04-2018(online)]_16.pdf	2018-04-04
25	2803-KOLNP-2013-Information under section 8(2) (MANDATORY) [04-04-2018(online)].pdf	2018-04-04
26	2803-KOLNP-2013-FER.pdf	2018-04-25
27	2803-KOLNP-2013-PETITION UNDER RULE 137 [23-10-2018(online)].pdf	2018-10-23
28	2803-KOLNP-2013-OTHERS [23-10-2018(online)].pdf	2018-10-23
29	2803-KOLNP-2013-FER_SER_REPLY [23-10-2018(online)].pdf	2018-10-23
30	2803-KOLNP-2013-DRAWING [23-10-2018(online)].pdf	2018-10-23
31	2803-KOLNP-2013-CORRESPONDENCE [23-10-2018(online)].pdf	2018-10-23
32	2803-KOLNP-2013-CLAIMS [23-10-2018(online)].pdf	2018-10-23
33	2803-KOLNP-2013-Information under section 8(2) (MANDATORY) [20-12-2018(online)].pdf	2018-12-20
34	2803-KOLNP-2013-Information under section 8(2) (MANDATORY) [20-03-2019(online)].pdf	2019-03-20
35	2803-KOLNP-2013-Information under section 8(2) (MANDATORY) [13-07-2019(online)].pdf	2019-07-13
36	2803-KOLNP-2013-Information under section 8(2) (MANDATORY) [18-12-2019(online)].pdf	2019-12-18
37	2803-KOLNP-2013-HearingNoticeLetter-(DateOfHearing-21-02-2020).pdf	2020-02-07
38	2803-KOLNP-2013-Correspondence to notify the Controller [08-02-2020(online)].pdf	2020-02-08
39	2803-KOLNP-2013-PETITION UNDER RULE 137 [06-03-2020(online)].pdf	2020-03-06
40	2803-KOLNP-2013-Written submissions and relevant documents [23-03-2020(online)].pdf	2020-03-23
41	2803-KOLNP-2013-Written submissions and relevant documents [25-04-2020(online)].pdf	2020-04-25
42	2803-KOLNP-2013-PatentCertificate18-05-2020.pdf	2020-05-18
43	2803-KOLNP-2013-IntimationOfGrant18-05-2020.pdf	2020-05-18
44	2803-KOLNP-2013-RELEVANT DOCUMENTS [07-09-2021(online)].pdf	2021-09-07
45	2803-KOLNP-2013-RELEVANT DOCUMENTS [09-09-2022(online)].pdf	2022-09-09
46	2803-KOLNP-2013-RELEVANT DOCUMENTS [13-09-2022(online)].pdf	2022-09-13
47	2803-KOLNP-2013-RELEVANT DOCUMENTS [27-09-2022(online)].pdf	2022-09-27
48	2803-KOLNP-2013-PROOF OF ALTERATION [23-05-2023(online)].pdf	2023-05-23
49	2803-KOLNP-2013-RELEVANT DOCUMENTS [07-08-2023(online)].pdf	2023-08-07
50	2803-KOLNP-2013-RELEVANT DOCUMENTS [07-09-2023(online)].pdf	2023-09-07
51	2803-KOLNP-2013-RELEVANT DOCUMENTS [25-09-2023(online)].pdf	2023-09-25
52	2803-KOLNP-2013-FORM-27 [31-07-2024(online)].pdf	2024-07-31

Search Strategy

1	2803kolnp2013_25-01-2018.pdf