
Audio Encoder Device And An Audio Decoder Device Having Efficient Gain Coding In Dynamic Range Control

Abstract: Audio encoder device comprising: an audio encoder (2) configured for producing an encoded audio bitstream (ABS) from an audio signal (AS) comprising consecutive audio frames (AFP, AFR, AFS); a dynamic range control encoder (3) configured for producing an encoded dynamic range control bitstream (DBS) from a dynamic range control sequence (DS) corresponding to the audio signal (AS) and comprising consecutive dynamic range control frames (DFP, DFR, DFS), wherein each dynamic range control frame (DFP, DFR, DFS) of the dynamic range control frames (DFP, DFR, DFS) comprises one or more nodes (A0 … A5; B0 … B2; C0), wherein each node of the one or more nodes (A0 … A5; B0 … B2; C0) comprises gain information (GA0 … GA5; GB0 … GB2; GC0) for the audio signal (AS) and time information (TA0 … TA5; TB0 … TB2; TC0) indicating to which point in time the gain information (GA0 … GA5; GB0 … GB2; GC0) corresponds; wherein the dynamic range control encoder (3) is configured in such a way that the encoded dynamic range control bitstream (DBS) comprises for each dynamic range control frame (DFP, DFR, DFS) of the dynamic range control frames (DFP, DFR, DFS) a corresponding bitstream portion (DFP’, DFR’, DFS’); wherein the dynamic range control encoder (3) is configured for executing a shift procedure, wherein one or more nodes (B1, B2) of the nodes (B0 … B2) of one reference dynamic range control frame (DFR) of the dynamic range control frames (DFP, DFR, DFS) are selected as shifted nodes (B1, B2), wherein a bit representation (B’1, B’2) of each of the one or more shifted nodes (B1, B2) of the one reference dynamic range control frame (DFR) is embedded in the bitstream portion (DFS’) corresponding to the dynamic range control frame (DFS) subsequent to the one reference dynamic range control frame (DFR), wherein a bit representation (B’0) of each remaining node (B0) of the nodes (B0 … B2) of the one reference dynamic range control frame (DFR) of the dynamic range control frames (DFP, DFR, DFS) is embedded into the bitstream portion (DFR’) corresponding to the one reference dynamic range control frame (DFR); wherein the one or more nodes (A0 … A5; B0 … B2; C0) of one of the dynamic range control frames (DFP, DFR, DFS) are selected from a uniform time grid.


Patent Information

Application #
Filing Date
27 March 2025
Publication Number
23/2025
Publication Type
INA
Invention Field
ELECTRONICS
Status
Email
Parent Application

Applicants

FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
Hansastraße 27c 80686 München Germany

Inventors

1. KÜCH, Fabian
Schützenweg 13 91054 Erlangen Germany
2. UHLE, Christian
Hoher Rain 28 92289 Ursensollen Germany
3. KRATSCHMER, Michael
An der Leiten 10 90765 Fürth Germany
4. NEUGEBAUER, Bernhard
Eisenstraße 31 91054 Buckenhof Germany
5. MEIER, Michael
Höchstadter Straße 13 91086 Aurachtal Germany
6. SCHREINER, Stephan
Betzenberg 23 92262 Birgland Germany

Specification

Dynamic range control (DRC) in the context of this document refers to a digital signal processing technique to reduce the dynamic range of audio signals in a controlled way [1]. The desired reduction of the dynamic range is achieved by reducing the level of loud sound components and/or amplifying soft parts of the audio signals.
A typical application for DRC is to adapt the dynamic properties of an audio signal to a listening environment. For example, when listening to music in a noisy environment, the dynamic range should be reduced in order to allow for an overall signal amplification without driving the resulting amplified signal into clipping. In this case, high signal peaks should be attenuated, e.g. by means of a limiter. Additionally, soft signal components should be amplified relative to the loud parts in order to improve their intelligibility in a noisy listening environment.
It is an object of the present invention to provide an enhanced concept for dynamic range control in the context of audio transmission.
This object is achieved by an audio encoder device comprising:
an audio encoder configured for producing an encoded audio bitstream from an audio signal comprising consecutive audio frames;
a dynamic range control encoder configured for producing an encoded dynamic range control bitstream from a dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames comprises one or more nodes, wherein each node of the one or more nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds;
wherein the dynamic range control encoder is configured in such a way that the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion;
wherein the dynamic range control encoder is configured for executing a shift procedure, wherein one or more nodes of the nodes of one reference dynamic range control frame of the dynamic range control frames are selected as shifted nodes, wherein a bit representation of each of the one or more shifted nodes of the one reference dynamic range control frame is embedded in the bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, wherein a bit representation of each remaining node of the nodes of the one reference dynamic range control frame of the dynamic range control frames is embedded into the bitstream portion corresponding to the one reference dynamic range control frame.
The invention addresses the situation of audio transmission using coding of the audio signal, wherein the gain information is not directly applied to the audio signal, but is also encoded and transmitted together with the encoded audio signal. At the decoder, both the audio signal and the gain information may be decoded and the gain information may be applied to the corresponding audio signal. As explained in more detail below, the invention achieves an efficient coding of the gain information. More precisely, it avoids bitrate peaks in the encoded dynamic range control bitstream.
The process of applying dynamic range control to an audio signal can be expressed by a simple multiplication of the audio signal x(k) by a time-variant gain value g(k):
$$y(k) = g(k)\,x(k)$$
where k denotes a sample time index. The value of the gain g(k) may be computed e.g. based on a short-term estimate of the root-mean square of the audio signal x(k). More details about strategies to determine suitable gain values are discussed in [1]. In the following we refer to the time-variant gains g(k) as a gain sequence.
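The sample-wise multiplication described above can be illustrated with a minimal sketch. The function name `apply_drc_gain` and the example values are hypothetical and only serve to show the relation between x(k), g(k) and the output y(k); how g(k) is derived is not shown.

```python
def apply_drc_gain(x, g):
    """Return y(k) = g(k) * x(k) for a signal x and a gain sequence g of equal length."""
    if len(x) != len(g):
        raise ValueError("signal and gain sequence must have the same length")
    return [gk * xk for gk, xk in zip(g, x)]


# Hypothetical example: one loud sample is attenuated, one soft sample is amplified.
audio = [0.5, -1.0, 0.25, 0.8]
gains = [1.0, 0.5, 2.0, 0.5]
output = apply_drc_gain(audio, gains)
```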
In the following, the coding of dynamic range control gain sequences is explained. First, the dynamic range control gain sequence is divided into so-called dynamic range control frames of gain samples, each containing a fixed number of gain samples. Usually, the temporal frame size for the dynamic range control frames is chosen to be equal to the temporal size of an audio frame of the corresponding audio encoder. Within each dynamic range control frame, so-called nodes are selected, preferably on a uniform time grid. The spacing of this grid defines the highest available time resolution, i.e., the minimum distance in samples between two nodes corresponds to the highest available time resolution. Each node is represented by the sample position within the dynamic range control frame, the gain information, which may be expressed as a gain value, for that position and optionally information about the slope of the gain values at the node positions. For the following discussion it will be useful to define the maximum number of nodes that can be selected within one frame.
The dynamic range control encoder encodes the gain information from the nodes, e.g., by using quantized differential values of pairs of consecutive gain nodes. At the decoder, the original gain sequence is reconstructed as well as possible by using spline interpolation or linear interpolation based on the transmitted information of the nodes (gain value, sample position within the dynamic range control frame, and slope information if applicable).
An efficient approach for encoding the dynamic range control gain sequence is to use a quantized value of the gain difference (typically in dB) of pairs of consecutive nodes, as well as the time difference of the sample positions of these nodes within the considered dynamic range control frame. The slope information is usually not represented as a difference between two nodes. Since there is no preceding node for the first node within a frame, its gain value is not encoded in a differential way, but is encoded explicitly. The time difference of the first node is usually determined as the offset to the beginning of the dynamic range control frame.
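The differential representation described above can be sketched as follows. This is an illustration under assumptions, not the codec's actual syntax: nodes are hypothetical `(sample_position, gain)` tuples with already quantized integer gain indices, and the tuple layout `(time_delta, value, is_absolute)` is invented for the example.

```python
def encode_frame_nodes(nodes):
    """Differentially encode the nodes of one frame.

    nodes: list of (sample_position, quantized_gain), sorted by position.
    The first node carries its gain explicitly and its time offset from the
    frame start; every later node carries differences to its predecessor.
    """
    encoded = []
    prev_pos, prev_gain = 0, None
    for pos, gain in nodes:
        if prev_gain is None:
            encoded.append((pos - prev_pos, gain, True))
        else:
            encoded.append((pos - prev_pos, gain - prev_gain, False))
        prev_pos, prev_gain = pos, gain
    return encoded


def decode_frame_nodes(encoded):
    """Invert encode_frame_nodes by accumulating time and gain differences."""
    nodes = []
    pos, gain = 0, 0
    for time_delta, value, is_absolute in encoded:
        pos += time_delta
        gain = value if is_absolute else gain + value
        nodes.append((pos, gain))
    return nodes


frame_nodes = [(4, -6), (12, -3), (20, 0)]
coded = encode_frame_nodes(frame_nodes)
```

Small differences between neighbouring nodes are what make a subsequent entropy code (for example a Huffman table) effective.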
The encoder may then assign a fixed code word, e.g. of a pre-defined Huffman table (code book), to each of the gain and time differences of pairs of nodes.
At the dynamic range control decoder, the dynamic range control bitstream is decoded and the relevant information (gain value, sample position within the dynamic range control frame, and slope information if applicable) at the positions of the transmitted nodes is reconstructed. The gain values for the remaining gain samples within a frame are obtained by interpolation between pairs of transmitted and decoded nodes. The interpolation can be based on splines if the slope information of the gain nodes has been transmitted or, alternatively, on linear interpolation if only the gain differences between pairs of nodes are available and the slope information is discarded.
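The linear-interpolation variant of this reconstruction might look like the sketch below. It assumes that the gains are interpolated directly in the domain in which they were transmitted (for example dB) and that the node list brackets the requested sample range; the function name and node layout are hypothetical. Spline interpolation using slope information is not shown.

```python
def interpolate_gains(nodes, start, stop):
    """Linearly interpolate a gain for every sample position in range(start, stop).

    nodes: list of (sample_position, gain), sorted by position, with the first
    node at or before `start` and the last node at or after `stop - 1`.
    """
    if len(nodes) < 2 or nodes[0][0] > start or nodes[-1][0] < stop - 1:
        raise ValueError("nodes must bracket the requested sample range")
    gains = []
    seg = 0
    for k in range(start, stop):
        # Advance to the node pair that encloses sample k.
        while nodes[seg + 1][0] < k:
            seg += 1
        (p0, g0), (p1, g1) = nodes[seg], nodes[seg + 1]
        gains.append(g0 + (g1 - g0) * (k - p0) / (p1 - p0))
    return gains


decoded_nodes = [(0, 0.0), (4, -4.0), (8, -2.0)]
gain_curve = interpolate_gains(decoded_nodes, 0, 9)
```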
In principle, dynamic range control encoder/decoder chains can be operated in two modes. The so-called full-frame mode refers to the case where, after decoding of a received dynamic range control bitstream corresponding to a reference dynamic range control frame, the gains at each sample position of the reference dynamic range control frame can be immediately determined after interpolation based on the decoded nodes. This implies that a node has to be transmitted at each frame border, i.e., at the sample position corresponding to the last sample of the reference dynamic range control frame. If the dynamic range control frame length is N, this means the last transmitted node has to be located at the sample position N within the reference dynamic range control frame.
The invention avoids this disadvantage as it is based on the second mode, which is referred to as "delay mode". In this case, there is no need for transmitting a node for the last sample position within the reference dynamic range control frame. Therefore, the dynamic range control decoder has to wait for decoding the dynamic range control frame subsequent to the reference dynamic range control frame in order to perform the required interpolation of all gain values following the last node within the reference dynamic range control frame. This is because the information of the first node of the subsequent dynamic range control frame has to be known in order to determine the gain values between the last node of the reference dynamic range control frame and the first node of the subsequent dynamic range control frame via interpolation.
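A minimal sketch of the delay-mode dependency follows. It assumes sample positions counted 1..N within a frame (matching the statement that a border node sits at position N), linear interpolation, and a hypothetical helper name `tail_gains`. The gains after the last node of the reference frame can only be computed once the first node of the subsequent frame is known.

```python
def tail_gains(last_node, next_first_node, frame_len):
    """Gains for the reference-frame samples that follow its last node.

    last_node: (position, gain) of the last node in the reference frame.
    next_first_node: (position, gain) of the first node in the subsequent frame,
    with its position counted from the start of that subsequent frame.
    """
    p0, g0 = last_node
    # Express the next frame's node on the time axis of the reference frame.
    p1, g1 = next_first_node[0] + frame_len, next_first_node[1]
    return [g0 + (g1 - g0) * (k - p0) / (p1 - p0)
            for k in range(p0 + 1, frame_len + 1)]


# Frame length 8: last node at sample 5, next frame starts with a node at sample 1.
remaining = tail_gains((5, 0.0), (1, -4.0), frame_len=8)
```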
In practice, the delay caused by using the delay mode for coding of the dynamic range control information is not an issue. This is because audio codecs that commonly accompany the dynamic range control coding scheme also introduce an inherent delay of one audio frame when subsequently applying the encoding and decoding steps. Important examples of such audio codecs are ISO/IEC 13818-7 Advanced Audio Coding (MPEG-2 AAC), ISO/IEC 14496-3, subpart 4 (MPEG-4 AAC), or ISO/IEC 23003-3, part 3 Unified Speech and Audio Coding (USAC). Such audio coding schemes require the reference audio frame and the audio frame subsequent to the reference audio frame in order to compute (using an overlap-add structure) the correct audio samples corresponding to the reference audio frame.
It is important to note that the number of nodes that are required to sufficiently approximate the original dynamic range control gain sequence varies significantly from dynamic range control frame to dynamic range control frame. This results from the fact that more nodes are required to represent highly time-variant gains compared to the case where only slowly changing gain values have to be encoded. This observation implies that the required bitrate to transmit gain sequences can vary significantly from frame to frame. Some frames may require a large number of nodes to be encoded, resulting in high bitrate peaks. This is not desirable, especially when the audio signal and the dynamic range control gain sequence are transmitted in a joint bitstream comprising the encoded dynamic range control bitstream and the encoded audio bitstream, which should have an almost constant bitrate. Then, a peak in the dynamic range control related bitrate reduces the available bitrate for the audio encoder, which often results in a degradation of the audio quality after decoding. However, with the current state-of-the-art methods for the coding of dynamic range control gain sequences, a reduction of the dynamic range control related bitrate in a certain frame is only achieved by reducing the number of nodes that are selected to represent the gain sequence within that frame. This again may lead to large errors between the original gain sequence and the one that is reconstructed after the dynamic range control decoding process. The invention overcomes these disadvantages by reducing the peak bitrates of the encoded dynamic range control bitstream without increasing the error between the original and the reconstructed dynamic range control sequence.
In this section, the coding of dynamic range control gain sequences according to the invention is presented. The invention allows controlling the peak bitrate required for a reference dynamic range control frame without changing the resulting bitstream sequence compared to the case where the proposed method is not used. The proposed approach exploits the inherent delay of one frame introduced by state-of-the-art audio coders to reduce peaks of the number of nodes within one frame by distributing the transmission of some of the nodes to the next subsequent dynamic range control frame. The details of the proposed method are presented in the following.
As explained above, when combined with an audio coding scheme that introduces a frame delay relative to the dynamic range control gains, the decoded dynamic range control gains are delayed by one frame before being applied to the audio signal. This means that the nodes of the reference dynamic range control frame are applied to the valid audio decoder output at the dynamic range control frame subsequent to the reference dynamic range control frame. This implies that in the default delay mode it is sufficient to transmit the nodes of the reference dynamic range control frame together with the nodes of the dynamic range control frame subsequent to the reference dynamic range control frame and apply the corresponding dynamic range control gains without a delay directly to the corresponding audio output signal at the decoder.
This fact is exploited in the invention in order to reduce the maximum number of nodes transmitted within one dynamic range control frame. According to the invention some of the nodes of the reference dynamic range control frame are shifted to the subsequent dynamic range control frame, which may be done before encoding. As will be discussed in the following, the shifted nodes may be "preceding" the first node in the subsequent dynamic range control frame only for the encoding of the gain differences and the slope information. For the coding of the time difference information a different method may be applied.
According to a preferred embodiment of the invention the shift procedure is initiated in case that a number of the nodes of the reference dynamic range control frame is greater than a predefined threshold value.
According to a preferred embodiment of the invention the shift procedure is initiated in case that a sum of a number of the nodes of the reference dynamic range control frame and a number of shifted nodes from the dynamic range control frame preceding the reference dynamic range control frame to be embedded in the bitstream portion corresponding to the reference dynamic range control frame is greater than a predefined threshold value.
According to a preferred embodiment of the invention the shift procedure is initiated in case that a sum of a number of the nodes of the reference dynamic range control frame and a number of shifted nodes from the dynamic range control frame preceding the reference dynamic range control frame to be embedded in the bitstream portion corresponding to the reference dynamic range control frame is greater than a number of the nodes of the dynamic range control frame subsequent to the reference dynamic range control frame.
Independent of the conditions under which the shift procedure is initiated, the first node of the reference dynamic range control frame should not be shifted to the subsequent dynamic range control frame, as its value is needed for interpolation of the gain control values at the beginning of the reference dynamic range control frame. Furthermore, a node should be shifted only one time in order to avoid a delay when decoding the bitstream.
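The threshold-based trigger and the two constraints (the first node stays, a node is shifted at most once) can be sketched as follows. This is one possible reading under assumptions: the trailing nodes of a frame are the ones chosen for shifting, the number of shifted nodes is the excess over the threshold, and the names `shift_nodes`, `own` and `shifted` are hypothetical.

```python
def shift_nodes(frames, threshold):
    """Distribute nodes over bitstream portions, one portion per frame.

    frames: list of node lists, one list per dynamic range control frame.
    Returns one dict per frame with the nodes shifted in from the previous
    frame ('shifted') and the frame's own nodes kept in place ('own').
    """
    portions = []
    carried = []  # nodes shifted out of the previous frame
    for nodes in frames:
        own = list(nodes)
        # Trigger: own nodes plus nodes shifted in exceed the threshold.
        excess = len(own) + len(carried) - threshold
        # Only own nodes are eligible (no node is shifted twice),
        # and the first own node always stays in its frame.
        n_shift = max(0, min(excess, len(own) - 1))
        shifted_out = own[-n_shift:] if n_shift else []
        if n_shift:
            own = own[:-n_shift]
        portions.append({"shifted": carried, "own": own})
        carried = shifted_out
    return portions


frames = [
    [(2, -1), (9, -2)],
    [(1, -3), (3, -6), (5, -9), (8, -6), (11, -4), (14, -3)],
    [(6, -2)],
]
portions = shift_nodes(frames, threshold=4)
node_counts = [len(p["shifted"]) + len(p["own"]) for p in portions]
```

In this example the node count per portion changes from 2, 6, 1 to 2, 4, 3, which flattens the peak. A real encoder would additionally have to handle nodes shifted out of the very last frame, which this sketch does not do.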
According to a preferred embodiment of the invention the time information of the one or more nodes is represented in such a way that the one or more shifted nodes may be identified by using the time information.
According to a preferred embodiment of the invention the time information of the one or more shifted nodes is represented by a sum of a time difference from a beginning of the dynamic range control frame to which the respective node belongs to the temporal position of the respective node within the dynamic range control frame to which the respective node belongs and an offset value being greater than or equal to a temporal size of the dynamic range control frame subsequent to the respective dynamic range control frame.
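As a small illustration of this time representation, the sketch below adds an offset to the position of a shifted node so that the resulting value falls outside the range of valid positions of the subsequent frame. It assumes positions counted 1..N and uses the frame length itself as the default offset; the function names are hypothetical.

```python
def shifted_time_value(position, frame_len, offset=None):
    """Time value transmitted for a shifted node: original position plus offset."""
    if offset is None:
        offset = frame_len  # assumed default: exactly one frame length
    if offset < frame_len:
        raise ValueError("offset must be at least the temporal frame size")
    if not 1 <= position <= frame_len:
        raise ValueError("position must lie within the frame")
    return position + offset


def is_shifted(time_value, frame_len):
    """A value beyond the frame length can only belong to a shifted node."""
    return time_value > frame_len


coded_time = shifted_time_value(900, frame_len=1024)
```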
According to a preferred embodiment of the invention the gain information of the bit representation of the shifted node, which is at a first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, is represented by an absolute gain value, and the gain information of each bit representation of the shifted nodes at a position after the bit representation of the node, which is at the first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, is represented by a relative gain value which is equal to a difference of a gain value of the bit representation of the respective shifted node and the gain value of the bit representation of the node, which precedes the bit representation of the respective node.
According to a preferred embodiment of the invention, in case that the bit representations of one or more shifted nodes of the reference dynamic range control frame are embedded in the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, the gain information of the bit representation of the node of the subsequent dynamic range control frame at a first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame after the one or more positions of the bit representations of the one or more shifted nodes is represented by a relative gain value which is equal to a difference of a gain value of the bit representation of the respective node and a gain value of the bit representation of the shifted node, which precedes the bit representation of the respective node.
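The gain representation of the two embodiments above can be sketched as follows, assuming that the shifted nodes are placed first in the bitstream portion and that the gains are quantized integer indices. The labels `"abs"` and `"rel"` and the function name are hypothetical.

```python
def encode_portion_gains(shifted_gains, own_gains):
    """Gain coding for one bitstream portion.

    The node at the first position is coded absolutely; every later node,
    including the first own node that follows the shifted nodes, is coded
    relative to the node preceding it in the portion.
    """
    sequence = list(shifted_gains) + list(own_gains)
    coded = []
    for i, gain in enumerate(sequence):
        if i == 0:
            coded.append(("abs", gain))
        else:
            coded.append(("rel", gain - sequence[i - 1]))
    return coded


coded_gains = encode_portion_gains(shifted_gains=[-6, -4], own_gains=[-3, 0])
```

When no nodes were shifted, the first own node takes the first position and is coded absolutely, as in the conventional scheme.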
According to a preferred embodiment of the invention a temporal size of the audio frames is equal to a temporal size of the dynamic range control frames.
According to a preferred embodiment of the invention the one or more nodes of one of the dynamic range control frames are selected from a uniform time grid.
According to a preferred embodiment of the invention each node of the one or more nodes comprises slope information.
According to a preferred embodiment of the invention the dynamic range control encoder is configured for encoding the nodes using an entropy encoding technique, such as Huffman coding or arithmetic coding.
The encoder may assign a fixed code word, e.g. of a pre-defined Huffman table (code book), to each of the gain and time differences of pairs of nodes. Examples of suitable Huffman tables for encoding the time differences of pairs of consecutive nodes are given in Table 1 and Table 2, respectively.
Table 1: Example of a Huffman table for the coding of time differences of DRC gain nodes

Codeword size [bits] | Time difference binary encoding | Time difference tDrcDelta in multiples of deltaTmin
[illegible] | 0x000 | nNodesMax
[illegible] | 0x004 | [illegible]
[illegible] | 0x014+(a-2) | a=[2 5]
[illegible] | [illegible] | a=[6 13]
[illegible] | 0xE00+(a-14) | a=[14 2·nNodesMax-1]
Table 2: Example of a Huffman table for the coding of time differences of DRC gain nodes, where Z = ceil(log2(2·nNodesMax))

Encoding | Size | tDrcDelta | Range
00 | 2 bits | tDrcDelta = 1 | 1
{01, μ} | {2 bits, 2 bits} | tDrcDelta = μ+2 | 2 ... 5
{10, μ} | {2 bits, 3 bits} | tDrcDelta = μ+6 | 6 ... 13
{11, μ} | {2 bits, Z bits} | tDrcDelta = μ+14 | 14 ... 2·nNodesMax
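A prefix code of the kind shown in Table 2 could be implemented as in the sketch below. It assumes the reading of the table in which a 2-bit prefix selects the range and is followed by μ coded with 2, 3 or Z bits, with Z = ceil(log2(2·nNodesMax)). Bit strings are represented as Python strings of "0" and "1" for readability, and the function names are hypothetical.

```python
def _z_bits(n_nodes_max):
    # (m - 1).bit_length() equals ceil(log2(m)) for integers m >= 1.
    return (2 * n_nodes_max - 1).bit_length()


def encode_tdrc_delta(t, n_nodes_max):
    """Map a time difference t (in grid steps) to its code word."""
    z = _z_bits(n_nodes_max)
    if t == 1:
        return "00"
    if 2 <= t <= 5:
        return "01" + format(t - 2, "02b")
    if 6 <= t <= 13:
        return "10" + format(t - 6, "03b")
    if 14 <= t <= 2 * n_nodes_max:
        return "11" + format(t - 14, f"0{z}b")
    raise ValueError("time difference out of range")


def decode_tdrc_delta(bits, n_nodes_max):
    """Read one code word from the front of `bits`.

    Returns (time_difference, number_of_bits_consumed).
    """
    prefix = bits[:2]
    if prefix == "00":
        return 1, 2
    width, base = {"01": (2, 2), "10": (3, 6), "11": (_z_bits(n_nodes_max), 14)}[prefix]
    mu = int(bits[2:2 + width], 2)
    return mu + base, 2 + width


example_word = encode_tdrc_delta(20, n_nodes_max=32)
```

Short time differences, which are assumed to be the most frequent, receive the shortest code words.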
In a further aspect of the invention the objective is achieved by an audio decoder device comprising:
an audio decoder configured for decoding an encoded audio bitstream in order to reproduce an audio signal comprising consecutive audio frames;
a dynamic range control decoder configured for decoding an encoded dynamic range control bitstream in order to reproduce a dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames;
wherein the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion;
wherein the encoded dynamic range control bitstream comprises bit representations of nodes, wherein each bit representation of one node of the nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds;
wherein the encoded dynamic range control bitstream comprises bit representations of shifted nodes selected from the nodes of one reference dynamic range control frame of the dynamic range control frames, which are embedded in a bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, wherein the bit representation of each remaining node of the nodes of the one reference dynamic range control frame of the dynamic range control frames is embedded into the bitstream portion corresponding to the one reference dynamic range control frame; and
wherein the dynamic range control decoder is configured for decoding the bit representation of each remaining node of the remaining nodes of the one reference dynamic range control frame of the dynamic range control frames in order to reproduce each remaining node of the one reference dynamic range control frame of the dynamic range control frames, for decoding the bit representation of each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames in order to reproduce each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames, and for combining the reproduced remaining nodes and the reproduced shifted nodes in order to reconstruct the reference dynamic range control frame.
The dynamic range control decoder receives the dynamic range control bitstream. The dynamic range control bitstream, which corresponds to the node information (sample position, gain value, and slope information if applicable), may be decoded in the following way.
A value for the time difference between two nodes (e.g. as an integer multiple of the minimum distance between two nodes) is determined from the received code word based e.g. on the rules shown in a Huffman code book. The corresponding sample position of the currently decoded node is obtained by adding the time difference value to the sample position value computed for the previous node.
After decoding the nodes of the reference dynamic range control frame the nodes of the subsequent dynamic range control frame are decoded.
If the determined sample position within the subsequent dynamic range control frame corresponds to a value that is larger than the length of a subsequent dynamic range control frame, the dynamic range control decoder knows that the current temporal node information refers to a node originally located in the reference dynamic range control frame.
To obtain the correct sample position within the reference dynamic range control frame, an offset is subtracted from the computed sample position. A practical example is to subtract the value that corresponds to the length of a dynamic range control frame (which implies that the encoder has added the same value to the original sample position). A typical example for the offset value is the temporal size of a dynamic range control frame.
After decoding and, if applicable, correcting the time information of all nodes in the entire subsequent dynamic range control frame, the decoder knows how many nodes have been shifted back to the reference dynamic range control frame (without explicitly providing this information at the encoder) and on which sample positions they are located within the reference dynamic range control frame.
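The decoder-side identification of shifted nodes from the time information alone can be sketched as follows. It assumes positions counted 1..N, an offset equal to the frame length unless stated otherwise, and that each sample position is obtained by accumulating the decoded time differences; the function name is hypothetical.

```python
def split_decoded_positions(time_deltas, frame_len, offset=None):
    """Accumulate time differences and separate shifted nodes from own nodes.

    Returns (shifted_positions, own_positions). Shifted positions are already
    corrected, i.e. expressed within the reference frame.
    """
    if offset is None:
        offset = frame_len  # assumed: encoder added exactly one frame length
    shifted, own = [], []
    position = 0
    for delta in time_deltas:
        position += delta
        if position > frame_len:
            # Beyond the frame: node originally located in the reference frame.
            shifted.append(position - offset)
        else:
            own.append(position)
    return shifted, own


shifted_pos, own_pos = split_decoded_positions([3, 5, 12, 4], frame_len=16)
```

The number of shifted nodes follows from `len(shifted_pos)`, so no explicit count needs to be transmitted.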
The dynamic range control decoder further determines the gain value information of all nodes of a received frame by decoding the differential gain information in the bitstream.
From the decoding step of the time information, the decoder knows how many of the decoded gain values have to be assigned to the nodes of the reference dynamic range control frame (and to which sample position) and which gain values are assigned to nodes in the subsequent dynamic range control frame.
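The assignment of decoded gain values to the two frames might be sketched as below. It assumes that the gains of the shifted nodes come first in the bitstream portion, that the first entry is absolute and all later ones are differences, and that the positions were obtained beforehand from the time information. All names and the `("abs" | "rel", value)` layout are hypothetical.

```python
def decode_and_assign_gains(coded, shifted_positions, own_positions):
    """Decode the gains of one portion and attach them to their nodes.

    coded: list of ("abs" | "rel", value) in bitstream order.
    Returns (reference_frame_nodes, subsequent_frame_nodes) as lists of
    (position, gain).
    """
    if len(coded) != len(shifted_positions) + len(own_positions):
        raise ValueError("number of gains and positions must match")
    gains = []
    current = 0
    for kind, value in coded:
        current = value if kind == "abs" else current + value
        gains.append(current)
    n_shifted = len(shifted_positions)
    reference_nodes = list(zip(shifted_positions, gains[:n_shifted]))
    subsequent_nodes = list(zip(own_positions, gains[n_shifted:]))
    return reference_nodes, subsequent_nodes


coded_portion = [("abs", -6), ("rel", 2), ("rel", 1), ("rel", 3)]
ref_nodes, next_nodes = decode_and_assign_gains(coded_portion, [4, 8], [3, 8])
```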
The decoding of the slope information and the assignment to the correct nodes are performed analogously to the decoding process of the gain values.
After decoding all nodes of the subsequent dynamic range control frame, it can be assured that all nodes required for computing the gain values for each sample of the reference dynamic range control frame via interpolation are available. After the interpolation step, the dynamic range control gain values for each sample can be applied to the corresponding correct audio samples.
According to a preferred embodiment of the invention the dynamic range
control decoder is configured for identifying the one or more shifted nodes by
using the time information.
According to a preferred embodiment of the invention the dynamic range control decoder is configured for decoding the time information of the one or more shifted nodes, which is represented by a sum of a time difference from a beginning of the dynamic range control frame to which the respective node belongs to the temporal position of the respective node within the dynamic range control frame to which the respective node belongs and an offset value being greater than or equal to a temporal size of the dynamic range control frame subsequent to the respective dynamic range control frame.
According to a preferred embodiment of the invention the dynamic range control decoder is configured for decoding the gain information of the bit representation of the shifted node which is at a first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, this gain information being represented by an absolute gain value, and for decoding the gain information of each bit representation of the shifted nodes at a position after the bit representation of the node which is at the first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, this gain information being represented by a relative gain value which is equal to a difference of a gain value of the bit representation of the respective shifted node and the gain value of the bit representation of the node, which precedes the bit representation of the respective node.
According to a preferred embodiment of the invention the dynamic range control decoder is configured for decoding the gain information of the bit representation of the node of the subsequent dynamic range control frame at a first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame after the one or more positions of the bit representations of the one or more shifted nodes, this gain information being represented by a relative gain value which is equal to a difference of a gain value of the bit representation of the respective node and a gain value of the bit representation of the shifted node, which precedes the bit representation of the respective node.
According to a preferred embodiment of the invention a temporal size of the audio frames is equal to a temporal size of the dynamic range control frames.
According to a preferred embodiment of the invention the one or more nodes of one of the dynamic range control frames are selected from a uniform time grid.
According to a preferred embodiment of the invention each node of the one or more nodes comprises slope information.
According to a preferred embodiment of the invention the dynamic range control decoder is configured for decoding the bit representations of the nodes using an entropy decoding technique.
The objective is further achieved by a system comprising an audio encoder device according to the invention and an audio decoder device according to the invention.
The invention further provides a method for operating an audio encoder, the method comprising the steps:
producing an encoded audio bitstream from an audio signal comprising consecutive audio frames;
producing an encoded dynamic range control bitstream from a dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames comprises one or more nodes, wherein each node of the one or more nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds;
wherein the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion;
executing a shift procedure, wherein one or more nodes of the nodes of one reference dynamic range control frame of the dynamic range control frames are selected as shifted nodes, wherein a bit representation of each of the one or more shifted nodes of the one reference dynamic range control frame is embedded in the bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, wherein a bit representation of each remaining node of the nodes of the one reference dynamic range control frame of the dynamic range control frames is embedded into the bitstream portion corresponding to the one reference dynamic range control frame.
The invention further provides a method for operating an audio decoder, the method comprising the steps:
decoding an encoded audio bitstream in order to reproduce an audio signal comprising consecutive audio frames;
decoding an encoded dynamic range control bitstream in order to reproduce a dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames;
wherein the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion;
wherein the encoded dynamic range control bitstream comprises bit representations of nodes, wherein each bit representation of one node of the nodes comprises gain information for the audio signal AS and time information indicating to which point in time the gain information corresponds;
wherein the encoded dynamic range control bitstream comprises bit representations of shifted nodes selected from the nodes of one reference dynamic range control frame of the dynamic range control frames, which are embedded in a bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, wherein the bit representation of each remaining node of the nodes of the one reference dynamic range control frame of the dynamic range control frames is embedded into the bitstream portion corresponding to the one reference dynamic range control frame; and
wherein the bit representation of each remaining node of the remaining nodes of the one reference dynamic range control frame of the dynamic range control frames is decoded in order to reproduce each remaining node of the one reference dynamic range control frame of the dynamic range control frames;
wherein the bit representation of each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames is decoded in order to reproduce each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames; and
wherein the reproduced remaining nodes and the reproduced shifted nodes are combined in order to reconstruct the reference dynamic range control frame.
In another aspect the invention provides a program for, when running on a processor, executing the method according to the invention.
Preferred embodiments of the invention are subsequently discussed with respect to the accompanying drawings, in which:
Fig. 1 illustrates an embodiment of an audio encoder device according to the invention in a schematic view;
Fig. 2 illustrates the principle of dynamic range control applied in the context of audio coding in a schematic view;
Fig. 3 illustrates the different modes for the coding of dynamic range control gain sequences in a schematic view;
Fig. 4 illustrates the application of dynamic range control in the context of audio coding in a schematic view;
Fig. 5 illustrates a shift procedure for nodes according to the invention in a schematic view;
Fig. 6 illustrates the coding of time information according to the invention in a schematic view;
Fig. 7 illustrates the coding of gain information according to the invention in a schematic view;
Fig. 8 illustrates the coding of slope information according to the invention in a schematic view; and
Fig. 9 illustrates an embodiment of an audio decoder device according to the invention in a schematic view.
Fig. 1 illustrates an embodiment of an audio encoder device 1 according to the invention in a schematic view. The audio encoder device 1 comprises:
an audio encoder 2 configured for producing an encoded audio bitstream ABS from an audio signal AS comprising consecutive audio frames AFP, AFR, AFS;
a dynamic range control encoder 3 configured for producing an encoded dynamic range control bitstream DBS from a dynamic range control sequence DS corresponding to the audio signal AS and comprising consecutive dynamic range control frames DFP, DFR, DFS, wherein each dynamic range control frame DFP, DFR, DFS of the dynamic range control frames DFP, DFR, DFS comprises one or more nodes A0 ... A5; B0 ... B2; C0, wherein each node of the one or more nodes A0 ... A5; B0 ... B2; C0 comprises gain information GA0 ... GA5; GB0 ... GB2; GC0 for the audio signal AS and time information TA0 ... TA5; TB0 ... TB2; TC0 indicating to which point in time the gain information GA0 ... GA5; GB0 ... GB2; GC0 corresponds;
wherein the dynamic range control encoder 3 is configured in such way that the encoded dynamic range control bitstream DBS comprises for each dynamic range control frame DFP, DFR, DFS of the dynamic range control frames DFP, DFR, DFS a corresponding bitstream portion DFP', DFR', DFS';
wherein the dynamic range control encoder 3 is configured for executing a shift procedure, wherein one or more nodes B1, B2 of the nodes B0 ... B2 of one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS are selected as shifted nodes B1, B2, wherein a bit representation B'1, B'2 of each of the one or more shifted nodes B1, B2 of the one reference dynamic range control frame DFR is embedded in the bitstream portion DFS' corresponding to the dynamic range control frame DFS subsequent to the one reference dynamic range control frame DFR, wherein a bit representation B'0 of each remaining node B0 of the nodes B0 ... B2 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS is embedded into the bitstream portion DFR' corresponding to the one reference dynamic range control frame DFR.
The invention allows controlling the peak bitrate required for a reference dynamic range control frame DFR without changing the resulting bitstream sequence DBS compared to the case where the proposed method is not used. The proposed approach exploits the inherent delay of one frame introduced by state-of-the-art audio coders to reduce peaks in the number of nodes within one frame by distributing the transmission of some of the nodes to the next subsequent dynamic range control frame. The details of the proposed method are presented in the following.
As explained above, when combined with an audio coding scheme that introduces a frame delay relative to the dynamic range control gains, the decoded dynamic range control gains are delayed by one frame before being applied to the audio signal. This means that the nodes of the reference dynamic range control frame are applied to the valid audio decoder output at the dynamic range control frame subsequent to the reference dynamic range control frame. This implies that in the default delay mode it is sufficient to transmit the nodes of the reference dynamic range control frame together with the nodes of the dynamic range control frame subsequent to the reference dynamic range control frame and apply the corresponding dynamic range control gains without a delay directly to the corresponding audio output signal at the decoder.
This fact is exploited in the invention in order to reduce the maximum number of nodes transmitted within one dynamic range control frame. According to the invention some of the nodes of the reference dynamic range control frame are shifted to the subsequent dynamic range control frame, which may be done before encoding. As will be discussed in the following, the shifted nodes may be "preceding" the first node in the subsequent dynamic range control frame only for the encoding of the gain differences and the slope information. For the coding of the time difference information, a different method may be applied.
In the example shown in Fig. 1 the preceding dynamic range control frame DFP contains six nodes A0 ... A5, of which the nodes A4, A5 are shifted into the bitstream portion DFR'. Furthermore, the reference dynamic range control frame DFR contains three nodes B0 ... B2. The sum of the number of the shifted nodes A4, A5 and the nodes B0 ... B2 of the reference dynamic range control frame DFR is equal to five, which is bigger than the number of the nodes C0 of the subsequent dynamic range control frame DFS, so that a shift procedure is initiated in such way that nodes B1, B2 are shifted into the bitstream portion DFS'. Although the maximum number of nodes within the dynamic range control frames DFS, DFR, DFP is equal to six, the maximum number of nodes within the bitstream portions DFS', DFR', DFP' is only equal to four, so that a bitstream peak is avoided.
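The node counts of this example can be checked with a minimal Python sketch. The function name embed_with_shift, the node labels as strings, and the list shift_counts (how many trailing nodes of each frame are deferred) are assumptions made for illustration; the sketch takes the number of shifted nodes as given and does not decide it.

```python
def embed_with_shift(frames, shift_counts):
    """Distribute nodes over bitstream portions.

    frames[i] lists the nodes of dynamic range control frame i;
    shift_counts[i] is the number of trailing nodes of frame i that are
    embedded in the bitstream portion of frame i + 1 instead.
    Returns the portions and any nodes still waiting for a later portion.
    """
    portions = []
    carried = []  # shifted nodes coming from the preceding frame
    for nodes, n_shift in zip(frames, shift_counts):
        split = len(nodes) - n_shift
        portions.append(carried + nodes[:split])
        carried = nodes[split:]
    return portions, carried


# Frames DFP, DFR, DFS of the example in Fig. 1
frames = [
    ["A0", "A1", "A2", "A3", "A4", "A5"],
    ["B0", "B1", "B2"],
    ["C0"],
]
portions, pending = embed_with_shift(frames, shift_counts=[2, 2, 0])
max_frame_nodes = max(len(f) for f in frames)
max_portion_nodes = max(len(p) for p in portions)
```

Under these assumptions the portions DFP', DFR', DFS' hold four, three and three node representations respectively, compared with a peak of six nodes in the unshifted frames.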
According to a preferred embodiment of the invention a temporal size of the audio frames AFP, AFR, AFS is equal to a temporal size of the dynamic range control frames DFP, DFR, DFS.
According to a preferred embodiment of the invention the one or more nodes A0 ... A5; B0 ... B2; C0 of one of the dynamic range control frames DFP, DFR, DFS are selected from a uniform time grid.
According to a preferred embodiment of the invention the dynamic range control encoder 3 is configured for encoding the nodes A0 ... A5; B0 ... B2; C0 using an entropy encoding technique.
In a further aspect the invention provides a method for operating an audio encoder 1, the method comprising the steps:
producing an encoded audio bitstream ABS from an audio signal AS comprising consecutive audio frames AFP, AFR, AFS;
producing an encoded dynamic range control bitstream DBS from a dynamic range control sequence DS corresponding to the audio signal AS and comprising consecutive dynamic range control frames DFP, DFR, DFS, wherein each dynamic range control frame DFP, DFR, DFS of the dynamic range control frames DFP, DFR, DFS comprises one or more nodes A0 ... A5; B0 ... B2; C0, wherein each node of the one or more nodes A0 ... A5; B0 ... B2; C0 comprises gain information GA0 ... GA5; GB0 ... GB2; GC0 for the audio signal AS and time information TA0 ... TA5; TB0 ... TB2; TC0 indicating to which point in time the gain information corresponds;
wherein the encoded dynamic range control bitstream DBS comprises for each dynamic range control frame DFP, DFR, DFS of the dynamic range control frames DFP, DFR, DFS a corresponding bitstream portion DFP', DFR', DFS';
executing a shift procedure, wherein one or more nodes B1, B2 of the nodes B0 ... B2 of one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS are selected as shifted nodes B1, B2, wherein a bit representation B'1, B'2 of each of the one or more shifted nodes B1, B2 of the one reference dynamic range control frame DFR is embedded in the bitstream portion DFS' corresponding to the dynamic range control frame DFS subsequent to the one reference dynamic range control frame DFR, wherein a bit representation B'0 of each remaining node B0 of the nodes B0 ... B2 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS is embedded into the bitstream portion DFR' corresponding to the one reference dynamic range control frame DFR.
Fig. 2 illustrates the principle of dynamic range control applied in the context of audio coding in a schematic view.
The process of applying DRC to a signal can be expressed by a simple multiplication of the audio signal x(k) by a time-variant gain value g(k):
$y(k) = g(k)\,x(k)$ (1)
where k denotes a sample time index. The value of the gain g(k) is computed, e.g., based on a short-term estimate of the root-mean-square of the input signal x(k). More details about strategies to determine suitable gain values are discussed in [1]. In the following we refer to the time-variant gains g(k) as a gain sequence.
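Equation (1) can be illustrated with a minimal Python sketch. The function name apply_drc and the sample values are assumptions chosen for illustration, and the gains are taken as linear factors rather than dB values.

```python
def apply_drc(x, g):
    """Multiply each sample x(k) by its time-variant gain g(k), as in equation (1)."""
    if len(x) != len(g):
        raise ValueError("signal and gain sequence must have the same length")
    return [gk * xk for gk, xk in zip(g, x)]


x = [0.5, -1.0, 0.25, 0.8]  # illustrative input samples x(k)
g = [1.0, 0.5, 1.0, 0.5]    # illustrative gain sequence g(k)
y = apply_drc(x, g)
```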
The invention refers to an application scenario where both the audio signal AS and the dynamic range control sequence DS are coded and transmitted. In this case the dynamic range control gains are not directly applied to the audio signal AS but encoded and transmitted together with the encoded audio signal ABS. At the decoder 4, both the audio signal AS and the dynamic range control sequence DS are decoded and the dynamic range control information is applied to the corresponding audio signal AS.
In one aspect the invention provides a system comprising an audio encoder device 1 according to the invention and an audio decoder device 4 according to the invention.
Fig. 3 illustrates the different modes for the coding of dynamic range control gain sequences in a schematic view, namely the full-frame mode (A) and the delay mode (B). Gain nodes received in frame n are shown as circles and gain nodes received in frame n+1 are shown as squares. The solid line illustrates the interpolated DRC gain up to DRC frame n+1.
In principle, the dynamic range control encoder/decoder chain can be operated in two modes. The so-called full-frame mode refers to the case where, after decoding of a received dynamic range control bitstream corresponding to a specific dynamic range control frame, the gains at each sample position of the dynamic range control frame can be immediately determined after interpolation based on the decoded nodes. This implies that a node has to be transmitted at each frame border, i.e., at the sample position corresponding to the last sample of the dynamic range control frame. If the dynamic range control frame length is N, this means the last transmitted node has to be located at the sample position N within that frame. This is illustrated at the top in Fig. 3, denoted by "A". As shown, the dynamic range control gains of the n-th frame can immediately be applied to the corresponding audio frame.
The second mode is referred to as "delay mode" and it is illustrated in the lower part "B" of Fig. 3. In this case, there is no node transmitted for the last sample position within frame n. Therefore, the DRC decoder has to wait for decoding the DRC frame n+1 in order to perform the required interpolation of all gain values following the last node within frame n. This is because the information of the first node of frame n+1 has to be known to perform the interpolation between the last node of frame n and the first node in frame n+1 in order to determine the gain values in between via interpolation.
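The dependency on frame n+1 in delay mode can be sketched as follows. This is a minimal illustration assuming linear interpolation, sample positions counted from 1 to the frame length, and nodes given as (position, gain in dB) pairs; the function name interpolate_tail and all values are hypothetical.

```python
def interpolate_tail(last_node, next_first_node, frame_size):
    """Gains for the samples of frame n that follow its last node.

    last_node is the last node of frame n, next_first_node the first node
    of frame n + 1, each given as (position within its own frame, gain).
    The result cannot be computed before frame n + 1 has been decoded.
    """
    t0, g0 = last_node
    t1, g1 = next_first_node
    t1_abs = t1 + frame_size  # position of the next node on the time axis of frame n
    return [
        g0 + (g1 - g0) * (t - t0) / (t1_abs - t0)
        for t in range(t0 + 1, frame_size + 1)
    ]


# Last node of frame n at position 6 (0 dB), first node of frame n + 1 at
# position 2 (-4 dB), frame length 8 samples.
tail_gains = interpolate_tail((6, 0.0), (2, -4.0), frame_size=8)
```

In full-frame mode the same interpolation would only ever run between nodes of the same frame, because a node sits on the last sample of every frame.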
Fig. 4 illustrates the application of dynamic range control in the context of audio coding in a schematic view, where the audio coder introduces one frame delay relative to the dynamic range coding scheme.
Fig. 5 illustrates a shift procedure for nodes according to the invention in a schematic view. The left-hand side shows the situation when using a state-of-the-art approach, whereas the right-hand side shows the proposed method, where each square corresponds to a node A0 ... A5; B0 ... B2; C0.
According to a preferred embodiment of the invention the shift procedure is initiated in case that a number of the nodes B0 ... B2 of the reference dynamic range control frame DFR is greater than a predefined threshold value.
According to a preferred embodiment of the invention the shift procedure is initiated in case that a sum of a number of the nodes B0 ... B2 of the reference dynamic range control frame DFR and a number of shifted nodes A4, A5 from the dynamic range control frame DFP preceding the reference dynamic range control frame DFR to be embedded in the bitstream portion DFR' corresponding to the reference dynamic range control frame DFR is greater than a predefined threshold value.
According to a preferred embodiment of the invention the shift procedure is initiated in case that a sum of a number of the nodes B0 ... B2 of the reference dynamic range control frame DFR and a number of shifted nodes A4, A5 from the dynamic range control frame DFP preceding the reference dynamic range control frame DFR to be embedded in the bitstream portion DFR' corresponding to the reference dynamic range control frame DFR is greater than a number of the nodes C0 of the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR.
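The three initiation criteria can be written as simple predicates. The sketch below is illustrative only: the function names and the threshold value of 4 are assumptions, and the node counts are those of the example frames DFP, DFR, DFS.

```python
def frame_exceeds_threshold(num_ref_nodes, threshold):
    """First criterion: the reference frame alone has too many nodes."""
    return num_ref_nodes > threshold


def portion_exceeds_threshold(num_ref_nodes, num_shifted_in, threshold):
    """Second criterion: own nodes plus nodes shifted in from the preceding frame."""
    return num_ref_nodes + num_shifted_in > threshold


def portion_exceeds_next_frame(num_ref_nodes, num_shifted_in, num_next_nodes):
    """Third criterion: compare against the node count of the subsequent frame."""
    return num_ref_nodes + num_shifted_in > num_next_nodes


# Reference frame DFR: three own nodes (B0 ... B2), two nodes shifted in
# (A4, A5), one node (C0) in the subsequent frame DFS.
THRESHOLD = 4  # hypothetical value, not taken from the description
first = frame_exceeds_threshold(3, THRESHOLD)
second = portion_exceeds_threshold(3, 2, THRESHOLD)
third = portion_exceeds_next_frame(3, 2, 1)
```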
As explained above, when combined with an audio coding scheme that introduces a frame delay relative to the dynamic range control frames, the decoded dynamic range control gains are delayed by one frame before being applied to the audio signal. Considering the left-hand side in Fig. 5, this means that the nodes Ai of the n-th frame are applied to the valid audio decoder output at frame n+1. This implies that in the default delay mode it would be sufficient to transmit the nodes Ai together with the node B0 in frame n+1 and apply the corresponding DRC gains without a delay directly to the corresponding audio output signal at the decoder.
This fact is exploited in the proposed method to reduce the maximum number of nodes transmitted within one frame. This is illustrated on the right-hand side in Fig. 5. The nodes A4 and A5 are shifted to frame n+1 before encoding, i.e., the maximum number of nodes in frame n is reduced from 6 to 4 in the given example. As will be discussed in the following, the nodes A4 and A5 are "preceding" the first node in frame n+1, i.e., B0, only for the encoding of the gain differences and the slope information. For the coding of the time difference information, a different method has to be applied.
Fig. 6 illustrates the coding of time information according to the invention in a schematic view.
According to a preferred embodiment of the invention the time information TA0 ... TA5; TB0 ... TB2; TC0 of the one or more nodes A0 ... A5; B0 ... B2; C0 is represented in such way that the one or more shifted nodes A4, A5; B1, B2 may be identified by using the time information TA4, TA5; TB1, TB2.
According to a preferred embodiment of the invention the time information TA4, TA5; TB1, TB2 of the one or more shifted nodes A4, A5; B1, B2 is represented by a sum of a time difference t_A4, t_A5; t_B1, t_B2 from a beginning of the dynamic range control frame DFP; DFR to which the respective node A4, A5; B1, B2 belongs to the temporal position of the respective node A4, A5; B1, B2 within the dynamic range control frame DFP; DFR to which the respective node A4, A5; B1, B2 belongs and an offset value drcFrameSize being greater than or equal to a temporal size of the dynamic range control frame DFR; DFS subsequent to the respective dynamic range control frame DFP; DFR.
First we consider the encoding of the time differences between pairs of nodes. In Fig. 6 the situation for determining the time differences for pairs of nodes is depicted for the example according to Fig. 5, where t_Ai denotes the sample position of node Ai on the grid of possible node positions within a frame. As discussed earlier, nodes can be selected on a uniform time grid, where the spacing of this grid defines the highest available time resolution deltaTmin. Thus, the time information t_Ai is given in samples, where the time differences between two nodes are always integer multiples of deltaTmin.
The temporal position information of a node is encoded in a differential way, i.e., relative to the position of the previous node. If a node is the first node within a frame, the time difference is determined relative to the beginning of the frame. The left-hand side of Fig. 6 depicts the situation if no node shifting is applied. In this case, the differential time information of node A4, tDrcDelta_A4, is computed as tDrcDelta_A4 = t_A4 - t_A3. This differential time value is then encoded using the corresponding entry in an appropriate Huffman table, e.g. according to Table 1 or 2. As another example we look at the encoded time difference of node B0. Since it is the first node of frame n+1, the corresponding time difference is determined relative to the beginning of the frame, i.e., tDrcDelta_B0 = t_B0.
Let us now consider the encoding of the node position for the proposed node reservoir technique using node shifting. For the example shown on the right-hand side of Fig. 6, the nodes A4 and A5 have been shifted to the next frame for encoding. The representation of nodes A0 to A3 has not changed and the encoded time differences are therefore also not changed. The same is true for the encoded position information of node B0. However, the time information of node A4 and node A5 is now processed differently. As shown in Fig. 6, the original value t_A4 indicating the sample position of node A4 is modified at the encoder by adding an offset of drcFrameSize. Since the resulting position information exceeds the maximum value that would be possible in case of regular encoding, the offset indicates to the decoder that the corresponding node has to be further processed within the context of the previous frame. Furthermore, the decoder knows that the original sample position t_A4 is obtained by subtracting the offset drcFrameSize from the decoded value.
Next, we consider the computation of the time difference information that is actually encoded for the situation shown on the right-hand side of Fig. 6. For coding efficiency reasons, the differential position information for node A4 is computed relative to node B0. In contrast to the situation previously discussed for the left-hand side of Fig. 6, the differential time information is now computed according to tDrcDelta_A4 = t_A4 + drcFrameSize - t_B0, i.e., by including the offset. Analogously, for node A5 we obtain tDrcDelta_A5 = t_A5 + drcFrameSize - t_A4 - drcFrameSize, which obviously is the same as tDrcDelta_A5 = t_A5 - t_A4. These differential time values are encoded using the corresponding code word entry of the correct Huffman table, e.g. according to Table 1 or 2.
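The computation of the time differences with node shifting can be sketched in Python. The sketch assumes that, within a bitstream portion, the time information of the frame's own nodes comes first and that of the shifted nodes follows, as in the A4/B0 example; the function name encode_time_deltas, the frame size of 32 samples and the node positions are hypothetical, and the Huffman coding step is omitted.

```python
def encode_time_deltas(own_positions, shifted_positions, frame_size):
    """Differential time values for one bitstream portion.

    own_positions: sample positions of the frame's own (non-shifted) nodes.
    shifted_positions: positions, within the previous frame, of the nodes
    shifted into this portion; the offset frame_size is added to each.
    """
    coded_positions = list(own_positions) + [
        t + frame_size for t in shifted_positions
    ]
    deltas = []
    previous = 0  # the first delta is relative to the beginning of the frame
    for position in coded_positions:
        deltas.append(position - previous)
        previous = position
    return deltas


# Illustrative values: t_B0 = 4, t_A4 = 20, t_A5 = 28, drcFrameSize = 32
deltas = encode_time_deltas(own_positions=[4], shifted_positions=[20, 28], frame_size=32)
```

With these values the three deltas are t_B0 = 4, t_A4 + drcFrameSize - t_B0 = 48 and t_A5 - t_A4 = 8, which matches the expressions in the text.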
The method for decoding the temporal position information can be summarized as follows. The decoder extracts the time difference information of a node based on the corresponding code word from the bitstream. The time information is obtained by adding the time difference information to the time information of the previous node. If the resulting sample position is larger than drcFrameSize, the decoder knows that the present node has to be processed as if it were the last node in the previous frame, i.e., it has to be appended to the nodes decoded in the previous frame. The correct sample position is determined by subtracting the offset value drcFrameSize from the decoded time value. The same processing steps are applied in an analogous way if more shifted nodes occur in a decoded frame.
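The decoding steps summarized above can be sketched as follows. The function name decode_time_deltas and the numeric values are assumptions for illustration, the input is taken to be the already entropy-decoded time differences, and the comparison "larger than drcFrameSize" follows the wording of the text.

```python
def decode_time_deltas(deltas, frame_size):
    """Recover node positions and separate out shifted nodes.

    Returns (own_positions, shifted_positions): positions of nodes that
    belong to the current frame, and positions within the previous frame
    of nodes that were shifted into this bitstream portion.
    """
    own_positions = []
    shifted_positions = []
    position = 0  # beginning of the frame
    for delta in deltas:
        position += delta
        if position > frame_size:
            # Offset detected: the node belongs to the previous frame.
            shifted_positions.append(position - frame_size)
        else:
            own_positions.append(position)
    return own_positions, shifted_positions


# Illustrative deltas for a portion carrying B0 plus the shifted nodes A4, A5
own, shifted = decode_time_deltas([4, 48, 8], frame_size=32)
num_shifted = len(shifted)
```

The number of shifted nodes falls out of the loop as a by-product, so it does not need to be signalled separately in this sketch.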
After decoding and correcting the time information of an entire frame, the decoder knows how many nodes have been shifted back to the previous frame (without this information being explicitly provided by the encoder) and at which sample positions they are located within the previous frame. The information about the number of shifted nodes will be further exploited in the context of decoding gain and slope information described below.
Fig. 7 illustrates the coding of gain information according to the invention in a schematic view.
According to a preferred embodiment of the invention the gain information GB1 of the bit representation B'1 of the shifted node B1, which is at a first position of the bitstream portion DFS' corresponding to the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR, is represented by an absolute gain value g_B1 and wherein the gain information GB2 of each bit representation B'2 of the shifted nodes B2 at a position after the bit representation B'1 of the node B1, which is at the first position of the bitstream portion DFS' corresponding to the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR, is represented by a relative gain value which is equal to a difference of a gain value g_B2 of the bit representation B'2 of the respective shifted node B2 and the gain value g_B1 of the bit representation B'1 of the node B1, which precedes the bit representation B'2 of the respective node B2.
According to a preferred embodiment of the invention, in case that the bit representations B'1, B'2 of one or more shifted nodes B1, B2 of the reference dynamic range control frame DFR are embedded in the bitstream portion DFS' corresponding to the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR, the gain information GC0 of the bit representation C'0 of the node C0 of the subsequent dynamic range control frame DFS at a first position of the bitstream portion DFS' corresponding to the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR after the one or more positions of the bit representations B'1, B'2 of the one or more shifted nodes B1, B2 is represented by a relative gain value which is equal to a difference of a gain value g_C0 of the bit representation C'0 of the respective node C0 and a gain value g_B2 of the bit representation B'2 of the shifted node B2, which precedes the bit representation C'0 of the respective node C0.
In Fig. 7 the situation for determining the gain differences for pairs of nodes is depicted for the example according to Fig. 5, where g_Ai denotes the gain value of node Ai.
First, the differential gain value for the node A4 is considered. For the approach without node reservoir, depicted on the left-hand side of Fig. 7, the differential gain value gainDelta_A4 is computed from the difference of the gain value (in dB) of the preceding node A3 and the node A4, i.e., gainDelta_A4 = g_A4 - g_A3. This differential gain value is then encoded using the corresponding entry in an appropriate Huffman table. Furthermore, we consider the first node of frame n+1 on the left-hand side of Fig. 7. Since B0 is the first node of that frame, its gain value is not encoded in a differential way but according to a specific coding of initial gain values gainInitial, i.e., the gain value is encoded as its actual value: gainDelta_B0 = g_B0.
For the situation shown on the right-hand side, where the node A4 has been shifted to the next frame n+1, the values of the encoded gain information are different. As can be seen, after being shifted, the node A4 becomes the first node in frame n+1 with respect to encoding the gain differences. Thus, its gain value is not encoded in a differential way, but the specific coding of initial gain values is applied as described above. The differential gain value of A5 will remain the same for both situations shown on the left- and the right-hand side. Since node B0 now follows node A5 if the node reservoir is used, its gain information will be determined from the difference of the gains of node B0 and A5, i.e., gainDelta_B0 = g_B0 - g_A5. Note that only the way in which the gain differences are determined changes when applying the node reservoir technique, whereas the reconstructed values of the gains remain the same for each node. Obviously, after decoding the entire gain related information of the frames n and n+1, the obtained gain values for the nodes A0 to B0 are identical to those obtained on the left-hand side, and the nodes can be computed "in time" for application of the DRC gains to the corresponding audio frame.
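The effect of node shifting on the gain differences can be checked with a small Python sketch. The helper names encode_gains and decode_gains and the gain values (in dB) are assumptions for illustration; the first value of each portion stands in for the coding of initial gain values, and Huffman coding is omitted.

```python
def encode_gains(gains_db):
    """First gain as its actual value, all following gains as differences."""
    if not gains_db:
        return []
    return [gains_db[0]] + [b - a for a, b in zip(gains_db, gains_db[1:])]


def decode_gains(coded):
    """Inverse of encode_gains: accumulate the differences."""
    gains = []
    for index, value in enumerate(coded):
        gains.append(value if index == 0 else gains[-1] + value)
    return gains


g_A = [0, -1, -2, -3, -5, -6]  # illustrative gains of A0 ... A5 in dB
g_B0 = -4

# Without node shifting: frame n carries A0 ... A5, frame n + 1 carries B0.
plain_n = encode_gains(g_A)
plain_n1 = encode_gains([g_B0])

# With node shifting: A4 and A5 precede B0 in the portion of frame n + 1.
shift_n = encode_gains(g_A[:4])
shift_n1 = encode_gains(g_A[4:] + [g_B0])

reconstructed_plain = decode_gains(plain_n) + decode_gains(plain_n1)
reconstructed_shift = decode_gains(shift_n) + decode_gains(shift_n1)
```

Here shift_n1 starts with the actual value g_A4, keeps the difference g_A5 - g_A4, and ends with g_B0 - g_A5, while both variants reconstruct the same seven gain values.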
As discussed in the previous paragraph, the number of shifted nodes and their sample positions within the previous frame are known after decoding the time difference information. As illustrated on the right-hand side of Fig. 7, the gain values of shifted nodes from frame n start immediately from the beginning of the received gain information of frame n+1. Therefore, the information on the number of shifted nodes is sufficient for the decoder to assign each gain value to the correct sample position within the correct frame. Considering the example shown on the right-hand side in Fig. 7, the decoder knows that the first two decoded gain values of frame n+1 have to be appended to the last gain values of the previous frame, whereas the third gain value corresponds to the correct gain value of the first node in the current frame.
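The assignment of decoded gain values to frames can be sketched as follows. The function name reassemble_nodes, the representation of nodes as (position, gain) pairs and all numeric values are assumptions made for illustration; the positions are taken to come from the preceding time decoding step.

```python
def reassemble_nodes(prev_nodes, own_positions, shifted_positions, gains):
    """Attach decoded gains to positions and move shifted nodes back.

    gains holds the decoded gain values of the current portion in bitstream
    order: first those of the shifted nodes, then those of the frame's own
    nodes. Returns the completed previous frame and the current frame.
    """
    num_shifted = len(shifted_positions)
    shifted_nodes = list(zip(shifted_positions, gains[:num_shifted]))
    own_nodes = list(zip(own_positions, gains[num_shifted:]))
    return prev_nodes + shifted_nodes, own_nodes


# Frame n already holds A0 ... A3; the portion of frame n + 1 delivers the
# gains of A4, A5 and B0 (illustrative values in dB).
frame_n = [(2, 0), (6, -1), (10, -2), (14, -3)]
frame_n, frame_n1 = reassemble_nodes(
    frame_n, own_positions=[4], shifted_positions=[20, 28], gains=[-5, -6, -4]
)
```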
Fig. 8 illustrates the coding of slope information according to the invention in a schematic view.
According to a preferred embodiment of the invention each node A0 ... A5; B0 ... B2; C0 of the one or more nodes A0 ... A5; B0 ... B2; C0 comprises slope information SA0 ... SA5; SB0 ... SB2; SC0.
Next, the coding of slope information is considered, which is illustrated in Fig. 8. The slope information of the nodes is not encoded in a differential way between pairs of nodes, but for each node independently. Therefore, the slope related information remains unchanged in both cases, with and without usage of the node reservoir. As in the case of coding of gain values, the Huffman tables for generating the code words for slope information remain the same for both cases, with and without using the proposed node shifting. The assignment of the slope information to the correct sample position within the correct frame is performed analogously to the case of decoding the gain values.
After all node information received for frame n+1 has been decoded and, if applicable, shifted back to the preceding frame n, the gain interpolation for frame n using splines or linear interpolation can be performed in the common way and the gain values are applied to the corresponding audio frame.
Fig. 9 illustrates an embodiment of an audio decoder device according to the invention in a schematic view. The audio decoder device 4 comprises:
an audio decoder 5 configured for decoding an encoded audio bitstream ABS in order to reproduce an audio signal AS comprising consecutive audio frames AFP, AFR, AFS;
a dynamic range control decoder 6 configured for decoding an encoded dynamic range control bitstream DBS in order to reproduce a dynamic range control sequence DS corresponding to the audio signal AS and comprising consecutive dynamic range control frames DFP, DFR, DFS;
wherein the encoded dynamic range control bitstream DBS comprises for each dynamic range control frame DFP, DFR, DFS of the dynamic range control frames a corresponding bitstream portion DFP', DFR', DFS';
wherein the encoded dynamic range control bitstream DBS comprises bit representations A'0 ... A'5; B'0 ... B'2; C'0 of nodes A0 ... A5; B0 ... B2; C0, wherein each bit representation of one node of the nodes comprises gain information GA0 ... GA5; GB0 ... GB2; GC0 for the audio signal AS and time information TA0 ... TA5; TB0 ... TB2; TC0 indicating to which point in time the gain information GA0 ... GA5; GB0 ... GB2; GC0 corresponds;
wherein the encoded dynamic range control bitstream DBS comprises bit representations B'1, B'2 of shifted nodes B1, B2 selected from the nodes B0 ... B2 of one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS, which are embedded in a bitstream portion corresponding to the dynamic range control frame DFS subsequent to the one reference dynamic range control frame DFR, wherein the bit representation B'0 of each remaining node B0 of the nodes B0 ... B2 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS is embedded into the bitstream portion DFR' corresponding to the one reference dynamic range control frame DFR; and
wherein the dynamic range control decoder 6 is configured for decoding the bit representation B'0 of each remaining node B0 of the remaining nodes B0 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS in order to reproduce each remaining node B0 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS, for decoding the bit representation B'1, B'2 of each shifted node B1, B2 of the shifted nodes B1, B2 selected from the nodes B0 ... B2 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS in order to reproduce each shifted node B1, B2 of the shifted nodes B1, B2 selected from the nodes of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS, and for combining the reproduced remaining nodes B0 and the reproduced shifted nodes B1, B2 in order to reconstruct the reference dynamic range control frame DFR.
According to a preferred embodiment of the invention the dynamic range control decoder 6 is configured for identifying the one or more shifted nodes A4, A5; B1, B2 by using the time information TA4, TA5; TB1, TB2.
According to a preferred embodiment of the invention the dynamic range
control decoder 6 is configured for decoding the time information TA4, TA5;
TB1, TB2 of the one or more shifted nodes A4, A5; B1, B2, which is represented
by a sum of a time difference t_A4, t_A5; t_B1, t_B2 from a beginning of the
dynamic range control frame DFP; DFR to which the respective node A4, A5;
B1, B2 belongs to the temporal position of the respective node A4, A5; B1, B2
within the dynamic range control frame DFP; DFR to which the respective
node A4, A5; B1, B2 belongs and an offset value drcFrameSize being greater
than or equal to a temporal size of the dynamic range control frame DFR;
DFS subsequent to the respective dynamic range control frame DFP; DFR.
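The time coding described above may be illustrated by the following minimal sketch. It assumes that time values are integer sample positions, that the offset equals the frame size exactly (the text only requires it to be greater than or equal), and that non-shifted nodes are coded with their plain time difference. The function names `encode_time` and `decode_time` are hypothetical and chosen for illustration only.

```python
from typing import Tuple


def encode_time(t_in_frame: int, shifted: bool, drc_frame_size: int) -> int:
    """Return the coded time information of one node.

    t_in_frame is the time difference from the beginning of the frame
    to which the node belongs. For a shifted node the offset
    drc_frame_size is added (assumption: offset == frame size).
    """
    if not 0 <= t_in_frame < drc_frame_size:
        raise ValueError("node position must lie inside its own frame")
    return t_in_frame + drc_frame_size if shifted else t_in_frame


def decode_time(coded: int, drc_frame_size: int) -> Tuple[int, bool]:
    """Return (time difference inside the node's own frame, shifted flag).

    A coded value at or beyond the frame size cannot belong to the frame
    whose bitstream portion carries it, so it identifies a shifted node.
    """
    shifted = coded >= drc_frame_size
    return (coded - drc_frame_size if shifted else coded), shifted
```

In this sketch the decoder needs no extra flag: the coded time value alone tells it whether a node was shifted from the preceding frame.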
According to a preferred embodiment of the invention the dynamic range
control decoder 6 is configured for decoding the gain information GB1 of the
bit representation B'1 of the shifted node B1, which is at a first position of the
bitstream portion DFS' corresponding to the dynamic range control frame
DFS subsequent to the reference dynamic range control frame DFR, is represented by an absolute gain value g_B1 and wherein the gain information
GB2 of each bit representation B'2 of the shifted nodes B2 at a position after
the bit representation B'1 of the node B1, which is at the first position of the
bitstream portion DFS' corresponding to the dynamic range control frame
DFS subsequent to the reference dynamic range control frame DFR, is represented by a relative gain value which is equal to a difference of a gain value
g_B2 of the bit representation B'2 of the respective shifted node B2 and the
gain value g_B1 of the bit representation B'1 of the node B1, which precedes
the bit representation B'2 of the respective node B2.
According to a preferred embodiment of the invention the dynamic range
control decoder 6 is configured for decoding the gain information GC0 of the
bit representation C'0 of the node C0 of the subsequent dynamic range control frame DFS at a first position of the bitstream portion DFS' corresponding
to the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR after the one or more positions of the bit representations B'1, B'2 of the one or more shifted nodes B1, B2 is represented by
a relative gain value which is equal to a difference of a gain value g_C0 of the
bit representation C'0 of the respective node C0 and the gain value g_B2 of
the bit representation B'2 of the shifted node B2, which precedes the bit representation C'0 of the respective node C0.
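The gain coding of the two preceding embodiments may be sketched as follows, under the assumption that gain values are already quantized to integer steps and that the bitstream portion DFS' carries its nodes in the order B'1, B'2, C'0. The function names `encode_gains` and `decode_gains` are hypothetical, and the subsequent entropy coding is not shown.

```python
from typing import List


def encode_gains(gains: List[int]) -> List[int]:
    """Code the first gain as an absolute value and every later gain
    as the difference to the gain of the preceding node."""
    return [g if i == 0 else g - gains[i - 1] for i, g in enumerate(gains)]


def decode_gains(coded: List[int]) -> List[int]:
    """Invert encode_gains by accumulating the differences."""
    gains: List[int] = []
    for i, value in enumerate(coded):
        gains.append(value if i == 0 else gains[-1] + value)
    return gains
```

With the illustrative values g_B1 = -6, g_B2 = -4 and g_C0 = -1, the portion would carry -6 (absolute), 2 (g_B2 minus g_B1) and 3 (g_C0 minus g_B2).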
According to a preferred embodiment of the invention a temporal size of the
audio frames AFP, AFR, AFS is equal to a temporal size of the dynamic
range control frames DFP, DFR, DFS.
According to a preferred embodiment of the invention the one or more nodes
A0 … A5; B0 … B2; C0 of one of the dynamic range control frames DFP, DFR,
DFS are selected from a uniform time grid.
According to a preferred embodiment of the invention each node A0 … A5; B0 …
B2; C0 of the one or more nodes A0 … A5; B0 … B2; C0 comprises slope
information SA0 … SA5; SB0 … SB2; SC0.
According to a preferred embodiment of the invention the dynamic range
control decoder 6 is configured for decoding the bit representations of the
nodes A'0 … A'5; B'0 … B'2; C'0 using an entropy decoding technique.
In another aspect the invention provides a method for operating an audio decoder, the method comprises the steps:
decoding an encoded audio bitstream ABS in order to reproduce an audio
signal AS comprising consecutive audio frames AFP, AFR, AFS;
decoding an encoded dynamic range control bitstream DBS in order to reproduce a dynamic range control sequence DS corresponding to the audio
signal AS and comprising consecutive dynamic range control frames DFP,
DFR, DFS;
wherein the encoded dynamic range control bitstream DBS comprises for
each dynamic range control frame DFP, DFR, DFS of the dynamic range
control frames a corresponding bitstream portion DFP', DFR', DFS';
wherein the encoded dynamic range control bitstream DBS comprises bit
representations A'0 … A'5; B'0 … B'2; C'0 of nodes A0 … A5; B0 … B2; C0,
wherein each bit representation of one node of the nodes comprises gain
information GA0 … GA5; GB0 … GB2; GC0 for the audio signal AS and time
information TA0 … TA5; TB0 … TB2; TC0 indicating to which point in time the
gain information GA0 … GA5; GB0 … GB2; GC0 corresponds;
wherein the encoded dynamic range control bitstream DBS comprises bit
representations B'1, B'2 of shifted nodes B1, B2 selected from the nodes B0 …
B2 of one reference dynamic range control frame DFR of the dynamic range
control frames DFP, DFR, DFS, which are embedded in a bitstream portion
corresponding to the dynamic range control frame DFS subsequent to the
one reference dynamic range control frame DFR, wherein the bit representation B'0 of each remaining node B0 of the nodes B0 … B2 of the one reference
dynamic range control frame DFR of the dynamic range control frames DFP,
DFR, DFS is embedded into the bitstream portion DFR' corresponding to the
one reference dynamic range control frame DFR; and
wherein the bit representation B'0 of each remaining node B0 of the remaining
nodes B'0 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS is decoded in order to reproduce
each remaining node B0 of the one reference dynamic range control frame
DFR of the dynamic range control frames DFP, DFR, DFS,
wherein the bit representation B'1, B'2 of each shifted node B1, B2 of the shifted nodes B1, B2 selected from the nodes B0 … B2 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP,
DFR, DFS is decoded in order to reproduce each shifted node B1, B2 of the
shifted nodes B1, B2 selected from the nodes of the one reference dynamic
range control frame DFR of the dynamic range control frames DFP, DFR,
DFS, and
wherein the reproduced remaining nodes B0 and the reproduced shifted
nodes B1, B2 are combined in order to reconstruct the reference dynamic
range control frame DFR.
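The combining step of the method may be illustrated by the following minimal sketch. It assumes that the gain values of all nodes have already been decoded to absolute values, that shifted nodes carry a coded time equal to their position inside their own frame plus an offset equal to the frame size, and that time values are integer sample positions. The type `Node` and the function `reconstruct_reference_frame` are hypothetical names used for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Node:
    time: int  # coded time information as read from the bitstream portion
    gain: int  # absolute gain value (already decoded)


def reconstruct_reference_frame(
    portion_dfr: List[Node], portion_dfs: List[Node], frame_size: int
) -> List[Node]:
    """Rebuild the nodes of the reference frame DFR.

    Remaining nodes are those in DFR' whose coded time lies inside the
    frame; nodes in DFR' with a larger coded time were shifted from the
    preceding frame DFP and are skipped here. Shifted nodes of DFR are
    found in DFS' by the same criterion and have the offset removed.
    """
    remaining = [n for n in portion_dfr if n.time < frame_size]
    shifted = [
        Node(n.time - frame_size, n.gain)
        for n in portion_dfs
        if n.time >= frame_size
    ]
    return sorted(remaining + shifted, key=lambda n: n.time)
```

For example, with a frame size of 1024, a portion DFR' holding one node shifted from DFP and the node B0, and a portion DFS' holding B1, B2 and C0, the function returns B0, B1 and B2 in temporal order.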
With respect to the decoder, the encoder and the methods of the described
embodiments the following shall be mentioned:
Although some aspects have been described in the context of an apparatus,
it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a
method step also represent a description of a corresponding block or item or
feature of a corresponding apparatus.
Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation
can be performed using a digital storage medium, for example a floppy disk,
a DVD, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH
memory, having electronically readable control signals stored thereon, which
cooperate (or are capable of cooperating) with a programmable computer
system such that the respective method is performed.
Some embodiments according to the invention comprise a data carrier having
electronically readable control signals, which are capable of cooperating
with a programmable computer system such that one of the methods
described herein is performed.
Generally, embodiments of the present invention can be implemented as a
computer program product with a program code, the program code being
operative for performing one of the methods when the computer program
product runs on a computer. The program code may for example be stored
on a machine readable carrier.
Other embodiments comprise the computer program for performing one of
the methods described herein, which is stored on a machine readable carrier
or a non-transitory storage medium.
In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods
described herein, when the computer program runs on a computer.
A further embodiment of the inventive methods is, therefore, a data carrier (or
a digital storage medium, or a computer-readable medium) comprising,
recorded thereon, the computer program for performing one of the methods
described herein.
A further embodiment of the inventive method is, therefore, a data stream or
a sequence of signals representing the computer program for performing one
of the methods described herein. The data stream or the sequence of signals
may be configured, for example, to be transferred via a data communication
connection, for example via the Internet.
A further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured or adapted to perform one
of the methods described herein.
A further embodiment comprises a computer having installed thereon the
computer program for performing one of the methods described herein.
In some embodiments, a programmable logic device (for example a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are advantageously performed by any hardware apparatus.
While this invention has been described in terms of several embodiments,
there are alterations, permutations, and equivalents which fall within the
scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations and equivalents as fall
within the true spirit and scope of the present invention.
Reference signs
1 audio encoder device
2 audio encoder
3 dynamic range control encoder
4 audio decoder device
5 audio decoder
6 dynamic range control decoder
ABS encoded audio bitstream
AS audio signal
AFP preceding audio frame
AFR reference audio frame
AFS subsequent audio frame
DBS encoded dynamic range control bitstream
DS dynamic range control sequence
DFP preceding dynamic range control frame
DFR reference dynamic range control frame
DFS subsequent dynamic range control frame
A0 … A5 nodes of the preceding dynamic range control frame
B0 … B2 nodes of the reference dynamic range control frame
C0 node of the subsequent dynamic range control frame
DFP' bitstream portion corresponding to the preceding dynamic range control frame
DFR' bitstream portion corresponding to the reference dynamic range control frame
DFS' bitstream portion corresponding to the subsequent dynamic range control frame
TA0 … TA5 time information of the nodes of the preceding dynamic range control frame
TB0 … TB2 time information of nodes of the reference dynamic range control frame
TC0 time information of node of the subsequent dynamic range control frame
t_A0 … t_A5 time difference of the nodes of the preceding dynamic range control frame
t_B0 … t_B2 time difference of nodes of the reference dynamic range control frame
t_C0 time difference of node of the subsequent dynamic range control frame
GA0 … GA5 gain information of the nodes of the preceding dynamic range control frame
GB0 … GB2 gain information of nodes of the reference dynamic range control frame
GC0 gain information of node of the subsequent dynamic range control frame
g_A0 … g_A5 gain value of the nodes of the preceding dynamic range control frame
g_B0 … g_B2 gain value of nodes of the reference dynamic range control frame
g_C0 gain value of node of the subsequent dynamic range control frame
SA0 … SA5 slope information of the nodes of the preceding dynamic range control frame
SB0 … SB2 slope information of nodes of the reference dynamic range control frame
SC0 slope information of node of the subsequent dynamic range control frame
References:
[1] D. Giannoulis, M. Massberg, J. D. Reiss, "Digital Dynamic Range
Compressor Design - A Tutorial and Analysis", J. Audio Engineering
Society, Vol. 60, No. 6, June 2012.
1. Audio encoder device (1) comprising:
an audio encoder (2) configured for producing an encoded audio bitstream
(ABS) from an audio signal (AS) comprising consecutive audio frames (AFP,
AFR, AFS);
a dynamic range control encoder (3) configured for producing an encoded
dynamic range control bitstream (DBS) from a dynamic range control sequence
(DS) corresponding to the audio signal (AS) and comprising consecutive dynamic range control frames (DFP, DFR, DFS), wherein each dynamic range
control frame (DFP, DFR, DFS) of the dynamic range control frames (DFP,
DFR, DFS) comprises one or more nodes (A0 … A5; B0 … B2; C0), wherein each
node of the one or more nodes (A0 … A5; B0 … B2; C0) comprises gain information (GA0 … GA5; GB0 … GB2; GC0) for the audio signal (AS) and time information (TA0 … TA5; TB0 … TB2; TC0) indicating to which point in time the gain
information (GA0 … GA5; GB0 … GB2; GC0) corresponds;
wherein the dynamic range control encoder (3) is configured in such way
that the encoded dynamic range control bitstream (DBS) comprises for each
dynamic range control frame (DFP, DFR, DFS) of the dynamic range control
frames (DFP, DFR, DFS) a corresponding bitstream portion (DFP’, DFR’,
DFS’);
wherein the dynamic range control encoder (3) is configured for executing
a shift procedure, wherein one or more nodes (B1, B2) of the nodes (B0 … B2)
of one reference dynamic range control frame (DFR) of the dynamic range control frames (DFP, DFR, DFS) are selected as shifted nodes (B1, B2), wherein a
bit representation (B’1, B’2) of each of the one or more shifted nodes (B1, B2) of
the one reference dynamic range control frame (DFR) is embedded in the bitstream portion (DFS’) corresponding to the dynamic range control frame (DFS)
subsequent to the one reference dynamic range control frame (DFR), wherein
a bit representation (B’0) of each remaining node (B0) of the nodes (B0 … B2) of
the one reference dynamic range control frame (DFR) of the dynamic range
control frames (DFP, DFR, DFS) is embedded into the bitstream portion (DFR’)
corresponding to the one reference dynamic range control frame (DFR);
wherein the audio encoder (2) is configured for producing the encoded audio bitstream (ABS) using a delay mode.
2. The audio encoder device (1) as claimed in claim 1, wherein the shift procedure
is initiated in case that a number of the nodes of the reference dynamic range
control frame is greater than a predefined threshold value.
3. The audio encoder device (1) as claimed in claim 1, wherein the shift procedure
is initiated in case that a sum of a number of the nodes of the reference dynamic
range control frame and a number of shifted nodes from the dynamic range
control frame preceding the reference dynamic range control frame to be embedded in the bitstream portion corresponding to the reference dynamic range
control frame is greater than a predefined threshold value.
4. The audio encoder device (1) as claimed in claim 1, wherein the shift procedure
is initiated in case that a sum of a number of the nodes (B0 … B2) of the reference dynamic range control frame (DFR) and a number of shifted nodes (A4,
A5) from the dynamic range control frame (DFP) preceding the reference dynamic range control frame (DFR) to be embedded in the bitstream portion
(DFR’) corresponding to the reference dynamic range control frame (DFR) is
greater than a number of the nodes (C0) of the dynamic range control frame
(DFS) subsequent to the reference dynamic range control frame (DFR).
5. The audio encoder device (1) as claimed in one of the claims 1 to 4, wherein
the time information (TA0 … TA5; TB0 … TB2; TC0) of the one or more nodes
(A0 … A5; B0 … B2; C0) is represented in such way that the one or more shifted
nodes (A4, A5; B1, B2) may be identified by using the time information (TA4, TA5;
TB1, TB2).
6. The audio encoder device (1) as claimed in claim 5, wherein the time information (TA4, TA5; TB1, TB2) of the one or more shifted nodes (A4, A5; B1, B2) is
represented by a sum of a time difference (t_A4, t_A5; t_B1, t_B2) from a beginning of the dynamic range control frame (DFP; DFR) to which the respective
node (A4, A5; B1, B2) belongs to the temporal position of the respective node
(A4, A5; B1, B2) within the dynamic range control frame (DFP; DFR) to which the
respective node (A4, A5; B1, B2) belongs and an offset value (drcFrameSize)
being greater than or equal to a temporal size of the dynamic range control
frame (DFR; DFS) subsequent to the respective dynamic range control frame
(DFP; DFR).
7. The audio encoder device (1) as claimed in one of the claims 1 to 6, wherein
the gain information (GB1) of the bit representation (B'1) of the shifted node (B1),
which is at a first position of the bitstream portion (DFS’) corresponding to the
dynamic range control frame (DFS) subsequent to the reference dynamic
range control frame (DFR), is represented by an absolute gain value (g_B1)
and wherein the gain information (GB2) of each bit representation (B’2) of the
shifted nodes (B2) at a position after the bit representation (B'1) of the node (B1),
which is at the first position of the bitstream portion (DFS’) corresponding to the
dynamic range control frame (DFS) subsequent to the reference dynamic
range control frame (DFR), is represented by a relative gain value which is
equal to a difference of a gain value (g_B2) of the bit representation (B’2) of the
respective shifted node (B2) and a gain value (g_B1) of the bit representation
(B’1) of the node (B1), which precedes the bit representation (B’2) of the respective node (B2).
8. The audio encoder device (1) as claimed in one of the claims 1 to 7, wherein,
in case that the bit representations (B’1, B’2) of one or more shifted nodes (B1,
B2) of the reference dynamic range control frame (DFR) is embedded in the
bitstream portion (DFS’) corresponding to the dynamic range control frame
(DFS) subsequent to the reference dynamic range control frame (DFR), the
gain information (GC0) of the bit representation (C’0) of the node (C0) of the
subsequent dynamic range control frame (DFS) at a first position of the bitstream portion (DFS’) corresponding to the dynamic range control frame (DFS)
subsequent to the reference dynamic range control frame (DFR) after the one
or more positions of the bit representations (B’1, B’2) of the one or more shifted
nodes (B1, B2) is represented by a relative gain value which is equal to a difference of a gain value (g_C0) of the bit representation (C’0) of the respective node
(C0) and a gain value (g_B2) of the bit representation (B’2) of the shifted node
(B2), which precedes the bit representation (C’0) of the respective node (C0).
9. The audio encoder device (1) as claimed in one of the claims 1 to 8, wherein
each node (A0 … A5; B0 … B2; C0) of the one or more nodes (A0 …
A5; B0 … B2; C0) comprises slope information (SA0 … SA5; SB0 … SB2; SC0).
10. The audio encoder device (1) as claimed in one of the claims 1 to 9, wherein
the dynamic range control encoder (3) is configured for encoding the nodes (A0
… A5; B0 … B2; C0) using an entropy encoding technique.
11. Audio decoder device (4) comprising:
an audio decoder (5) configured for decoding an encoded audio bitstream (ABS) in order to reproduce an audio signal (AS) comprising consecutive audio frames (AFP, AFR, AFS);
a dynamic range control decoder (6) configured for decoding an encoded dynamic range control bitstream (DBS) in order to reproduce a dynamic range control sequence (DS) corresponding to the audio signal (AS)
and comprising consecutive dynamic range control frames (DFP, DFR,
DFS);
wherein the encoded dynamic range control bitstream (DBS) comprises
for each dynamic range control frame (DFP, DFR, DFS) of the dynamic
range control frames a corresponding bitstream portion (DFP’, DFR’, DFS’);
wherein the encoded dynamic range control bitstream (DBS) comprises
bit representations (A’0 … A’5; B’0 … B’2; C’0) of nodes (A0 … A5; B0 … B2;
C0), wherein each bit representation of one node of the nodes comprises
gain information (GA0 … GA5; GB0 … GB2; GC0) for the audio signal (AS)
and time information (TA0 … TA5; TB0 … TB2; TC0) indicating to which point
in time the gain information (GA0 … GA5; GB0 … GB2; GC0) corresponds;
wherein the encoded dynamic range control bit stream (DBS) comprises bit representations (B’1, B’2) of shifted nodes (B1, B2) selected from
the nodes (B0 … B2) of one reference dynamic range control frame (DFR)
of the dynamic range control frames (DFP, DFR, DFS), which are embedded in a bitstream portion corresponding to the dynamic range control
frame (DFS) subsequent to the one reference dynamic range control frame
(DFR), wherein the bit representation (B’0) of each remaining node (B0) of
the nodes (B0 … B2) of the one reference dynamic range control frame
(DFR) of the dynamic range control frames (DFP, DFR, DFS) is embedded
into the bitstream portion (DFR’) corresponding to the one reference dynamic range control frame (DFR);
wherein the dynamic range control decoder (6) is configured for decoding the bit representation (B’0) of each remaining node (B0) of the remaining
nodes (B’0) of the one reference dynamic range control frame (DFR) of the
dynamic range control frames (DFP, DFR, DFS) in order to reproduce each
remaining node (B0) of the one reference dynamic range control frame
(DFR) of the dynamic range control frames (DFP, DFR, DFS), for decoding
the bit representation (B’1, B’2) of each shifted node (B1, B2) of the shifted
nodes (B1, B2) selected from the nodes (B0 … B2) of the one reference dynamic range control frame (DFR) of the dynamic range control frames
(DFP, DFR, DFS) in order to reproduce each shifted node (B1, B2) of the
shifted nodes (B1, B2) selected from the nodes of the one reference dynamic
range control frame (DFR) of the dynamic range control frames (DFP, DFR,
DFS) and for combining the reproduced remaining nodes (B0) and the reproduced shifted nodes (B1, B2) in order to reconstruct the reference dynamic range control frame (DFR); and
wherein the audio decoder (5) is configured for decoding the encoded
audio bitstream (ABS) using a delay mode.
12. The audio decoder device (4) as claimed in claim 11, wherein the dynamic
range control decoder (6) is configured for identifying the one or more shifted
nodes (A4, A5; B1, B2) by using the time information (TA4, TA5; TB1, TB2).
13. The audio decoder device (4) as claimed in claim 11 or 12, wherein the dynamic range control decoder (6) is configured for decoding the time information
(TA4, TA5; TB1, TB2) of the one or more shifted nodes (A4, A5; B1, B2), which is
represented by a sum of a time difference (t_A4, t_A5; t_B1, t_B2) from a beginning of the dynamic range control frame (DFP; DFR) to which the respective
node (A4, A5; B1, B2) belongs to the temporal position of the respective node
(A4, A5; B1, B2) within the dynamic range control frame (DFP; DFR) to which the
respective node (A4, A5; B1, B2) belongs and an offset value (drcFrameSize)
being greater than or equal to a temporal size of the dynamic range control
frame (DFR; DFS) subsequent to the respective dynamic range control frame
(DFP; DFR).
14. The audio decoder device (4) as claimed in one of the claims 11 to 13, wherein
the dynamic range control decoder (6) is configured for decoding the gain information (GB1) of the bit representation (B'1) of the shifted node (B1), which is
at a first position of the bitstream portion (DFS’) corresponding to the dynamic
range control frame (DFS) subsequent to the reference dynamic range control
frame (DFR), is represented by an absolute gain value (g_B1) and wherein the
gain information (GB2) of each bit representation (B’2) of the shifted nodes (B2)
at a position after the bit representation (B'1) of the node (B1), which is at the
first position of the bitstream portion (DFS’) corresponding to the dynamic
range control frame (DFS) subsequent to the reference dynamic range control
frame (DFR), is represented by a relative gain value which is equal to a difference of a gain value (g_B2) of the bit representation (B’2) of the respective shifted
node (B2) and a gain value (g_B1) of the bit representation (B’1) of the node (B1),
which precedes the bit representation (B’2) of the respective node (B2).
15. The audio decoder device (4) as claimed in one of the claims 11 to 14, wherein
the dynamic range control decoder (6) is configured for decoding the gain information (GC0) of the bit representation (C’0) of the node (C0) of the subsequent dynamic range control frame (DFS) at a first position of the bitstream
portion (DFS’) corresponding to the dynamic range control frame (DFS) subsequent to the reference dynamic range control frame (DFR) after the one or
more positions of the bit representations (B’1, B’2) of the one or more shifted
nodes (B1, B2) is represented by a relative gain value which is equal to a difference of a gain value (g_C0) of the bit representation (C’0) of the respective node
(C0) and a gain value (g_B2) of the bit representation (B’2) of the shifted node
(B2), which precedes the bit representation (C’0) of the respective node (C0).
16. The audio decoder device (4) as claimed in one of the claims 11 to 15, wherein
each node (A0 … A5; B0 … B2; C0) of the one or more nodes (A0 … A5; B0 … B2;
C0) comprises slope information (SA0 … SA5; SB0 … SB2; SC0).
17. The audio decoder device (4) as claimed in one of the claims 11 to 16, wherein
the dynamic range control decoder (6) is configured for decoding the bit representations of the nodes (A’0 … A’5; B’0 … B’2; C’0) using an entropy decoding
technique.
18. A method for operating an audio encoder device (1) comprising an audio encoder (2) and a dynamic range control encoder (3), the method comprises the
steps:
using the audio encoder (2) for producing an encoded audio bitstream
(ABS) from an audio signal (AS) comprising consecutive audio frames
(AFP, AFR, AFS);
using the dynamic range control encoder (3) for producing an encoded
dynamic range control bitstream (DBS) from a dynamic range control sequence (DS) corresponding to the audio signal (AS) and comprising consecutive dynamic range control frames (DFP, DFR, DFS), wherein each
dynamic range control frame (DFP, DFR, DFS) of the dynamic range control frames (DFP, DFR, DFS) comprises one or more nodes (A0 … A5; B0
… B2; C0), wherein each node of the one or more nodes (A0 … A5; B0 … B2;
C0) comprises gain information (GA0 … GA5; GB0 … GB2; GC0) for the audio
signal (AS) and time information (TA0 … TA5; TB0 … TB2; TC0) indicating to
which point in time the gain information corresponds,
wherein the encoded dynamic range control bitstream (DBS) comprises
for each dynamic range control frame (DFP, DFR, DFS) of the dynamic
range control frames (DFP, DFR, DFS) a corresponding bitstream portion
(DFP’, DFR’, DFS’);
using the dynamic range control encoder (3) for executing a shift procedure, wherein one or more nodes (B1, B2) of the nodes (B0 … B2) of one
reference dynamic range control frame (DFR) of the dynamic range control
frames (DFP, DFR, DFS) are selected as shifted nodes (B1, B2), wherein a
bit representation (B’1, B’2) of each of the one or more shifted nodes (B1, B2)
of the one reference dynamic range control frame (DFR) is embedded in
the bitstream portion (DFS’) corresponding to the dynamic range control
frame (DFS) subsequent to the one reference dynamic range control frame
(DFR), wherein a bit representation (B’0) of each remaining node (B0) of the
nodes (B0 … B2) of the one reference dynamic range control frame (DFR)
of the dynamic range control frames (DFP, DFR, DFS) is embedded into
the bitstream portion (DFR’) corresponding to the one reference dynamic
range control frame (DFR);
wherein the audio encoder (2) is configured for producing the encoded
audio bitstream (ABS) using a delay mode.
19. A method for operating an audio decoder device (4) comprising an audio decoder (5) and a dynamic range control decoder (6), the method comprises the
steps:
using the audio decoder (5) for decoding an encoded audio bitstream
(ABS) in order to reproduce an audio signal (AS) comprising consecutive
audio frames (AFP, AFR, AFS);
using the dynamic range control decoder (6) for decoding an encoded
dynamic range control bitstream (DBS) in order to reproduce a dynamic
range control sequence (DS) corresponding to the audio signal (AS) and
comprising consecutive dynamic range control frames (DFP, DFR, DFS);
wherein the encoded dynamic range control bitstream (DBS) comprises for each dynamic range control frame (DFP, DFR, DFS) of the dynamic range control frames a corresponding bitstream portion (DFP’, DFR’,
DFS’);
wherein the encoded dynamic range control bitstream (DBS) comprises bit representations (A’0 … A’5; B’0 … B’2; C’0) of nodes (A0 … A5; B0 …
B2; C0), wherein each bit representation of one node of the nodes comprises
gain information (GA0 … GA5; GB0 … GB2; GC0) for the audio signal (AS) and
time information (TA0 … TA5; TB0 … TB2; TC0) indicating to which point in
time the gain information (GA0 … GA5; GB0 … GB2; GC0) corresponds;
wherein the encoded dynamic range control bit stream (DBS) comprises bit representations (B’1, B’2) of shifted nodes (B1, B2) selected from the
nodes (B0 … B2) of one reference dynamic range control frame (DFR) of the
dynamic range control frames (DFP, DFR, DFS), which are embedded in a
bitstream portion corresponding to the dynamic range control frame (DFS)
subsequent to the one reference dynamic range control frame (DFR),
wherein the bit representation (B’0) of each remaining node (B0) of the nodes
(B0 … B2) of the one reference dynamic range control frame (DFR) of the
dynamic range control frames (DFP, DFR, DFS) is embedded into the bitstream portion (DFR’) corresponding to the one reference dynamic range
control frame (DFR);
using the dynamic range control decoder (6) for decoding the bit representation (B’0) of each remaining node (B0) of the remaining nodes (B’0) of
the one reference dynamic range control frame (DFR) of the dynamic range
control frames (DFP, DFR, DFS) in order to reproduce each remaining node
(B0) of the one reference dynamic range control frame (DFR) of the dynamic
range control frames (DFP, DFR, DFS); and
using the dynamic range control decoder (6) for decoding the bit representation (B’1, B’2) of each shifted node (B1, B2) of the shifted nodes (B1, B2)
selected from the nodes (B0 … B2) of the one reference dynamic range control frame (DFR) of the dynamic range control frames (DFP, DFR, DFS) in
order to reproduce each shifted node (B1, B2) of the shifted nodes (B1, B2)
selected from the nodes of the one reference dynamic range control frame
(DFR) of the dynamic range control frames (DFP, DFR, DFS);
wherein the reproduced remaining nodes (B0) and the reproduced
shifted nodes (B1, B2) are combined in order to reconstruct the reference dynamic range control frame (DFR); and
wherein the audio decoder (5) is configured for decoding the encoded
audio bitstream (ABS) using a delay mode.

Documents

Application Documents

# Name Date
1 202538029638-STATEMENT OF UNDERTAKING (FORM 3) [27-03-2025(online)].pdf 2025-03-27
2 202538029638-REQUEST FOR EXAMINATION (FORM-18) [27-03-2025(online)].pdf 2025-03-27
3 202538029638-PROOF OF RIGHT [27-03-2025(online)].pdf 2025-03-27
4 202538029638-FORM 18 [27-03-2025(online)].pdf 2025-03-27
5 202538029638-FORM 1 [27-03-2025(online)].pdf 2025-03-27
6 202538029638-FIGURE OF ABSTRACT [27-03-2025(online)].pdf 2025-03-27
7 202538029638-DRAWINGS [27-03-2025(online)].pdf 2025-03-27
8 202538029638-DECLARATION OF INVENTORSHIP (FORM 5) [27-03-2025(online)].pdf 2025-03-27
9 202538029638-COMPLETE SPECIFICATION [27-03-2025(online)].pdf 2025-03-27
10 202538029638-FORM-26 [15-05-2025(online)].pdf 2025-05-15
11 202538029638-FORM 3 [02-09-2025(online)].pdf 2025-09-02