“Audio Encoder And Decoder, Methods For Encoding And Decoding An Audio Signal, Audio Stream”

Abstract: An encoder for providing an audio stream on the basis of a transform-domain representation of an input audio signal comprises a quantization error calculator configured to determine a multi-band quantization error over a plurality of frequency bands of the input audio signal for which separate band gain information is available. The encoder also comprises an audio stream provider configured to provide the audio stream such that the audio stream comprises an information describing an audio content of the frequency bands and an information describing the multi-band quantization error. A decoder for providing a decoded representation of an audio signal on the basis of an encoded audio stream representing spectral components of frequency bands of the audio signal comprises a noise filler configured to introduce noise into spectral components of a plurality of frequency bands to which separate frequency band gain information is associated on the basis of a common multi-band noise intensity value.

Patent Information

Application #:
Filing Date: 10 January 2011
Publication Number: 47/2011
Publication Type: INA
Invention Field: ELECTRONICS
Status:
Parent Application:
Patent Number:
Legal Status:
Grant Date: 2020-03-19
Renewal Date:

Applicants

FRAUNHOFER-GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG E.V.

HANSASTR. 27C, 80686 MUENCHEN, GERMANY

Inventors

1. RETTELBACH, NIKOLAUS

SPESSARTSTRASSE 38 90427 NUERNBERG, GERMANY

2. GRILL, BERNHARD

PETER-HENLEIN-STRASSE 7 91207 LAUF, GERMANY

3. FUCHS, GUILLAUME

PARKSTRASSE 12 90409 NUERNBERG, GERMANY

4. GEYERSBERGER, STEFAN

OTTO-ROTH-STRASSE 90 97076 WUERZBURG, GERMANY

5. MULTRUS, MARKUS

ETZLAUBWEG 7 90469 NUERNBERG, GERMANY

6. POPP, HARALD

OBERMICHELBACHER STRASSE 18 90587 TUCHENBACH, GERMANY

7. HERRE, JUERGEN

HALLERSTRASSE 24 91054 BUCKENHOF, GERMANY

8. WABNIK, STEFAN

FICHTENWEG 5 98693 ILMENAU, GERMANY

9. SCHULLER, GERALD

LEOPOLDSTRASSE 13 99089 ERLANGEN, GERMANY

10. HIRSCHFELD, JENS

STEINWEG 32 36266 HERING, GERMANY

Specification

Audio Encoder, Audio Decoder, Methods for Encoding and Decoding an Audio Signal, Audio Stream and Computer Program
Background of the Invention
Embodiments according to the invention are related to an encoder for providing an audio stream on the basis of a transform-domain representation of an input audio signal. Further embodiments according to the invention are related to a decoder for providing a decoded representation of an audio signal on the basis of an encoded audio stream. Further embodiments according to the invention provide methods for encoding an audio signal and for decoding an audio signal. Further embodiments according to the invention provide an audio stream. Further embodiments according to the invention provide computer programs for encoding an audio signal and for decoding an audio signal. Generally speaking, embodiments according to the invention are related to noise filling.
Audio coding concepts often encode an audio signal in the frequency domain. For example, the so-called "advanced audio coding" (AAC) concept encodes the contents of different spectral bins (or frequency bins), taking into consideration a psychoacoustic model. For this purpose, intensity information for different spectral bins is encoded. However, the resolution used for encoding intensities in different spectral bins is adapted in accordance with the psychoacoustic relevance of the different spectral bins. Thus, some spectral bins, which are considered to be of low psychoacoustic relevance, are encoded with a very low intensity resolution, such that some of the spectral bins considered to be of low psychoacoustic relevance, or even a dominant number thereof, are quantized to zero. Quantizing the intensity of a spectral bin to zero brings along the advantage that the quantized zero value can be encoded in a very bit-saving manner, which helps to keep the bit rate as small as possible.
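The effect described above, in which coarsely quantized bands collapse to zero, can be illustrated with a minimal Python sketch. It assumes a plain uniform quantizer with one step size per band; the function name `quantize_bands`, the band layout and the step sizes are hypothetical and are not taken from the specification or from the AAC standard, whose quantizer is more elaborate.

```python
def quantize_bands(spectrum, band_edges, step_sizes):
    """Quantize each band with its own step size (assumed uniform quantizer).

    A larger step size models a lower intensity resolution, as used for
    bands of low psychoacoustic relevance.
    """
    quantized = []
    for band, step in enumerate(step_sizes):
        start, stop = band_edges[band], band_edges[band + 1]
        # Scale by the band's step size, then round to the nearest integer.
        quantized.extend(round(x / step) for x in spectrum[start:stop])
    return quantized


# Hypothetical example: band 0 is quantized finely, band 1 coarsely.
spectrum = [8.0, -6.4, 7.2, 5.9, 0.9, -1.2, 0.4, 1.1]
band_edges = [0, 4, 8]
step_sizes = [1.0, 4.0]
print(quantize_bands(spectrum, band_edges, step_sizes))
# prints [8, -6, 7, 6, 0, 0, 0, 0]
```

In this example every bin of the coarsely quantized second band becomes zero, which is cheap to encode but leaves a spectral hole at the decoder.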
Nevertheless, spectral bins quantized to zero sometimes result in audible artifacts, even if the psychoacoustic model indicates that the spectral bins are of low psychoacoustic relevance. Therefore, there is a desire to deal with spectral bins quantized to zero, both in an audio encoder and an audio decoder.
Different approaches are known for dealing with spectral bins encoded to zero in transform-domain audio coding systems and also in speech coders. For example, the MPEG-4 "AAC" (advanced audio coding) uses the concept of perceptual noise substitution (PNS). The perceptual noise substitution fills complete scale factor bands with noise only. Details regarding the MPEG-4 AAC may, for example, be found in the International Standard ISO/IEC 14496-3 (Information Technology - Coding of Audio-Visual Objects - Part 3: Audio).
Furthermore, the AMR-WB+ speech coder replaces vector quantization vectors (VQ vectors) quantized to zero with a random noise vector, where each complex spectral value has a constant amplitude, but a random phase. The amplitude is controlled by one noise value transmitted with the bitstream. Details regarding the AMR-WB+ speech coder may, for example, be found in the technical specification entitled "Third Generation Partnership Project; Technical Specification Group Services and System Aspects; Audio Codec Processing Functions; Extended Adaptive Multi-Rate-Wide Band (AMR-WB+) Codec; Transcoding Functions (Release Six)", which is also known as "3GPP TS 26.290 V6.3.0 (2005-06) - Technical Specification".
Further, EP 1 395 980 B1 describes an audio coding concept. The publication describes a means by which selected frequency bands of information from an original audio signal, which are audible, but which are perceptually less relevant, need not be encoded, but may be replaced by a noise filling parameter. Those signal bands whose content is perceptually more relevant are, in contrast, fully encoded.
Encoding bits are saved in this manner without leaving voids in the frequency spectrum of the received signal. The noise filling parameter is a measure of the RMS signal value within the band in question and is used at the reception end by a decoding algorithm to indicate the amount of noise to inject in the frequency band in question.
Further approaches provide for a non-guided noise insertion in the decoder, taking into account the tonality of the transmitted spectrum.
However, the conventional concepts typically bring along the problem that they either comprise a poor resolution regarding the granularity of the noise filling, which typically degrades the hearing impression, or require a comparatively large amount of noise filling side information, which requires extra bit rate.
In view of the above, there is a need for an improved concept of noise filling, which provides for an improved trade-off between the achievable hearing impression and the required bit rate.
Summary of the Invention
An embodiment according to the invention creates an encoder for providing an audio stream on the basis of a transform-domain representation of an input audio signal. The encoder comprises a quantization error calculator configured to determine a multi-band quantization error over a plurality of frequency bands (for example, over a plurality of scale factor bands) of the input audio signal, for which separate band gain information (for example, separate scale factors) is available. The encoder also comprises an audio stream provider configured to provide the audio stream such that the audio stream comprises an information describing an audio content of the frequency bands and an information describing the multi-band quantization error.
The above-described encoder is based on the finding that the usage of a multi-band quantization error information brings along the possibility to obtain a good hearing impression on the basis of a comparatively small amount of side information.
In particular, the usage of a multi-band quantization error information, which covers a plurality of frequency bands for which separate band gain information is available, allows for a decoder-sided scaling of noise values, which are based on the multi-band quantization error, in dependence on the band gain information. Accordingly, as the band gain information is typically correlated with a psychoacoustic relevance of the frequency bands or with a quantization accuracy applied to the frequency bands, the multi-band quantization error information has been identified as a side information, which allows for a synthesis of filling noise providing a good hearing impression while keeping the bit-rate cost of the side information low.
In a preferred embodiment, the encoder comprises a quantizer configured to quantize spectral components (for example, spectral coefficients) of different frequency bands of the transform-domain representation using different quantization accuracies in dependence on psychoacoustic relevances of the different frequency bands to obtain quantized spectral components, wherein the different quantization accuracies are reflected by the band gain information. Also, the audio stream provider is configured to provide the audio stream such that the audio stream comprises an information describing the band gain information (for example, in the form of scale factors) and such that the audio stream also comprises the information describing the multi-band quantization error.
In a preferred embodiment, the quantization error calculator is configured to determine the quantization error in the quantized domain, such that a scaling, in dependence on the band gain information of the spectral component, which is performed prior to an integer value quantization, is taken into consideration.
By considering the quantization error in the quantized domain, the psychoacoustic relevance of the spectral bins is considered when calculating the multi-band quantization error. For example, for frequency bands of small perceptual relevance, the quantization may be coarse, such that the absolute quantization error (in the non-quantized domain) is large. In contrast, for spectral bands of high psychoacoustic relevance, the quantization is fine and the quantization error, in the non-quantized domain, is small. In order to make the quantization errors in the frequency bands of high psychoacoustic relevance and of low psychoacoustic relevance comparable, such as to obtain a meaningful multi-band quantization error information, the quantization error is calculated in the quantized domain (rather than in the non-quantized domain) in a preferred embodiment.
In a further preferred embodiment, the encoder is configured to set a band gain information (for example, a scale factor) of a frequency band, which is quantized to zero (for example, in that all spectral bins of the frequency band are quantized to zero) to a value representing a ratio between an energy of the frequency band quantized to zero and an energy of the multi-band quantization error. By setting a scale factor of a frequency band which is quantized to zero to a well-defined value, it is possible to fill the frequency band quantized to zero with a noise, such that the energy of the noise is at least approximately equal to the original signal energy of the frequency band quantized to zero. By adapting the scale factor in the encoder, a decoder can treat the frequency band quantized to zero in the same way as any other frequency bands not quantized to zero, such that there is no need for a complicated exception handling (typically requiring an additional signaling). Rather, by adapting the band gain information (e.g. scale factor), a combination of the band gain value and the multi-band quantization error information allows for a convenient determination of the filling noise.
In a preferred embodiment, the quantization error calculator is configured to determine the multi-band quantization error over a plurality of frequency bands comprising at least one frequency component (e.g. frequency bin) quantized to a non-zero value while avoiding frequency bands entirely quantized to zero. It has been found that a multi-band quantization error information is particularly meaningful if frequency bands entirely quantized to zero are omitted from the calculation. In frequency bands entirely quantized to zero, the quantization is typically very coarse, so that the quantization error information obtained from such a frequency band is typically not particularly meaningful. Rather, the quantization error in the psychoacoustically more relevant frequency bands, which are not entirely quantized to zero, provides a more meaningful information, which allows for a noise filling adapted to the human hearing at the decoder side.
An embodiment according to the invention creates a decoder for providing a decoded representation of an audio signal on the basis of an encoded stream representing spectral components of frequency bands of the audio signal. The decoder comprises a noise filler configured to introduce noise into spectral components (for example, spectral line values or, more generally, spectral bin values) of a plurality of frequency bands to which separate frequency band gain information (for example, scale factors) is associated on the basis of a common multi-band noise intensity value.
The decoder is based on the finding that a single multi-band noise intensity value can be applied for a noise filling with good results if separate frequency band gain information is associated with the different frequency bands.
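The encoder-side calculation described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the method of the specification: it assumes a uniform quantizer with one step size per band, and it assumes that the multi-band value is the mean absolute error over all bins of the contributing bands. The specification excerpt does not fix the exact error measure, and the name `multiband_quantization_error` is hypothetical.

```python
def multiband_quantization_error(spectrum, band_edges, step_sizes):
    """Mean absolute quantization error, measured in the quantized domain.

    Assumptions (not from the specification): uniform quantizer, one step
    size per band, mean absolute error as the error measure.
    """
    total_error = 0.0
    bin_count = 0
    for band, step in enumerate(step_sizes):
        start, stop = band_edges[band], band_edges[band + 1]
        # Scaling with the band gain happens before integer quantization,
        # so the error below is expressed in quantizer steps.
        scaled = [x / step for x in spectrum[start:stop]]
        quantized = [round(v) for v in scaled]
        if all(q == 0 for q in quantized):
            continue  # bands entirely quantized to zero are left out
        total_error += sum(abs(v - q) for v, q in zip(scaled, quantized))
        bin_count += len(scaled)
    return total_error / bin_count if bin_count else 0.0


# Hypothetical example with three bands; the middle band quantizes to zero.
spectrum = [8.25, -6.25, 7.0, 6.0, 1.0, -1.0, 0.5, 1.0, 4.5, 0.5]
band_edges = [0, 4, 8, 10]
step_sizes = [1.0, 4.0, 2.0]
error = multiband_quantization_error(spectrum, band_edges, step_sizes)
```

Because the error is taken after scaling, a finely quantized band and a coarsely quantized band contribute values on the same scale, which is what makes a single multi-band value meaningful.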
Accordingly, an individual scaling of noise introduced in the different frequency bands is possible on the basis of the frequency band gain information, such that, for example, the single common multi-band noise intensity value provides, when taken in combination with separate frequency band gain information, sufficient information to introduce noise in a way adapted to human psychoacoustics. Thus, the concept described herein allows noise filling to be applied in the quantized (but non-rescaled) domain. The noise added in the decoder can be scaled with the psychoacoustic relevance of the band without requiring additional side information (beyond the side information, which is, anyway, required to scale the non-noise audio content of the frequency bands in accordance with the psychoacoustic relevance of the frequency bands).
In a preferred embodiment, the noise filler is configured to selectively decide on a per-spectral-bin basis whether to introduce a noise into individual spectral bins of a frequency band in dependence on whether the respective individual spectral bins are quantized to zero or not. Accordingly, it is possible to obtain a very fine granularity of the noise filling while keeping the quantity of required side information very small. Indeed, it is not required to transmit any frequency-band-specific noise filling side information, while still having an excellent granularity with respect to the noise filling. For example, it is typically required to transmit a band gain factor (e.g. scale factor) for a frequency band even if only a single spectral line (or a single spectral bin) of said frequency band is quantized to a non-zero intensity value. Thus, it can be said that the scale factor information is available for noise filling at no extra cost (in terms of bit rate) if at least one spectral line (or a spectral bin) of the frequency band is quantized to a non-zero intensity.
However, according to a finding of the present invention, it is not necessary to transport frequency-band-specific noise information in order to obtain an appropriate noise filling in such a frequency band in which at least one non-zero spectral bin intensity value exists. Rather, it has been found that psychoacoustically good results can be obtained by using the multi-band noise intensity value in combination with the frequency-band-specific frequency band gain information (e.g. scale factor). Thus, it is not necessary to waste bits on a frequency-band-specific noise filling information. Rather, the transmission of a single multi-band noise intensity value is sufficient, because this multi-band noise filling information can be combined with the frequency band gain information transmitted anyway to obtain frequency-band-specific noise filling information well adapted to the human hearing expectations.
In another preferred embodiment, the noise filler is configured to receive a plurality of spectral bin values representing different overlapping or non-overlapping frequency portions of the first frequency band of a frequency domain audio signal representation, and to receive a plurality of spectral bin values representing different overlapping or non-overlapping frequency portions of the second frequency band of the frequency domain audio signal representation. Further, the noise filler is configured to replace one or more spectral bin values of the first frequency band of the plurality of frequency bands with a first spectral bin noise value, wherein a magnitude of the first spectral bin noise value is determined by the multi-band noise intensity value. In addition, the noise filler is configured to replace one or more spectral bin values of the second frequency band with a second spectral bin noise value having the same magnitude as the first spectral bin noise value.
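The decoder-side behaviour described above (a per-bin decision, one common noise magnitude for all bands, and a band-specific gain applied afterwards) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `noise_fill_and_scale` is hypothetical, the band gains are applied as plain multipliers, and the random sign of the noise value is an assumption, since the text above only fixes its magnitude.

```python
import random


def noise_fill_and_scale(quantized, band_edges, band_gains, noise_intensity, rng=None):
    """Fill zero-quantized bins with noise, then scale each band by its gain.

    Assumptions (not from the specification): linear gain multipliers and a
    random sign for each inserted noise value.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed keeps the sketch reproducible
    output = []
    for band, gain in enumerate(band_gains):
        start, stop = band_edges[band], band_edges[band + 1]
        for q in quantized[start:stop]:
            if q == 0:
                # Per-bin decision: only bins quantized to zero receive noise.
                # The magnitude is the same in every band.
                value = noise_intensity * rng.choice((-1.0, 1.0))
            else:
                value = float(q)
            # Noise and non-noise content of a band share the same band gain.
            output.append(value * gain)
    return output


# Hypothetical example: two bands with gains 1.0 and 4.0.
decoded = noise_fill_and_scale([8, 0, 7, 0, 0, 1, 0, 0], [0, 4, 8], [1.0, 4.0], 0.25)
magnitudes = [abs(v) for v in decoded]
# magnitudes == [8.0, 0.25, 7.0, 0.25, 1.0, 4.0, 1.0, 1.0]
```

The noise is inserted with magnitude 0.25 in both bands, but after scaling it ends up at 0.25 in the first band and 1.0 in the second, so the band gain alone provides the band-specific noise level without extra side information.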
The decoder also comprises a scaler configured to scale spectral bin values of the first frequency band with the first frequency band gain value to obtain scaled spectral bin values of the first frequency band, and to scale spectral bin values of the second frequency band with a second frequency band gain value to obtain scaled spectral bin values of the second frequency band, such that the replaced spectral bin values, replaced with the first and second spectral bin noise values, are scaled with different frequency band gain values, and such that the replaced spectral bin value, replaced with the first spectral bin noise value, and un-replaced spectral bin values of the first frequency band representing an audio content of the first frequency band are scaled with the first frequency band gain value, and such that the replaced spectral bin value, replaced with the second spectral bin noise value, and un-replaced spectral bin values of the second frequency band representing an audio content of the second frequency band are scaled with the second frequency band gain value.
In an embodiment according to the invention, the noise filler is optionally configured to selectively modify a frequency band gain value of a given frequency band using a noise offset value if the given frequency band is quantized to zero. Accordingly, the noise offset serves for minimizing a number of side information bits. Regarding this minimization, it should be noted that the encoding of the scale factors (scf) in an AAC audio coder is performed using a Huffman encoding of the difference of subsequent scale factors (scf). Small differences obtain the shortest codes (while larger differences obtain longer codes). The noise offset minimizes the "mean difference" at a transition from conventional scale factors (scale factors of bands not quantized to zero) to noise scale factors and back, and thus optimizes the bit demand for the side information.
This is due to the fact that normally the "noise scale factors" are larger than the conventional scale factors, as the included lines are not >= 1, but correspond to the mean quantization error e (wherein typically 0

Documents

Application Documents

#	Name	Date
1	abstract-139-kolnp-2011.jpg	2011-10-06
2	139-kolnp-2011-specification.pdf	2011-10-06
3	139-kolnp-2011-pct request form.pdf	2011-10-06
4	139-kolnp-2011-pct priority document notification.pdf	2011-10-06
5	139-KOLNP-2011-PA.pdf	2011-10-06
6	139-KOLNP-2011-IPRB.pdf	2011-10-06
7	139-kolnp-2011-international search report.pdf	2011-10-06
8	139-kolnp-2011-international publication.pdf	2011-10-06
9	139-kolnp-2011-form-5.pdf	2011-10-06
10	139-kolnp-2011-form-3.pdf	2011-10-06
11	139-kolnp-2011-form-2.pdf	2011-10-06
12	139-kolnp-2011-form-13.pdf	2011-10-06
13	139-kolnp-2011-form-1.pdf	2011-10-06
14	139-KOLNP-2011-FORM 3-1.1.pdf	2011-10-06
15	139-KOLNP-2011-FORM 18.pdf	2011-10-06
16	139-kolnp-2011-drawings.pdf	2011-10-06
17	139-kolnp-2011-description (complete).pdf	2011-10-06
18	139-kolnp-2011-correspondence.pdf	2011-10-06
19	139-KOLNP-2011-CORRESPONDENCE-1.3.pdf	2011-10-06
20	139-KOLNP-2011-CORRESPONDENCE-1.1.pdf	2011-10-06
21	139-KOLNP-2011-CORRESPONDENCE 1.2.pdf	2011-10-06
22	139-kolnp-2011-claims.pdf	2011-10-06
23	139-KOLNP-2011-ASSIGNMENT.pdf	2011-10-06
24	139-kolnp-2011-abstract.pdf	2011-10-06
25	Other Patent Document [08-08-2016(online)].pdf	2016-08-08
26	Other Patent Document [12-10-2016(online)].pdf	2016-10-12
27	139-KOLNP-2011-FER.pdf	2016-10-13
28	Other Patent Document [28-02-2017(online)].pdf_620.pdf	2017-02-28
29	Other Patent Document [28-02-2017(online)].pdf	2017-02-28
30	Petition Under Rule 137 [08-04-2017(online)].pdf	2017-04-08
31	Examination Report Reply Recieved [08-04-2017(online)].pdf	2017-04-08
32	Description(Complete) [08-04-2017(online)].pdf_183.pdf	2017-04-08
33	Description(Complete) [08-04-2017(online)].pdf	2017-04-08
34	Claims [08-04-2017(online)].pdf	2017-04-08
35	139-KOLNP-2011-Information under section 8(2) (MANDATORY) [28-08-2017(online)].pdf	2017-08-28
36	139-KOLNP-2011-Information under section 8(2) (MANDATORY) [22-11-2017(online)].pdf	2017-11-22
37	139-KOLNP-2011-Information under section 8(2) (MANDATORY) [22-02-2018(online)].pdf	2018-02-22
38	139-KOLNP-2011-Information under section 8(2) (MANDATORY) [22-05-2018(online)].pdf	2018-05-22
39	139-KOLNP-2011-Information under section 8(2) (MANDATORY) [22-09-2018(online)].pdf	2018-09-22
40	139-KOLNP-2011-Information under section 8(2) (MANDATORY) [13-06-2019(online)].pdf	2019-06-13
41	139-KOLNP-2011-Information under section 8(2) (MANDATORY) [13-06-2019(online)]-1.pdf	2019-06-13
42	139-KOLNP-2011-Information under section 8(2) (MANDATORY) [05-11-2019(online)].pdf	2019-11-05
43	139-KOLNP-2011-Information under section 8(2) (MANDATORY) [05-11-2019(online)]-1.pdf	2019-11-05
44	139-KOLNP-2011-HearingNoticeLetter-(DateOfHearing-28-02-2020).pdf	2020-01-30
45	139-KOLNP-2011-Correspondence to notify the Controller [06-02-2020(online)].pdf	2020-02-06
46	139-KOLNP-2011-Written submissions and relevant documents [14-03-2020(online)].pdf	2020-03-14
47	139-KOLNP-2011-PatentCertificate19-03-2020.pdf	2020-03-19
48	139-KOLNP-2011-IntimationOfGrant19-03-2020.pdf	2020-03-19
49	139-KOLNP-2011-RELEVANT DOCUMENTS [26-09-2021(online)].pdf	2021-09-26
50	139-KOLNP-2011-RELEVANT DOCUMENTS [10-09-2022(online)].pdf	2022-09-10
51	139-KOLNP-2011-RELEVANT DOCUMENTS [07-09-2023(online)].pdf	2023-09-07
52	139-KOLNP-2011-NO [25-06-2025(online)].pdf	2025-06-25

Search Strategy

1	US7343287_31-08-2016.pdf
2	US7212973_31-08-2016.pdf
3	US6092041_31-08-2016.pdf
4	US5960389_31-08-2016.pdf
5	US4956871_31-08-2016.pdf