
Apparatus And Method For Post Processing An Audio Signal Using Prediction Based Shaping

Abstract: Apparatus for post-processing (20) an audio signal comprising: a time-spectrum-converter (700) for converting the audio signal into a spectral representation comprising a sequence of spectral frames; a prediction analyzer (720) for calculating prediction filter data for a prediction over frequency within a spectral frame; a shaping filter (740) controlled by the prediction filter data for shaping the spectral frame to enhance a transient portion within the spectral frame; and a spectrum-time-converter (760) for converting a sequence of spectral frames comprising a shaped spectral frame into a time domain.


Patent Information

Application #
Filing Date
23 September 2019
Publication Number
45/2019
Publication Type
INA
Invention Field
ELECTRONICS
Status
Email
lsdavar@vsnl.com
Parent Application
Patent Number
Legal Status
Grant Date
2024-03-06
Renewal Date

Applicants

FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
Hansastraße 27c 80686 München

Inventors

1. DISCH, Sascha
Wilhelmstraße 70 90766 Fürth
2. UHLE, Christian
Hoher Rain 28 92289 Ursensollen
3. HERRE, Jürgen
Rathsberger Straße 10a 91054 Erlangen
4. PROKEIN, Peter
Walburgastraße 13 91056 Erlangen
5. GAMPP, Patrick
Sieglitzhofer Str. 28 91054 Erlangen
6. KARAMPOURNIOTIS, Antonios
Mendelstrasse 25 90429 Nürnberg
7. HAVENSTEIN, Julia
Gostenhofer Hauptstrasse 58 90443 Nürnberg
8. HELLMUTH, Oliver
Am Ruhstein 29 91054 Buckenhof
9. RICHTER, Daniel
Parkstraße 3 71642 Ludwigsburg

Specification

Apparatus and Method for Post-Processing an Audio Signal using Prediction Based Shaping

The present invention relates to audio signal processing and, in particular, to audio signal post-processing in order to enhance the audio quality by removing coding artifacts.

Audio coding is the domain of signal compression that deals with exploiting redundancy and irrelevance in audio signals using psychoacoustic knowledge. Under low-bitrate conditions, unwanted artifacts are often introduced into the audio signal. Prominent artifacts are temporal pre- and post-echoes that are triggered by transient signal components.

These pre- and post-echoes occur especially in block-based audio processing, since, e.g., the quantization noise of spectral coefficients in a frequency-domain transform coder is spread over the entire duration of one block. Semi-parametric coding tools like gap filling, parametric spatial audio, or bandwidth extension can also lead to echo artifacts confined to parameter bands, since parameter-driven adjustments usually happen within a time block of samples.

The invention relates to a non-guided post-processor that reduces or mitigates subjective quality impairments of transients that have been introduced by perceptual transform coding.

State of the art approaches to prevent pre- and post-echo artifacts within a codec include transform codec block-switching and temporal noise shaping. A state of the art approach to suppress pre- and post-echo artifacts using post-processing techniques behind a codec chain is published in [1].

[1] Imen Samaali, Mania Turki-Hadj Alauane, Gael Mahe, "Temporal Envelope Correction for Attack Restoration in Low Bit-Rate Audio Coding", 17th European Signal Processing Conference (EUSIPCO 2009), Scotland, August 24-28, 2009; and

[2] Jimmy Lapierre and Roch Lefebvre, "Pre-Echo Noise Reduction In Frequency- Domain Audio Codecs", ICASSP 2017, New Orleans.

The first class of approaches needs to be inserted within the codec chain and cannot be applied a posteriori to items that have been coded previously (e.g., archived sound material). Even though the second approach is essentially implemented as a post-processor to the decoder, it still needs control information derived from the original input signal at the encoder side.

It is an object of the present invention to provide an improved concept for post-processing an audio signal.

This object is achieved by an apparatus for post-processing an audio signal of claim 1, a method of post-processing an audio signal of claim 19 or a computer program of claim 20.

An aspect of the present invention is based on the finding that transients can still be localized in audio signals that have been subjected to earlier encoding and decoding, since such earlier coding/decoding operations, although degrading the perceptual quality, do not completely eliminate transients. Therefore, a transient location estimator is provided for estimating a location in time of a transient portion using the audio signal or the time-frequency representation of the audio signal. In accordance with the present invention, a time-frequency representation of the audio signal is manipulated to reduce or eliminate the pre-echo in the time-frequency representation at the location in time before the transient location or to perform a shaping of the time-frequency representation at the transient location and, depending on the implementation, subsequent to the transient location so that an attack of the transient portion is amplified.

In accordance with the present invention, a signal manipulation is performed within a time-frequency representation of the audio signal based on the detected transient location. Thus, an accurate transient location detection and, based on it, a useful pre-echo reduction on the one hand and an attack amplification on the other hand can be obtained by processing operations in the frequency domain, so that the final frequency-time conversion, together with overlap-add operations over more than one frame, automatically smooths and distributes the manipulations over the entire frame. In the end, this avoids audible clicks due to the manipulation of the audio signal and, of course, results in an improved audio signal without any pre-echo or with a reduced amount of pre-echo on the one hand and/or with sharpened attacks for the transient portions on the other hand.

Preferred embodiments relate to a non-guided post-processor that reduces or mitigates subjective quality impairments of transients that have been introduced by perceptual transform coding.

In accordance with a further aspect of the present invention, transient improvement processing is performed without the specific need of a transient location estimator. In this aspect, a time-spectrum converter for converting the audio signal into a spectral representation comprising a sequence of spectral frames is used. A prediction analyzer then calculates prediction filter data for a prediction over frequency within a spectral frame and a subsequently connected shaping filter controlled by the prediction filter data shapes the spectral frame to enhance a transient portion within the spectral frame. The post-processing of the audio signal is completed with the spectrum-time conversion for converting a sequence of spectral frames comprising a shaped spectral frame back into a time domain.

Thus, once again, any modifications are done within a spectral representation rather than in a time domain representation, so that audible clicks etc. due to time domain processing are avoided. Furthermore, because a prediction analyzer calculating prediction filter data for a prediction over frequency within a spectral frame is used, the corresponding time domain envelope of the audio signal is automatically influenced by the subsequent shaping. Particularly, the shaping is done in such a way that, due to the processing within the spectral domain and the use of a prediction over frequency, the time domain envelope of the audio signal is enhanced, i.e., given higher peaks and deeper valleys. In other words, the shaping performs the opposite of smoothing, which automatically enhances transients without the need to actually locate them.

Preferably, two kinds of prediction filter data are derived. The first prediction filter data are for a flattening filter characteristic and the second prediction filter data are for a shaping filter characteristic. In other words, the flattening filter characteristic is an inverse filter characteristic and the shaping filter characteristic is a prediction synthesis filter characteristic. Once again, however, both filter data are derived by performing a prediction over frequency within a spectral frame. Preferably, the time constants for the derivation of the different filter coefficients are different, so that a first time constant is used for calculating the first prediction filter coefficients and a second time constant is used for calculating the second prediction filter coefficients, where the second time constant is greater than the first time constant. This processing automatically makes sure that transient signal portions are influenced much more than non-transient signal portions. In other words, although the processing does not rely on an explicit transient detection method, the transient portions are influenced much more than the non-transient portions by means of the flattening and subsequent shaping based on different time constants.

Thus, in accordance with the present invention and due to the application of a prediction over frequency, an automatic kind of transient improvement procedure is obtained, in which the time domain envelope is enhanced (rather than smoothed).
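To make the filter derivation concrete, the following is a minimal numpy sketch of estimating prediction coefficients over the frequency bins of a spectral frame, with the autocorrelation recursively averaged over frames using two different time constants. All names, the filter order, and the constants alpha are illustrative assumptions, not the claimed implementation:

```python
import numpy as np

def autocorr(x, order):
    """Biased autocorrelation of a magnitude spectrum over frequency bins."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])

def levinson_durbin(r, order):
    """Solve the normal equations; returns a with a[0] = 1 and the residual energy."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[1:i][::-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err

def smoothed_lpc(frames, order, alpha):
    """LPC per frame from an autocorrelation recursively averaged over
    frames; alpha acts as the time constant of the averaging."""
    r_s, coeffs = None, []
    for frame in frames:
        r = autocorr(np.abs(frame), order)
        r[0] += 1e-9                      # tiny regularization
        r_s = r if r_s is None else alpha * r_s + (1.0 - alpha) * r
        coeffs.append(levinson_durbin(r_s, order)[0])
    return coeffs

rng = np.random.default_rng(0)
frames = [rng.standard_normal(64) for _ in range(5)]
a_flatten = smoothed_lpc(frames, order=8, alpha=0.2)  # first (shorter) time constant
a_shape = smoothed_lpc(frames, order=8, alpha=0.8)    # second (longer) time constant
```

As a sanity check, for the AR(1)-style autocorrelation r = [1, 0.5, 0.25], `levinson_durbin` returns coefficients approximately [1, -0.5, 0].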

Embodiments of the present invention are designed as post-processors on previously coded sound material, operating without requiring further guidance information. Therefore, these embodiments can be applied to archived sound material that has been impaired by perceptual coding applied before it was archived.

Preferred embodiments of the first aspect consist of the following main processing steps:

Unguided detection of transient locations within the signal;

Estimation of the duration and strength of the pre-echo preceding the transient;

Derivation of a suitable temporal gain curve for muting the pre-echo artifact;

Ducking/damping of the estimated pre-echo through said adapted temporal gain curve before the transient (to mitigate the pre-echo);

At the attack, mitigation of the dispersion of the attack;

Exclusion of tonal or other quasi-stationary spectral bands from the ducking.
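The gain-curve and ducking steps above can be sketched as follows; the ramp shape, the floor value, and the tonal-bin bypass are illustrative assumptions, not the exact curves of the embodiment:

```python
import numpy as np

def pre_echo_gain_curve(n_frames, transient_frame, width, floor=0.3):
    """Temporal gain: unity everywhere except the estimated pre-echo region,
    which is ducked to `floor` and faded back up toward the transient."""
    g = np.ones(n_frames)
    start = max(0, transient_frame - width)
    g[start:transient_frame] = np.linspace(floor, 1.0,
                                           transient_frame - start,
                                           endpoint=False)
    return g

def duck_pre_echo(stft, gain, tonal_mask=None):
    """Apply the per-frame gain to all bins; tonal bins are excluded."""
    out = stft * gain[np.newaxis, :]          # stft shape: (bins, frames)
    if tonal_mask is not None:
        out[tonal_mask, :] = stft[tonal_mask, :]
    return out
```

The mask argument realizes the last step of the list: bins flagged as tonal or quasi-stationary keep their original values and are never ducked.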

Preferred embodiments of the second aspect consist of the following main processing steps:

Unguided detection of transient locations within the signal (this step is optional);

Sharpening of the attack envelope through application of a frequency-domain linear prediction coefficient (FD-LPC) flattening filter and a subsequent FD-LPC shaping filter, the flattening filter representing a smoothed temporal envelope and the shaping filter representing a less smooth temporal envelope, wherein the prediction gains of both filters are compensated for.
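The flattening/shaping pair can be sketched as an FIR prediction-error filter A(z) run over the frequency bins, followed by an IIR synthesis filter 1/A(z) with different coefficients; the energy-based gain compensation shown here is one simple possibility and an assumption, not the compensation of the embodiment:

```python
import numpy as np

def flatten(spectrum, a):
    """FIR flattening (prediction-error) filter A(z) applied over frequency."""
    return np.convolve(spectrum, a)[:len(spectrum)]

def shape(residual, a):
    """IIR shaping (prediction-synthesis) filter 1/A(z) over frequency."""
    order = len(a) - 1
    y = np.zeros_like(residual)
    for n in range(len(residual)):
        acc = residual[n]
        for j in range(1, min(order, n) + 1):
            acc -= a[j] * y[n - j]
        y[n] = acc
    return y

def enhance_frame(spectrum, a_flat, a_shape):
    """Flatten with the short-time-constant filter, re-shape with the
    long-time-constant filter, then compensate the overall gain."""
    y = shape(flatten(spectrum, a_flat), a_shape)
    e_in = np.sum(np.abs(spectrum) ** 2)
    e_out = np.sum(np.abs(y) ** 2)
    return y * np.sqrt(e_in / max(e_out, 1e-12))
```

Since `shape` exactly inverts `flatten` when given the same coefficients, `shape(flatten(x, a), a)` reproduces `x`, which is a convenient sanity check; the enhancement effect comes entirely from using two different coefficient sets.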

A preferred embodiment is that of a post-processor that implements unguided transient enhancement as the last step in a multi-step processing chain. If other enhancement techniques are to be applied, e.g., unguided bandwidth extension, spectral gap filling, etc., then the transient enhancement preferably comes last in the chain, so that the enhancement includes and is effective on signal modifications that have been introduced by previous enhancement stages.

All aspects of the invention can be implemented as post-processors; one, two or three modules can be computed in series or can share common modules (e.g., (I)STFT, transient detection, tonality detection) for computational efficiency.

It is to be noted that the two aspects described herein can be used independently of each other or together for post-processing an audio signal. The first aspect, relying on transient location detection and pre-echo reduction and attack amplification, can be used to enhance a signal without the second aspect. Correspondingly, the second aspect, based on LPC analysis over frequency and the corresponding shaping filtering within the frequency domain, does not necessarily rely on a transient detection but automatically enhances transients without an explicit transient location detector. This embodiment can be enhanced by a transient location detector, but such a detector is not necessarily required. Furthermore, the second aspect can be applied independently of the first aspect. Additionally, it is to be emphasized that, in other embodiments, the second aspect can be applied to an audio signal that has been post-processed by the first aspect. Alternatively, however, the order can be reversed so that, in a first step, the second aspect is applied and, subsequently, the first aspect is applied in order to post-process an audio signal to improve its audio quality by removing earlier introduced coding artifacts.

Furthermore, it is to be noted that the first aspect basically has two sub-aspects. The first sub-aspect is the pre-echo reduction based on the transient location detection and the second sub-aspect is the attack amplification based on the transient location detection. Preferably, both sub-aspects are combined in series, wherein, even more preferably, the pre-echo reduction is performed first and the attack amplification is performed afterwards. In other embodiments, however, the two sub-aspects can be implemented independently of each other and can even be combined with the second aspect as the case may be. Thus, a pre-echo reduction can be combined with the prediction-based transient enhancement procedure without any attack amplification. In other implementations, a pre-echo reduction is not performed, but an attack amplification is performed together with a subsequent LPC-based transient shaping that does not necessarily require a transient location detection.

In a combined embodiment, the first aspect including both sub-aspects and the second aspect are performed in a specific order, where this order consists of first performing the pre-echo reduction, secondly performing the attack amplification and thirdly performing the LPC-based attack/transient enhancement procedure based on a prediction of a spectral frame over frequency.

Preferred embodiments of the present invention are subsequently discussed with respect to the accompanying drawings in which:

Fig. 1 is a schematic block diagram in accordance with the first aspect;

Fig. 2a is a preferred implementation of the first aspect based on a tonality estimator;

Fig. 2b is a preferred implementation of the first aspect based on a pre-echo width estimation;

Fig. 2c is a preferred embodiment of the first aspect based on a pre-echo threshold estimation;

Fig. 2d is a preferred embodiment of the first sub-aspect related to pre-echo reduction/elimination;

Fig. 3a is a preferred implementation of the first sub-aspect;

Fig. 3b is a preferred implementation of the first sub-aspect;

Fig. 4 is a further preferred implementation of the first sub-aspect;

Fig. 5 illustrates the two sub-aspects of the first aspect of the present invention;

Fig. 6a illustrates an overview over the second sub-aspect;

Fig. 6b illustrates a preferred implementation of the second sub-aspect relying on a division into a transient part and a sustained part;

Fig. 6c illustrates a further embodiment of the division of Fig. 6b;

Fig. 6d illustrates a further implementation of the second sub-aspect;

Fig. 6e illustrates a further embodiment of the second sub-aspect;

Fig. 7 illustrates a block diagram of an embodiment of the second aspect of the present invention;

Fig. 8a illustrates a preferred implementation of the second aspect based on two different filter data;

Fig. 8b illustrates a preferred implementation of the second aspect for the calculation of the two different prediction filter data;

Fig. 8c illustrates a preferred implementation of the shaping filter of Fig. 7;

Fig. 8d illustrates a further implementation of the shaping filter of Fig. 7;

Fig. 8e illustrates a further embodiment of the second aspect of the present invention;

Fig. 8f illustrates a preferred implementation for the LPC filter estimation with different time constants;

Fig. 9 illustrates an overview over a preferred implementation for a post-processing procedure relying on the first sub-aspect and the second sub-aspect of the first aspect of the present invention and additionally relying on the second aspect of the present invention performed on an output of a procedure based on the first aspect of the present invention;

Fig. 10a illustrates a preferred implementation of the transient location detector;

Fig. 10b illustrates a preferred implementation for the detection function calculation of Fig. 10a;

Fig. 10c illustrates a preferred implementation of the onset picker of Fig. 10a;

Fig. 11 illustrates a general setting of the present invention in accordance with the first and/or the second aspect as a transient enhancement post-processor;

Fig. 12.1 illustrates a moving average filtering;

Fig. 12.2 illustrates a single pole recursive averaging and high-pass filtering;

Fig. 12.3 illustrates a time signal prediction and residual;

Fig. 12.4 illustrates an autocorrelation of the prediction error;

Fig. 12.5 illustrates a spectral envelope estimation with LPC;

Fig. 12.6 illustrates a temporal envelope estimation with LPC;

Fig. 12.7 illustrates an attack transient vs. frequency domain transient;

Fig. 12.8 illustrates spectra of a "frequency domain transient";

Fig. 12.9 illustrates the differentiation between transient, onset and attack;

Fig. 12.10 illustrates an absolute threshold in quiet and simultaneous masking;

Fig. 12.11 illustrates a temporal masking;

Fig. 12.12 illustrates a generic structure of a perceptual audio encoder;

Fig. 12.13 illustrates a generic structure of a perceptual audio decoder;

Fig. 12.14 illustrates a bandwidth limitation in perceptual audio coding;

Fig. 12.15 illustrates a degraded attack character;

Fig. 12.16 illustrates a pre-echo artifact;

Fig. 13.1 illustrates a transient enhancement algorithm;

Fig. 13.2 illustrates a transient detection: Detection Function (Castanets);

Fig. 13.3 illustrates a transient detection: Detection Function (Funk);

Fig. 13.4 illustrates a block diagram of the pre-echo reduction method;

Fig. 13.5 illustrates a detection of tonal components;

Fig. 13.6 illustrates a pre-echo width estimation - schematic approach;

Fig. 13.7 illustrates a pre-echo width estimation - examples;

Fig. 13.8 illustrates a pre-echo width estimation - detection function;

Fig. 13.9 illustrates a pre-echo reduction - spectrograms (Castanets);

Fig. 13.10 is an illustration of the pre-echo threshold determination (castanets);

Fig. 13.11 is an illustration of the pre-echo threshold determination for a tonal component;

Fig. 13.12 illustrates a parametric fading curve for the pre-echo reduction;

Fig. 13.13 illustrates a model of the pre-masking threshold;

Fig. 13.14 illustrates a computation of the target magnitude after the pre-echo reduction

Fig. 13.15 illustrates a pre-echo reduction - spectrograms (glockenspiel);

Fig. 13.16 illustrates an adaptive transient attack enhancement;

Fig. 13.17 illustrates a fade-out curve for the adaptive transient attack enhancement;

Fig. 13.18 illustrates autocorrelation window functions;

Fig. 13.19 illustrates a time-domain transfer function of the LPC shaping filter; and

Fig. 13.20 illustrates an LPC envelope shaping - input and output signal.

Fig. 1 illustrates an apparatus for post-processing an audio signal using a transient location detection. Particularly, the apparatus for post-processing is placed, with respect to a general framework, as illustrated in Fig. 11. Particularly, Fig. 11 illustrates an input of an impaired audio signal shown at 10. This input is forwarded to a transient enhancement post-processor 20, and the transient enhancement post-processor 20 outputs an enhanced audio signal as illustrated at 30 in Fig. 11.

The apparatus for post-processing 20 illustrated in Fig. 1 comprises a converter 100 for converting the audio signal into a time-frequency representation. Furthermore, the apparatus comprises a transient location estimator 120 for estimating a location in time of a transient portion. The transient location estimator 120 operates either using the time-frequency representation, as shown by the connection between the converter 100 and the transient location estimator 120, or uses the audio signal within the time domain. This alternative is illustrated by the broken line in Fig. 1. Furthermore, the apparatus comprises a signal manipulator 140 for manipulating the time-frequency representation. The signal manipulator 140 is configured to reduce or to eliminate a pre-echo in the time-frequency representation at a location in time before the transient location, where the transient location is signaled by the transient location estimator 120. Alternatively or additionally, the signal manipulator 140 is configured to perform a shaping of the time-frequency representation, as illustrated by the line between the converter 100 and the signal manipulator 140, at the transient location so that an attack of the transient portion is amplified.

Thus, the apparatus for post-processing in Fig. 1 reduces or eliminates a pre-echo and/or shapes the time-frequency representation to amplify an attack of the transient portion.

Fig. 2a illustrates a tonality estimator 200. Particularly, the signal manipulator 140 of Fig. 1 comprises such a tonality estimator 200 for detecting tonal signal components in the time-frequency representation preceding the transient portion in time. Particularly, the signal manipulator 140 is configured to apply the pre-echo reduction or elimination in a frequency-selective way so that, at frequencies where tonal signal components have been detected, the signal manipulation is reduced or switched off compared to frequencies where tonal signal components have not been detected. In this embodiment, the pre-echo reduction/elimination illustrated by block 220 is therefore frequency-selectively switched on or off, or at least gradually reduced, at frequency locations in certain frames where tonal signal components have been detected. This makes sure that tonal signal components are not manipulated, since, typically, tonal signal components cannot at the same time be a pre-echo or a transient. This is because a transient is typically a broadband event that concurrently influences many frequency bins, whereas a tonal component is, with respect to a certain frame, a certain frequency bin having a peak energy while the other frequencies in this frame have only a low energy.

Furthermore, as illustrated in Fig. 2b, the signal manipulator 140 comprises a pre-echo width estimator 240. This block is configured for estimating a width in time of the pre-echo preceding the transient location. This estimation makes sure that the correct time portion before the transient location is manipulated by the signal manipulator 140 in an effort to reduce or eliminate the pre-echo. The estimation of the pre-echo width in time is based on a development of a signal energy of the audio signal over time in order to determine a pre-echo start frame in the time-frequency representation comprising a plurality of subsequent audio signal frames. Typically, such a development of the signal energy of the audio signal over time will be an increasing or constant signal energy, but will not be a falling energy development over time.
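A minimal sketch of such an energy-based start-frame search follows; the tolerance and the maximum width are illustrative assumptions:

```python
import numpy as np

def pre_echo_start(frame_energies, transient_frame, max_width=8, tol=1.05):
    """Walk backwards from the transient while the energy development toward
    the transient is increasing or constant (within the tolerance `tol`);
    the first frame whose predecessor breaks this trend marks the start."""
    e = np.asarray(frame_energies, dtype=float)
    start = transient_frame
    while (start > 0
           and transient_frame - start < max_width
           and e[start - 1] <= tol * e[start]):
        start -= 1
    return start
```

The returned index is the assumed pre-echo start frame, so `transient_frame - start` is the estimated pre-echo width in frames.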

Fig. 2b illustrates a block diagram of a preferred embodiment of the post-processing in accordance with a first sub-aspect of the first aspect of the present invention, i.e., where a pre-echo reduction or elimination or, as stated in Fig. 2d, a pre-echo "ducking" is performed.

An impaired audio signal is provided at an input 10 and this audio signal is input into a converter 100 that is, preferably, implemented as short-time Fourier transform analyzer operating with a certain block length and operating with overlapping blocks.

Furthermore, the tonality estimator 200 as discussed in Fig. 2a is provided for controlling a pre-echo ducking stage 320 that applies a pre-echo ducking curve 160 to the time-frequency representation generated by block 100 in order to reduce or eliminate pre-echoes. The output of block 320 is then converted back into the time domain using a frequency-time converter 370. This frequency-time converter is preferably implemented as an inverse short-time Fourier transform synthesis block that operates with an overlap-add operation in order to fade in/fade out from each block to the next and thereby avoid blocking artifacts.
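The analysis/synthesis pair around the manipulation can be sketched with a windowed DFT and normalized overlap-add; the frame length, hop size, and Hann window are illustrative choices, not the parameters of the embodiment:

```python
import numpy as np

def stft(x, n=256):
    """Windowed analysis with 50% overlap; returns (frames, window, hop)."""
    hop = n // 2
    w = np.hanning(n + 1)[:-1]                     # periodic Hann window
    frames = np.array([np.fft.rfft(w * x[i:i + n])
                       for i in range(0, len(x) - n + 1, hop)])
    return frames, w, hop

def istft(frames, w, hop, length):
    """Windowed synthesis with overlap-add and per-sample normalization."""
    n = len(w)
    y = np.zeros(length)
    norm = np.zeros(length)
    for f, i in zip(frames, range(0, length - n + 1, hop)):
        y[i:i + n] += w * np.fft.irfft(f, n)
        norm[i:i + n] += w * w
    return y / np.maximum(norm, 1e-12)
```

The synthesis window doubles as the cross-fade between blocks: unmodified frames reconstruct the interior of the signal exactly, and any per-bin manipulation applied between `stft` and `istft` is smoothly distributed by the overlap-add.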

The result of block 370 is the output of the enhanced audio signal 30.

Preferably, the pre-echo ducking curve block 160 is controlled by a pre-echo estimator 150 collecting characteristics related to the pre-echo such as the pre-echo width as determined by block 240 of Fig. 2b or the pre-echo threshold as determined by block 260 or other pre-echo characteristics as discussed with respect to Fig. 3a, Fig. 3b, Fig. 4.

Preferably, as outlined in Fig. 3a, the pre-echo ducking curve 160 can be considered a weighting matrix that has a certain frequency-domain weighting factor for each frequency bin of a plurality of time frames as generated by block 100. Fig. 3a illustrates a pre-echo threshold estimator 260 controlling a spectral weighting matrix calculator 300 corresponding to block 160 in Fig. 2d, which controls a spectral weighter 320 corresponding to the pre-echo ducking operation 320 of Fig. 2d.

Preferably, the pre-echo threshold estimator 260 is controlled by the pre-echo width and also receives information on the time-frequency representation. The same is true for the spectral weighting matrix calculator 300 and, of course, for the spectral weighter 320 that, in the end, applies the weighting factor matrix to the time-frequency representation in order to generate a frequency-domain output signal in which the pre-echo is reduced or eliminated. Preferably, the spectral weighting matrix calculator 300 operates in a certain frequency range equal to or greater than 700 Hz and preferably equal to or greater than 800 Hz. Furthermore, the spectral weighting matrix calculator 300 is limited to calculating weighting factors only for the pre-echo area, which additionally depends on an overlap-add characteristic as applied by the converter 100 of Fig. 1. Furthermore, the pre-echo threshold estimator 260 is configured for estimating pre-echo thresholds for spectral values in the time-frequency representation within a pre-echo width as, for example, determined by block 240 of Fig. 2b, wherein the pre-echo thresholds indicate amplitude thresholds of corresponding spectral values that should occur subsequent to the pre-echo reduction or elimination, i.e., that should correspond to the true signal amplitudes without a pre-echo.

Preferably, the pre-echo threshold estimator 260 is configured to determine the pre-echo threshold using a weighting curve having an increasing characteristic from the start of the pre-echo width to the transient location. Particularly, such a weighting curve is determined by block 350 in Fig. 3b based on the pre-echo width indicated by Mpre. This weighting curve Cm is then applied to spectral values in block 340, where the spectral values have been smoothed beforehand by block 330. Then, as illustrated in block 360, minima are selected as the thresholds for all frequency indices k. Thus, in accordance with a preferred embodiment, the pre-echo threshold estimator 260 is configured to smooth (330) the time-frequency representation over a plurality of subsequent frames and to weight (340) the smoothed time-frequency representation using a weighting curve having an increasing characteristic from the start of the pre-echo width to the transient location. This increasing characteristic makes sure that a certain energy increase or decrease of the normal signal, i.e., a signal without a pre-echo artifact, is allowed.
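A sketch of this threshold estimation follows; the smoothing length and the linearly rising curve are illustrative assumptions, since the actual curve Cm is defined by equations given later in the specification:

```python
import numpy as np

def pre_echo_thresholds(mag, start, transient, smooth_len=3):
    """mag: (bins, frames) magnitudes.  Smooth each bin over a few frames,
    weight the pre-echo region with a curve rising toward the transient,
    and take the per-bin minimum as the threshold th_k."""
    kernel = np.ones(smooth_len) / smooth_len
    sm = np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode='same'), 1, mag)
    curve = np.linspace(1.0, 2.0, transient - start)   # rising weights c_m
    weighted = sm[:, start:transient] * curve[np.newaxis, :]
    return weighted.min(axis=1)
```

Taking the minimum of the weighted, smoothed magnitudes keeps the threshold at the quietest plausible "clean" level within the pre-echo width, while the rising curve tolerates a natural energy build-up toward the transient.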

In a further embodiment, the signal manipulator 140 is configured to use a spectral weights calculator 300, 160 for calculating individual spectral weights for spectral values of the time-frequency representation. Furthermore, a spectral weighter 320 is provided for weighting spectral values of the time-frequency representation using the spectral weights to obtain a manipulated time-frequency representation. Thus, the manipulation is performed within the frequency domain by using weights and by weighting individual time/frequency bins as generated by the converter 100 of Fig. 1.

Preferably, the spectral weights are calculated as illustrated in the specific embodiment illustrated in Fig. 4. The spectral weighter 320 receives, as a first input, the time-frequency representation Xk,m and receives, as a second input, the spectral weights. These spectral weights are calculated by raw weights calculator 450 that is configured to determine raw spectral weights using an actual spectral value and a target spectral value that are both input into this block. The raw weights calculator operates as illustrated in equation 4.18 illustrated later on, but other implementations relying on an actual value on the one hand and a target value on the other hand are useful as well. Furthermore, alternatively or additionally, the spectral weights are smoothed over time in order to avoid artifacts and in order to avoid changes that are too strong from one frame to the other.
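A sketch of the weight calculation; equation 4.18 mentioned in the text is not reproduced here, so the min-based rule and the smoothing constant below are assumptions:

```python
import numpy as np

def raw_weights(actual, target):
    """Per-bin gain: attenuate only where the actual magnitude
    exceeds the target magnitude, never amplify."""
    return np.minimum(1.0, target / np.maximum(actual, 1e-12))

def smooth_weights(w_prev, w, beta=0.5):
    """First-order recursive smoothing over frames to avoid
    overly strong gain changes from one frame to the next."""
    return beta * w_prev + (1.0 - beta) * w
```

Bins already below the target keep a weight of 1 and are left untouched, which matches the frequency-selective ducking described above.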

Preferably, the target value input into the raw weights calculator 450 is specifically calculated by a pre-masking modeler 420. The pre-masking modeler 420 preferably operates in accordance with equation 4.26 defined later, but other implementations can be used as well that rely on psychoacoustic effects and, particularly rely on a pre-masking characteristic that is typically occurring for a transient. The pre-masking modeler 420 is, on the one hand, controlled by a mask estimator 410 specifically calculating a mask relying on the pre-masking type acoustic effect. In an embodiment, the mask estimator 410 operates in accordance with equation 4.21 described later on but, alternatively, other mask estimations can be applied that rely on the psychoacoustic pre-masking effect.

Furthermore, a fader 430 is used for fading in the reduction or elimination of the pre-echo using a fading curve over a plurality of frames at the beginning of the pre-echo width. This fading curve is preferably controlled by the actual value in a certain frame and by the determined pre-echo threshold thk. The fader 430 makes sure that the pre-echo reduction/elimination does not simply start at once, but is smoothly faded in. A preferred implementation is illustrated later on in connection with equation 4.20, but other fading operations are useful as well. Preferably, the fader 430 is controlled by a fading curve estimator 440 controlled by the pre-echo width Mpre as determined, for example, by the pre-echo width estimator 240. Embodiments of the fading curve estimator operate in accordance with equation 4.19 discussed later on, but other implementations are useful as well. All these operations by blocks 410, 420, 430, 440 serve to calculate a certain target value so that, in the end, together with the actual value, a certain weight can be determined by block 450 that is then applied to the time-frequency representation and, particularly, to the specific time/frequency bin subsequent to a preferred smoothing.

Naturally, a target value can also be determined without any pre-masking psychoacoustic effect and without any fading. Then, the target value would be directly the threshold thk, but it has been found that the specific calculations performed by blocks 410, 420, 430, 440 result in an improved pre-echo reduction in the output signal of the spectral weighter 320.

Thus, it is preferred to determine the target spectral value so that the spectral value having an amplitude below a pre-echo threshold is not influenced by the signal manipulation or to determine the target spectral values using the pre-masking model 410, 420 so that a damping of a spectral value in the pre-echo area is reduced based on the pre-masking model 410.

Preferably, the algorithm performed in the converter 100 is such that the time-frequency representation comprises complex-valued spectral values. The signal manipulator, however, is configured to apply real-valued spectral weighting values to the complex-valued spectral values so that, subsequent to the manipulation in block 320, only the amplitudes have been changed, but the phases are the same as before the manipulation.
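The phase-preserving effect of real-valued weighting can be verified with a small sketch (the weights and the spectrum below are illustrative values, not taken from the patent):

```python
import numpy as np

# Sketch: applying real-valued weights to complex spectral values
# changes the magnitudes but leaves the phases intact.
rng = np.random.default_rng(0)
spectrum = rng.standard_normal(8) + 1j * rng.standard_normal(8)
weights = np.linspace(0.2, 1.0, 8)   # real-valued, positive spectral weights

weighted = weights * spectrum

# magnitudes scale by the weights; phases are unchanged
assert np.allclose(np.abs(weighted), weights * np.abs(spectrum))
assert np.allclose(np.angle(weighted), np.angle(spectrum))
```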

Fig. 5 illustrates a preferred implementation of the signal manipulator 140 of Fig. 1. Particularly, the signal manipulator 140 either comprises the pre-echo reducer/eliminator operating before the transient location illustrated at 220 or comprises an attack amplifier operating after/at the transient location as illustrated by block 500. Both blocks 220, 500 are controlled by a transient location as determined by the transient location estimator 120. The pre-echo reducer 220 corresponds to the first sub-aspect and block 500 corresponds to the second sub-aspect in accordance with the first aspect of the present invention. Both aspects can be used alternatively to each other, i.e., without the other aspect as illustrated by the broken lines in Fig. 5. On the other hand, however, it is

preferred to use both operations in the specific order illustrated in Fig. 5, i.e., that the pre-echo reducer 220 is operative and the output of the pre-echo reducer/eliminator 220 is input into the attack amplifier 500.

Fig. 6a illustrates a preferred embodiment of the attack amplifier 500. Again, the attack amplifier 500 comprises a spectral weights calculator 610 and a subsequently connected spectral weighter 620. Thus, the signal manipulator is configured to amplify 500 spectral values within a transient frame of the time-frequency representation and preferably to additionally amplify spectral values within one or more frames following the transient frame within the time-frequency representation.

Preferably, the signal manipulator 140 is configured to only amplify spectral values above a minimum frequency, where this minimum frequency is greater than 250 Hz and lower than 2 kHz. The amplification can be performed until the upper border frequency, since attacks at the beginning of the transient location typically extend over the whole high frequency range of the signal.

Preferably, the signal manipulator 140 and, particularly, the attack amplifier 500 of Fig. 5 comprises a divider 630 for dividing the frame into a transient part on the one hand and a sustained part on the other hand. The transient part is then subjected to the spectral weighting and, additionally, the spectral weights are also calculated depending on information on the transient part. Then, only the transient part is spectrally weighted, and the result of blocks 610, 620 in Fig. 6b on the one hand and the sustained part as output by the divider 630 on the other hand are finally combined within a combiner 640 in order to output an audio signal where an attack has been amplified. Thus, the signal manipulator 140 is configured to divide 630 the time-frequency representation at the transient location into a sustained part and the transient part and, preferably, to additionally divide frames subsequent to the transient location as well. The signal manipulator 140 is configured to only amplify the transient part and to not amplify or manipulate the sustained part.

As stated, the signal manipulator 140 is configured to also amplify a time portion of the time-frequency representation subsequent to the transient location in time using a fade-out characteristic 685 as illustrated by block 680. Particularly, the spectral weights calculator 610 comprises a weighting factor determiner 680 receiving information on the transient part on the one hand, on the sustained part on the other hand, on the fade-out curve Gm 685 and preferably also receiving information on the amplitude of the

corresponding spectral value Xk,m. Preferably, the weighting factor determiner 680 operates in accordance with equation 4.29 discussed later on, but other implementations relying on information on the transient part, on the sustained part and the fade-out characteristic 685 are useful as well.

Subsequent to the weighting factor determination 680, a smoothing across frequency is performed in block 690 and, then, at the output of block 690, the weighting factors for the individual frequency values are available and are ready to be used by the spectral weighter 620 in order to spectrally weight the time/frequency representation. Preferably, the maximum amplification, as determined, for example, by a maximum of the fade-out characteristic 685, is predetermined and lies between 150 % and 300 %. In a preferred embodiment, a maximum amplification factor of 2.2 is used that decreases, over a number of frames, to a value of 1, where, as illustrated in Fig. 13.17, such a decrease is obtained, for example, after 60 frames. Although Fig. 13.17 illustrates a kind of exponential decay, other decays, such as a linear decay or a cosine decay, can be used as well.
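As an illustration, a hypothetical fade-out characteristic of this kind can be sketched as follows; the decay constant tau is an assumption, while the maximum factor of 2.2 and the 60-frame horizon come from the description above:

```python
import numpy as np

# Hypothetical sketch of the fade-out characteristic 685: a maximum
# amplification factor (2.2) decaying toward 1 over a number of frames
# (60). The exponential shape and the decay constant tau are assumptions.
def fade_out_curve(g_max=2.2, n_frames=60, tau=15.0):
    m = np.arange(n_frames)
    return 1.0 + (g_max - 1.0) * np.exp(-m / tau)

curve = fade_out_curve()
assert abs(curve[0] - 2.2) < 1e-9   # starts at the maximum amplification
assert curve[-1] < 1.03             # has almost decayed to 1 after 60 frames
assert np.all(np.diff(curve) < 0)   # monotonically decreasing
```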

Preferably, the result of the signal manipulation 140 is converted from the frequency domain into the time domain using a spectral-time converter 370 illustrated in Fig. 2d. Preferably, the spectral-time converter 370 applies an overlap-add operation involving at least two adjacent frames of the time-frequency representation, but multi-overlap procedures can be used as well, wherein an overlap of three or four frames is used.

Preferably, the converter 100 on the one hand and the other converter 370 on the other hand apply the same hop size between 1 and 3 ms or an analysis window having a window length between 2 and 6 ms. And, preferably, the overlap range on the one hand, the hop size on the other hand or the windows applied by the time-frequency converter 100 and the frequency-time converter 370 are equal to each other.

Fig. 7 illustrates an apparatus for post-processing 20 of an audio signal in accordance with the second aspect of the present invention. The apparatus comprises a time-spectrum converter 700 for converting the audio signal into a spectral representation comprising a sequence of spectral frames. Additionally, a prediction analyzer 720 for calculating prediction filter data for a prediction over frequency within the spectral frame is used. The prediction analyzer 720 operating over frequency generates filter data for a frame, and this filter data is used by a shaping filter 740 for shaping the spectral frame to enhance a transient portion within the spectral frame. The output of the shaping filter 740 is forwarded to a spectrum-time converter 760 for converting a sequence of spectral frames comprising a shaped spectral frame into a time domain.

Preferably, the prediction analyzer 720 on the one hand or the shaping filter 740 on the other hand operate without an explicit transient location detection. Instead, due to the prediction over frequency applied by block 720 and due to the shaping to enhance the transient portion generated by block 740, a time envelope of the audio signal is manipulated so that a transient portion is enhanced automatically, without any specific transient detection. However, as the case may be, blocks 720, 740 can also be supported by an explicit transient location detection in order to make sure that any potential artifacts are not impressed into the audio signal at non-transient portions.

Preferably, the prediction analyzer 720 is configured to calculate first prediction filter data 720a for a flattening filter characteristic 740a and second prediction filter data 720b for a shaping filter characteristic 740b as illustrated in Fig. 8a. In particular, the prediction analyzer 720 receives, as an input, a complete frame of the sequence of frames and then performs an operation for the prediction analysis over frequency in order to obtain either the flattening filter characteristic or the shaping filter characteristic. The flattening filter characteristic is the filter characteristic that, in the end, resembles an inverse filter that can also be represented by an FIR (finite impulse response) characteristic 740a, while the second filter data for the shaping corresponds to a synthesis or IIR filter characteristic (IIR = Infinite Impulse Response) illustrated at 740b.

Preferably, the degree of shaping represented by the second filter data 720b is greater than the degree of flattening 720a represented by the first filter data so that, subsequent to the application of the shaping filter having both characteristics 740a, 740b, a kind of an "over-shaping" of the signal is obtained that results in a temporal envelope being less flat than the original temporal envelope. This is exactly what is required for a transient enhancement.

Although Fig. 8a illustrates a situation in which two different filter characteristics, one shaping filter and one flattening filter, are calculated, other embodiments rely on a single shaping filter characteristic. This is due to the fact that a signal can, of course, also be shaped without a preceding flattening so that, in the end, once again an over-shaped signal that automatically has improved transients is obtained. This effect of the over-shaping may be controlled by a transient location detector, but this transient location detector is not required due to a preferred implementation of a signal manipulation that automatically influences non-transient portions less than transient portions. Both procedures fully rely on the fact that the prediction over frequency is applied by the prediction analyzer 720 in order to obtain information on the time envelope of the time domain signal that is then manipulated in order to enhance the transient nature of the audio signal.

In this embodiment, an autocorrelation signal is calculated from a spectral frame as illustrated at 800 in Fig. 8b. A window with a first time constant is then used for windowing the result of block 800 as illustrated in block 802. Furthermore, a window having a second time constant being greater than the first time constant is used for windowing the autocorrelation signal obtained by block 800, as illustrated in block 804. From the result signal obtained from block 802, the first prediction filter data are calculated as illustrated by block 806, preferably by applying a Levinson-Durbin recursion. Similarly, the second prediction filter data 808 are calculated from block 804 with the greater time constant. Once again, block 808 preferably uses the same Levinson-Durbin algorithm.

Due to the fact that the autocorrelation signal is windowed with windows having two different time constants, the - automatic - transient enhancement is obtained. Typically, the windowing is such that the different time constants only have an impact on one class of signals but do not have an impact on the other class of signals. Transient signals are actually influenced by means of the two different time constants, while non-transient signals have such an autocorrelation signal that windowing with the second larger time constant results in almost the same output as windowing with the first time constant. With respect to Figs. 13 and 18, this is due to the fact that non-transient signals do not have any significant peaks at high time lags and, therefore, using two different time constants does not make any difference with respect to these signals. However, this is different for transient signals. Transient signals have peaks at higher time lags and, therefore, applying different time constants to the autocorrelation signal that actually has the peaks at higher time lags as illustrated in Figs. 13 and 18 at 1300, for example, results in different outputs for the different windowing operations with different time constants.
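The two-window procedure above can be sketched as follows; the example frame, the window shape, and all parameter values are illustrative assumptions, and a Toeplitz solver stands in for the Levinson-Durbin recursion (both solve the same normal equations):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def lag_window(r, alpha):
    # exponential-decay lag window; a larger alpha means a shorter time constant
    return r * np.exp(-alpha * np.arange(len(r)))

def predict_coeffs(r, p):
    # solve the normal equations (a symmetric Toeplitz system) for order p
    return solve_toeplitz(r[:p], r[1:p + 1])

# illustrative spectral frame: a damped oscillation plus a little noise
rng = np.random.default_rng(0)
n = np.arange(64)
frame = np.cos(0.3 * n) * np.exp(-0.05 * n) + 0.1 * rng.standard_normal(64)

# autocorrelation of the frame (block 800), lags 0..63
r = np.correlate(frame, frame, mode='full')[len(frame) - 1:]

p = 8
a_first = predict_coeffs(lag_window(r, alpha=0.5), p)    # first filter data (806)
a_second = predict_coeffs(lag_window(r, alpha=0.05), p)  # second filter data (808)

# the two time constants lead to different prediction filter data
assert not np.allclose(a_first, a_second)
```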

Depending on the implementation, the shaping filter can be implemented in many different ways. One way is illustrated in Fig. 8c and is a cascade of a flattening sub-filter controlled by the first filter data 806 as illustrated at 809 and a shaping sub-filter controlled by the second filter data 808 as illustrated at 810 and a gain compensator 811 that is also implemented in the cascade.

However, the two different filter characteristics and the gain compensation can also be implemented within a single shaping filter 740, where the combined filter characteristic of the shaping filter 740 is calculated by a filter characteristic combiner 820 relying, on the one hand, on both the first and second filter data and, on the other hand, on the gains of the first filter data and the second filter data in order to also implement the gain compensation function 811. Thus, with respect to the Fig. 8d embodiment, in which a combined filter is applied, the frame is input into a single shaping filter 740 and the output is the shaped frame that has both filter characteristics, on the one hand, and the gain compensation functionality, on the other hand, applied to it.

Fig. 8e illustrates a further implementation of the second aspect of the present invention, in which the functionality of the combined shaping filter 740 of Fig. 8d is illustrated in line with Fig. 8c. It is to be noted that Fig. 8e can actually be an implementation of three separate stages 809, 810, 811 but can, at the same time, be seen as a logical representation that is practically implemented using a single filter having a filter characteristic with a numerator and a denominator, in which the numerator has the inverse/flattening filter characteristic and the denominator has the synthesis characteristic and in which, additionally, a gain compensation is included as, for example, illustrated in equation 4.33 discussed later on.
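A compact sketch of such a combined filtering with gain compensation follows; the coefficient values and the function name are illustrative assumptions, not taken from the patent:

```python
import numpy as np
from scipy.signal import lfilter

def shape_frame(frame, a_flat, a_shape):
    # H(z) = A_flat(z) / A_shape(z): the numerator flattens (inverse/FIR
    # characteristic), the denominator shapes (synthesis/IIR characteristic)
    shaped = lfilter(a_flat, a_shape, frame)
    # gain compensation (cf. block 811): preserve the energy of the frame
    gain = np.sqrt(np.sum(frame ** 2) / np.sum(shaped ** 2))
    return gain * shaped

rng = np.random.default_rng(0)
frame = rng.standard_normal(256)
a_flat = np.array([1.0, -0.3])   # assumed mild flattening characteristic
a_shape = np.array([1.0, -0.8])  # assumed stronger shaping characteristic

shaped = shape_frame(frame, a_flat, a_shape)
# the shaped frame keeps the energy of the input frame
assert np.isclose(np.sum(shaped ** 2), np.sum(frame ** 2))
```

The stronger denominator coefficient reflects the preference stated above that the degree of shaping exceeds the degree of flattening.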

Fig. 8f illustrates the functionality of the windowing obtained by blocks 802, 804 of Fig. 8b, in which r(k) is the autocorrelation signal, Wlag is the window, and r'(k) is the output of the windowing, i.e., the output of blocks 802, 804. Additionally, a window function is exemplarily illustrated that, in the end, represents an exponential decay filter having two different time constants that can be set by using a certain value for a in Fig. 8f.

Thus, applying a window to the autocorrelation values prior to the Levinson-Durbin recursion results in an expansion of the time support at local temporal peaks. In particular, the expansion using a Gaussian window is described by Fig. 8f. Embodiments here rely on the idea of deriving a temporal flattening filter that has a greater expansion of time support at local non-flat envelopes than the subsequent shaping filter, through the choice of different values of the parameter a. Together, these filters result in a sharpening of temporal attacks in the signal. In addition, the prediction gains of the filters are compensated such that the spectral energy of the filtered spectral region is preserved.

Thus, a signal flow of a frequency domain-LPC based attack shaping is obtained as illustrated in Fig. 8a to 8e.

Fig. 9 illustrates a preferred implementation of embodiments that rely on both the first aspect, illustrated by blocks 100 to 370 in Fig. 9, and a subsequently performed second aspect, illustrated by blocks 700 to 760. Preferably, the second aspect relies on a separate time-spectrum conversion that uses a large frame size, such as a frame size of 512 samples with an overlap of 50 %. On the other hand, the first aspect relies on a small frame size in order to have a better time resolution for transient location detection. Such a smaller frame size is, for example, a frame size of 128 samples with an overlap of 50 %. Generally, it is preferred to use separate time-spectrum conversions for the first and the second aspect, in which the frame size for the second aspect is greater (the time resolution is lower but the frequency resolution is higher), while the time resolution for the first aspect is higher with a correspondingly lower frequency resolution.

Fig. 10a illustrates a preferred implementation of the transient location estimator 120 of Fig. 1. The transient location estimator 120 can be implemented as known in the art but, in the preferred embodiment, relies on a detection function calculator 1000 and the subsequently connected onset picker 1100 so that, in the end, a binary value for each frame indicating a presence of a transient onset in the frame is obtained.

The detection function calculator 1000 relies on several steps illustrated in Fig. 10b. These are a summing up of energy values in block 1020. In block 1030, a computation of temporal envelopes is performed. Subsequently, in step 1040, a high-pass filtering of each bandpass signal temporal envelope is performed. In step 1050, a summing up of the resulting high-pass filtered signals in the frequency direction is performed, and in block 1060, an accounting for the temporal post-masking is performed so that, in the end, a detection function is obtained.

Fig. 10c illustrates a preferred way of onset picking from the detection function as obtained by block 1060. In step 1110, local maxima (peaks) are found in the detection function. In block 1120, a threshold comparison is performed in order to only keep peaks for further processing that are above a certain minimum threshold.

In block 1130, the area around each peak is scanned for a larger peak in order to determine from this area the relevant peaks. The area around the peaks extends a number of lb frames before the peak and a number of la frames subsequent to the peak.

In block 1140, close peaks are discarded so that, in the end, the transient onset frame indices mi are determined.

Subsequently, technical and auditory concepts that are utilized in the proposed transient enhancement methods are disclosed. First, some basic digital signal processing techniques regarding selected filtering operations and linear prediction will be introduced, followed by a definition of transients. Subsequently, the psychoacoustic concept of auditory masking, which is exploited in the perceptual coding of audio content, is explained. This portion closes with a brief description of a generic perceptual audio codec and the induced compression artifacts that are subject to the enhancement methods in accordance with the invention.

Smoothing and differentiating filters

The transient enhancement methods described later on make frequent use of some particular filtering operations. An introduction to these filters will be given in the section below. Refer to [9, 10] for a more detailed description. Eq. (2.1) describes a finite impulse response (FIR) low-pass filter that computes the current output sample value yn as the mean value of the current and past samples of an input signal xn. The filtering process of this so-called moving average filter is given by
y_n = (1/(p+1)) · Σ_{r=0}^{p} x_{n−r}    (2.1)

where p is the filter order. The top image of Figure 12.1 shows the result of the moving average filter operation in Eq. (2.1) for an input signal xn. The output signal yn in the bottom image was computed by applying the moving average filter two times on xn in both forward and backward direction. This compensates the filter delay and also results in a smoother output signal yn since xn is filtered two times.
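A minimal sketch of the moving-average smoothing described above, including the forward-backward application that compensates the filter delay (the example input is illustrative):

```python
import numpy as np

# Moving-average filter of Eq. (2.1): each output sample is the mean of
# the current and the p past input samples (truncated at the signal start).
def moving_average(x, p):
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = np.mean(x[max(0, n - p):n + 1])
    return y

# forward-backward application: filter, reverse, filter, reverse again
def smooth(x, p):
    return moving_average(moving_average(x, p)[::-1], p)[::-1]

x = np.array([0., 0., 0., 1., 1., 1., 0., 0., 0.])
y = smooth(x, 2)
assert np.argmax(y) == 4            # smoothed pulse peaks at the center
assert 0.0 <= y.min() and y.max() <= 1.0
```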

A different way to smooth a signal is to apply a single pole recursive averaging filter, that is given by the following difference equation:
y_n = α · y_{n−1} + (1 − α) · x_n,  0 < α < 1    (2.2)

with y0 = x1 and N denoting the number of samples in xn. Figure 12.2 (a) displays the result of a single-pole recursive averaging filter applied to a rectangular function. In (b) the filter was applied in both directions to further smooth the signal. By applying the recursive averaging of Eq. (2.2) separately to the attack phase and to the decay phase, where xn and yn are the input and output signals of Eq. (2.2), respectively, the resulting output signals directly follow the attack or the decay phase of the input signal. Figure 12.2 (c) shows the attack-following output as the solid black curve and the decay-following output as the dashed black curve.

Strong amplitude increments or decrements of an input signal xn can be detected by filtering xn with an FIR high-pass filter as
y_n = Σ_{r=0}^{p} b_r · x_{n−r}    (2.5)

with b = [1, -1] or b = [1, 0, . . . , -1]. The resulting signal after high-pass filtering the rectangular function is shown in Figure 12.2 (d) as the black curve.
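The recursive averaging of Eq. (2.2) and the difference high-pass with b = [1, -1] can be sketched as follows (the smoothing coefficient is an assumed value):

```python
import numpy as np

# Single-pole recursive averaging filter, Eq. (2.2) style: the output is
# a weighted mix of the previous output and the current input.
def recursive_average(x, alpha=0.8):
    y = np.empty(len(x))
    y[0] = x[0]
    for n in range(1, len(x)):
        y[n] = alpha * y[n - 1] + (1.0 - alpha) * x[n]
    return y

# FIR high-pass with b = [1, -1]: flags strong amplitude changes
def highpass(x):
    return np.diff(x, prepend=x[0])

rect = np.concatenate([np.zeros(5), np.ones(10), np.zeros(5)])
smoothed = recursive_average(rect)
edges = highpass(rect)
assert np.argmax(edges) == 5    # rising edge detected at sample 5
assert np.argmin(edges) == 15   # falling edge detected at sample 15
```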

Linear Prediction

Linear prediction (LP) is a useful method for the encoding of audio. Some past studies particularly describe its ability to model the speech production process [11, 12, 13], while others also apply it for the analysis of audio signals in general [14, 15, 16, 17]. The following section is based on [11, 12, 13, 15, 18].

In linear predictive coding (LPC) a sampled time signal sn = s(nT), with T being the sampling period, can be predicted by a weighted linear combination of its past values in the form of
s_n = Σ_{r=1}^{p} a_r · s_{n−r} + G · u_n    (2.6)

where n is the time index that identifies a certain time sample of the signal, p is the prediction order, ar, with 1 ≤ r ≤ p, are the linear prediction coefficients (in this case the filter coefficients of an all-pole infinite impulse response (IIR) filter), G is the gain factor and un is some input signal that excites the model. By taking the z-transform of Eq. (2.6), the corresponding all-pole transfer function H(z) of the system is
H(z) = G / (1 − Σ_{r=1}^{p} a_r · z^{−r}) = G / A(z)    (2.7)

where
A(z) = 1 − Σ_{r=1}^{p} a_r · z^{−r}    (2.8)

The IIR filter H(z) is called the synthesis or LPC filter, while the FIR filter A(z) = 1 − Σ_{r=1}^{p} a_r · z^{−r} is referred to as the inverse filter. Using the prediction coefficients ar as the filter coefficients of an FIR filter, a prediction ŝn of the signal sn can be obtained by
ŝ_n = Σ_{r=1}^{p} a_r · s_{n−r}    (2.9)

This results in a prediction error en,p between the predicted signal ŝn and the actual signal sn, which can be formulated by
e_{n,p} = s_n − ŝ_n = s_n − Σ_{r=1}^{p} a_r · s_{n−r}    (2.10)

with the equivalent representation of the prediction error in the z-domain being
E_p(z) = A(z) · S(z)    (2.11)

Figure 12.3 shows the original signal sn, the predicted signal ŝn and the difference signal en,p, with a prediction order p = 10. This difference signal en,p is also called the residual. In Figure 12.4 the autocorrelation function of the residual shows almost complete decorrelation between neighboring samples, which indicates that en,p can be seen approximately as white Gaussian noise. Using en,p from Eq. (2.10) as the input signal un in Eq. (2.6), or filtering Ep(z) from Eq. (2.11) with the all-pole filter H(z) from Eq. (2.7) (with G = 1), the original signal can be perfectly recovered by
s_n = Σ_{r=1}^{p} a_r · s_{n−r} + e_{n,p}    (2.12)

and
S(z) = E_p(z) · H(z) = E_p(z) / A(z)    (2.13)

respectively.

With increasing prediction order p the energy of the residual decreases. Besides the number of predictor coefficients, the residual energy also depends on the coefficients themselves. Therefore, the problem in linear predictive coding is how

to obtain the optimal filter coefficients ar, so that the energy of the residual is minimized. First, we take the total squared error (total energy) of the residual from a windowed signal block xn = sn · wn, where wn is some window function of width N, and its prediction by
E = Σ_n e_n² = Σ_n (x_n − x̂_n)²    (2.14)

with
x̂_n = Σ_{r=1}^{p} a_r · x_{n−r}    (2.15)

To minimize the total squared error E, the gradient of Eq. (2.14) has to be computed with respect to each ar and set to zero:
∂E/∂a_r = 0,  1 ≤ r ≤ p    (2.16)

This leads to the so-called normal equations:
Σ_{r=1}^{p} a_r · R_{|i−r|} = R_i,  1 ≤ i ≤ p    (2.17)

Ri denotes the autocorrelation of the signal xn as
R_i = Σ_{n=0}^{N−1−i} x_n · x_{n+i}    (2.18)

Eq. (2.17) forms a system of p linear equations, from which the p unknown prediction coefficients ar, 1 ≤ r ≤ p, which minimize the total squared error, can be computed. With Eq. (2.14) and Eq. (2.17), the minimum total squared error Ep can be obtained by
E_p = R_0 − Σ_{r=1}^{p} a_r · R_r    (2.19)

A fast way to solve the normal equations in Eq. (2.17) is the Levinson-Durbin algorithm [19]. The algorithm works recursively, which brings the advantage that it yields the predictor coefficients for the current order and for all previous orders less than p. First, the algorithm is initialized by setting
E_0 = R_0    (2.20)

Subsequently, for the prediction orders m = 1, . . . , p, the prediction coefficients ar(m), which are the coefficients ar of the current order m, are computed with the partial correlation coefficients ρm as follows:
ρ_m = ( R_m − Σ_{r=1}^{m−1} a_r(m−1) · R_{m−r} ) / E_{m−1}    (2.21)

a_m(m) = ρ_m    (2.22)

a_r(m) = a_r(m−1) − ρ_m · a_{m−r}(m−1),  1 ≤ r ≤ m − 1    (2.23)

E_m = (1 − ρ_m²) · E_{m−1}    (2.24)

With every iteration the minimum total squared error Em of the current order m is computed in Eq. (2.24). Since Em is always positive and with E0 = R0, it can be shown that with increasing order m the minimum total energy decreases, so that we have
0 ≤ E_p ≤ E_{p−1} ≤ . . . ≤ E_1 ≤ E_0 = R_0
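A minimal sketch of the recursion described above follows; it is written in the inverse-filter convention, returning the coefficients of A(z) as [1, −a_1, . . . , −a_p], and the test signal is illustrative:

```python
import numpy as np

# Levinson-Durbin recursion: solves the normal equations for order p from
# autocorrelation values r[0..p], returning the inverse-filter coefficients
# a = [1, -a_1, ..., -a_p] and the minimum total squared errors E_0..E_p.
def levinson_durbin(r, p):
    a = np.zeros(p + 1)
    a[0] = 1.0
    errors = [r[0]]                        # E_0 = R_0
    for m in range(1, p + 1):
        acc = r[m] + np.dot(a[1:m], r[m - 1:0:-1])
        k = -acc / errors[-1]              # partial correlation coefficient
        a_prev = a.copy()
        for i in range(1, m):
            a[i] = a_prev[i] + k * a_prev[m - i]
        a[m] = k
        errors.append(errors[-1] * (1.0 - k * k))
    return a, errors

# illustrative test signal: damped oscillation plus a little noise
rng = np.random.default_rng(1)
n = np.arange(128)
x = np.cos(0.3 * n) * np.exp(-0.02 * n) + 0.1 * rng.standard_normal(128)
r = np.correlate(x, x, mode='full')[len(x) - 1:]

a, errors = levinson_durbin(r, 10)
# the minimum total energy decreases (or stays equal) with increasing order
assert all(e2 <= e1 + 1e-9 for e1, e2 in zip(errors, errors[1:]))
```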

Claims

1. Apparatus for post-processing (20) an audio signal, comprising:

a time-spectrum-converter (700) for converting the audio signal into a spectral representation comprising a sequence of spectral frames;

a prediction analyzer (720) for calculating prediction filter data for a prediction over frequency within a spectral frame;

a shaping filter (740) controlled by the prediction filter data for shaping the spectral frame to enhance a transient portion within the spectral frame; and

a spectrum-time-converter (760) for converting a sequence of spectral frames comprising a shaped spectral frame into a time domain.

2. Apparatus of claim 1,

wherein the prediction analyzer (720) is configured to calculate first prediction filter data (720a) for a flattening filter characteristic (740a) and second prediction filter data (720b) for a shaping filter characteristic (740b).

3. Apparatus of claim 2,

wherein the prediction analyzer (720) is configured for calculating the first prediction filter data (720a) using a first time constant and to calculate the second prediction filter data using a second time constant (720b), the second time constant being greater than the first time constant.

4. Apparatus of claim 2 or 3,

wherein the flattening filter characteristic (740a) is an analysis FIR filter characteristic or an all zero filter characteristic resulting, when applied to the spectral frame, in a modified spectral frame having a flatter temporal envelope compared to a temporal envelope of the spectral frame; or

wherein the shaping filter characteristic (740b) is a synthesis IIR filter characteristic or an all pole filter characteristic resulting, when applied to a spectral frame, in a modified spectral frame having a less flat temporal envelope compared to a temporal envelope of the spectral frame.

5. Apparatus of one of the preceding claims,

wherein the prediction analyzer (720) is configured:

to calculate (800) an autocorrelation signal from the spectral frame;

to window (802, 804) the autocorrelation signal using a window with a first time constant or with a second time constant, the second time constant being greater than the first time constant;

to calculate (806, 808) first prediction filter data from a windowed autocorrelation signal windowed using the first time constant or to calculate second prediction filter coefficients from a windowed autocorrelation signal windowed using the second time constant; and

wherein the shaping filter (740) is configured to shape the spectral frame using the second prediction filter coefficients or using the second prediction filter coefficients and the first prediction filter coefficients.

6. Apparatus of one of the preceding claims,

wherein the shaping filter (740) comprises a cascade of two controllable sub-filters (809, 810), a first sub-filter (809) being a flattening filter having a flattening filter characteristic and a second sub-filter (810) being a shaping filter having a shaping filter characteristic,

wherein the sub-filters (809, 810) are both controlled by the prediction filter data derived by the prediction analyzer (720), or

wherein the shaping filter (740) is a filter having a combined filter characteristic derived by combining (820) a flattening characteristic and a shaping characteristic, wherein the combined characteristic is controlled by the prediction filter data derived from the prediction analyzer (720).

7. Apparatus of claim 6,

wherein the prediction analyzer (720) is configured to determine

the prediction filter data so that using prediction filter data for the shaping filter (740) results in a degree of shaping being higher than a degree of flattening obtained by using the prediction filter data for the flattening filter characteristic.

8. Apparatus of one of the preceding claims,

wherein the prediction analyzer (720) is configured to apply (806, 808) a Levinson-Durbin algorithm to a filtered autocorrelation signal derived from the spectral frame.

9. Apparatus of one of the preceding claims,

wherein the shaping filter (740) is configured to apply a gain compensation so that an energy of a shaped spectral frame is equal to an energy of the spectral frame generated by the time-spectral-converter (700) or is within a tolerance range of

±20% of an energy of the spectral frame.

10. Apparatus of one of the preceding claims,

wherein the shaping filter (740) is configured to apply a flattening filter characteristic (740a) having a flattening gain and a shaping filter characteristic (740b) having a shaping gain, and

wherein the shaping filter (740) is configured to perform a gain compensation for compensating an influence of the flattening gain and the shaping gain.

11. Apparatus of claim 6,

wherein the prediction analyzer (720) is configured to calculate a flattening gain and a shaping gain,

wherein the cascade of the two controllable sub-filters (809, 810) furthermore comprises a separate gain stage (811) or a gain function included in at least one of the two sub-filters for applying a gain derived from the flattening gain and/or the shaping gain, or

wherein the filter (740) having the combined characteristic is configured to apply a gain derived from the flattening gain and/or the shaping gain.

12. Apparatus of claim 5,

wherein the window comprises a Gaussian window having a time lag as a parameter.

13. Apparatus of one of the preceding claims,

wherein the prediction analyzer (720) is configured to calculate the prediction filter data for a plurality of frames so that the shaping filter (740) controlled by the prediction filter data performs a signal manipulation for a frame of the plurality of frames comprising a transient portion, and

so that the shaping filter (740) does not perform a signal manipulation or performs a signal manipulation being smaller than the signal manipulation for the frame for a further frame of the plurality of frames not comprising a transient portion.

14. Apparatus of one of the preceding claims,

wherein the spectrum-time converter (760) is configured to apply an overlap-add operation involving at least two adjacent frames of the spectral representation.

15. Apparatus of one of the preceding claims,

wherein the time-spectrum converter (700) is configured to apply a hop size between 3 and 8 ms or an analysis window having a window length between 6 and 16 ms, or

wherein the spectrum-time converter (760) is configured to use an overlap range corresponding to an overlap size of overlapping windows or corresponding to a hop size used by the converter between 3 and 8 ms, or to use a synthesis window having a window length between 6 and 16 ms, or wherein the analysis window and the synthesis window are identical to each other.

16. Apparatus of claim 2 or 3,

wherein the flattening filter characteristic (740a) is an inverse filter characteristic resulting, when applied to the spectral frame, in a modified spectral frame having a flatter temporal envelope compared to a temporal envelope of the spectral frame; or

wherein the shaping filter characteristic (740b) is a synthesis filter characteristic resulting, when applied to a spectral frame, in a modified spectral frame having a less flat temporal envelope compared to a temporal envelope of the spectral frame.

17. Apparatus of one of the preceding claims, wherein the prediction analyzer (720) is configured to calculate prediction filter data for a shaping filter characteristic (740b), and wherein the shaping filter (740) is configured to filter the spectral frame as obtained by the time-spectrum converter (700) e.g. without a preceding flattening.

18. Apparatus of one of the preceding claims, wherein the shaping filter (740) is configured to represent a shaping action in accordance with a time envelope of the spectral frame with a maximum or a less than maximum time resolution, and wherein the shaping filter (740) is configured to represent no flattening action or a flattening action in accordance with a time resolution being smaller than the time resolution associated with the shaping action.

19. Method for post-processing (20) an audio signal, comprising:

converting (700) the audio signal into a spectral representation comprising a sequence of spectral frames;

calculating (720) prediction filter data for a prediction over frequency within a spectral frame;

shaping (740), in response to the prediction filter data, the spectral frame to enhance a transient portion within the spectral frame; and

converting (760) a sequence of spectral frames comprising a shaped spectral frame into a time domain.

20. Computer program for performing, when running on a computer or a processor, the method of claim 19.
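The method of claim 19 can be sketched as follows. This is a minimal reading, not the patented implementation: the function names, parameter values (`n_fft`, `hop`, `order`) and the least-squares coefficient estimator are assumptions. With unmodified coefficients the synthesis (shaping) step exactly inverts the flattening step, so this sketch reconstructs its input; enhancing a transient portion (claims 13, 16-18) would apply modified coefficients in the shaping step.

```python
import numpy as np

def lpc_over_frequency(X, order):
    # Least-squares predictor of spectral bin X[k] from the preceding
    # `order` bins X[k-1] ... X[k-order] (prediction over frequency).
    A = np.column_stack([X[order - m:len(X) - m] for m in range(1, order + 1)])
    b = X[order:]
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    return a  # a[m-1] weights X[k-m]

def flatten(X, a):
    # FIR prediction-error filter over frequency: flattens the frame's
    # temporal envelope (cf. the flattening filter characteristic 740a).
    p = len(a)
    E = X.copy()
    for k in range(p, len(X)):
        E[k] = X[k] - np.dot(a, X[k - p:k][::-1])
    return E

def shape(E, a):
    # All-pole synthesis filter over frequency: restores (or, with
    # modified coefficients, enhances) the temporal envelope (cf. 740b).
    p = len(a)
    Y = E.copy()
    for k in range(p, len(E)):
        Y[k] = E[k] + np.dot(a, Y[k - p:k][::-1])
    return Y

def post_process(x, n_fft=256, hop=128, order=8):
    # STFT -> per-frame prediction over frequency -> flatten / shape
    # -> inverse STFT with (weighted) overlap-add, as in claim 19.
    win = np.hanning(n_fft)
    y = np.zeros(len(x))
    wsum = np.zeros(len(x))
    for s in range(0, len(x) - n_fft + 1, hop):
        X = np.fft.rfft(x[s:s + n_fft] * win)
        a = lpc_over_frequency(X, order)
        Y = shape(flatten(X, a), a)
        y[s:s + n_fft] += np.fft.irfft(Y, n_fft) * win
        wsum[s:s + n_fft] += win ** 2
    return y / np.maximum(wsum, 1e-8)
```

Because the all-pole shaping filter is the exact inverse of the FIR flattening filter when the same coefficients are used, the pipeline is transparent in this form; a transient-enhancing variant would detect frames containing a transient (claim 13) and strengthen the shaping there only.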

Documents

Application Documents

# Name Date
1 201937038279-IntimationOfGrant06-03-2024.pdf 2024-03-06
2 201937038279-PatentCertificate06-03-2024.pdf 2024-03-06
3 201937038279-Information under section 8(2) [27-01-2024(online)].pdf 2024-01-27
4 201937038279-FORM 3 [04-12-2023(online)].pdf 2023-12-04
5 201937038279-Information under section 8(2) [18-09-2023(online)].pdf 2023-09-18
6 201937038279-Information under section 8(2) [19-07-2023(online)].pdf 2023-07-19
7 201937038279-FORM 3 [01-06-2023(online)].pdf 2023-06-01
8 201937038279-Information under section 8(2) [03-04-2023(online)].pdf 2023-04-03
9 201937038279-Information under section 8(2) [14-02-2023(online)].pdf 2023-02-14
10 201937038279-Information under section 8(2) [19-01-2023(online)].pdf 2023-01-19
11 201937038279-FORM 3 [01-12-2022(online)].pdf 2022-12-01
12 201937038279-Information under section 8(2) [09-11-2022(online)].pdf 2022-11-09
13 201937038279-Information under section 8(2) [19-09-2022(online)].pdf 2022-09-19
14 201937038279-FORM 3 [01-06-2022(online)].pdf 2022-06-01
15 201937038279-Information under section 8(2) [12-03-2022(online)].pdf 2022-03-12
16 201937038279-FORM 3 [07-12-2021(online)].pdf 2021-12-07
17 201937038279-Information under section 8(2) [07-12-2021(online)].pdf 2021-12-07
18 201937038279-Information under section 8(2) [28-10-2021(online)].pdf 2021-10-28
19 201937038279-FER.pdf 2021-10-18
20 201937038279-ABSTRACT [11-10-2021(online)].pdf 2021-10-11
21 201937038279-CLAIMS [11-10-2021(online)].pdf 2021-10-11
22 201937038279-COMPLETE SPECIFICATION [11-10-2021(online)].pdf 2021-10-11
23 201937038279-DRAWING [11-10-2021(online)].pdf 2021-10-11
24 201937038279-FER_SER_REPLY [11-10-2021(online)].pdf 2021-10-11
25 201937038279-OTHERS [11-10-2021(online)].pdf 2021-10-11
26 201937038279-FORM 4(ii) [06-07-2021(online)].pdf 2021-07-06
27 201937038279-FORM 3 [01-06-2021(online)].pdf 2021-06-01
28 201937038279-Information under section 8(2) [01-06-2021(online)].pdf 2021-06-01
29 201937038279-Information under section 8(2) [08-02-2021(online)].pdf 2021-02-08
30 201937038279-Information under section 8(2) [25-01-2021(online)].pdf 2021-01-25
31 201937038279-Information under section 8(2) [21-08-2020(online)].pdf 2020-08-21
32 201937038279-Proof of Right [11-03-2020(online)].pdf 2020-03-11
33 201937038279-Information under section 8(2) [24-02-2020(online)].pdf 2020-02-24
34 201937038279-FORM-26 [30-11-2019(online)].pdf 2019-11-30
35 201937038279-AMMENDED DOCUMENTS [21-10-2019(online)].pdf 2019-10-21
36 201937038279-FORM 13 [21-10-2019(online)].pdf 2019-10-21
37 201937038279-MARKED COPIES OF AMENDEMENTS [21-10-2019(online)].pdf 2019-10-21
38 201937038279-FORM 18 [01-10-2019(online)].pdf 2019-10-01
39 201937038279-COMPLETE SPECIFICATION [23-09-2019(online)].pdf 2019-09-23
40 201937038279-DECLARATION OF INVENTORSHIP (FORM 5) [23-09-2019(online)].pdf 2019-09-23
41 201937038279-DRAWINGS [23-09-2019(online)].pdf 2019-09-23
42 201937038279-FIGURE OF ABSTRACT [23-09-2019(online)].pdf 2019-09-23
43 201937038279-FORM 1 [23-09-2019(online)].pdf 2019-09-23
44 201937038279-STATEMENT OF UNDERTAKING (FORM 3) [23-09-2019(online)].pdf 2019-09-23
45 201937038279.pdf 2019-09-23

Search Strategy

1 SS201937038279E_10-01-2021.pdf

ERegister / Renewals

3rd: 18 Apr 2024

From 29/03/2020 - To 29/03/2021

4th: 18 Apr 2024

From 29/03/2021 - To 29/03/2022

5th: 18 Apr 2024

From 29/03/2022 - To 29/03/2023

6th: 18 Apr 2024

From 29/03/2023 - To 29/03/2024

7th: 18 Apr 2024

From 29/03/2024 - To 29/03/2025

8th: 28 Feb 2025

From 29/03/2025 - To 29/03/2026