A Device And A Method For Processing An Information Signal
Abstract:
Processing of information signals separated according to modulation and carrier components in a more controlled way is made possible by a device for processing an information signal (14) including means (20) for converting the information signal (14) to a time/spectral representation by block-wise transforming of the information signal and means (22) for converting the information signal from the time/spectral representation to a spectral/modulation spectral representation, wherein the means (22) for converting is designed such that the spectral/modulation spectral representation depends on both a magnitude component and a phase component of the time/spectral representation of the information signal (14). A means (24, 40) then performs a manipulation and/or modification of the information signal (14) in the spectral/modulation spectral representation to obtain a modified spectral/modulation spectral representation. A further means (26) finally forms a processed information signal (18) representing a processed version of the information signal (14) based on the modified spectral/modulation spectral representation.
Information signal processing by modification in the
spectral/modulation spectral range representation
Description
The present invention generally relates to the processing
of information signals, such as audio signals, video
signals or other multimedia signals, and particularly to
the processing of information signals in the
spectral/modulation spectral range.
In the field of signal processing, such as the processing
of digital audio signals, there are frequently signals
consisting of a carrier signal component and a modulation
component. In the case of modulated signals, a
representation in which the signals are decomposed into
carrier and modulation components is often required, for
example to be able to filter, code or otherwise modify
them.
For the purposes of audio coding, it is known, for example,
to subject the audio signal to a so-called modulation
transform. Here, the audio signal is decomposed into
frequency bands by a transform. Subsequently, a
decomposition into magnitude and phase is performed. While
the phase is not processed any further, the magnitudes per
subband are re-transformed via a number of transform blocks
in a second transform. The result is a frequency
decomposition of the time envelope of the respective
subband into modulation coefficients. Audio codings
consisting of such a modulation transform are, for example,
described in M. Vinton and L. Atlas, "A Scalable and
Progressive Audio Codec", in Proceedings of the 2001 IEEE
ICASSP, 7-11 May 2001, Salt Lake City, in United States
Patent Application US 2002/0176353A1: Atlas et al.,
"Scalable And Perceptually Ranked Signal Coding And
Decoding", 11/28/2002, and J. Thompson and L. Atlas, "A
Non-uniform Modulation Transform for Audio Coding with
Increased Time Resolution", in Proceedings of the 2003 IEEE
ICASSP, 6-10 April, Hong Kong, 2003.
An overview of further various demodulation techniques
across the full bandwidth of the signal to be demodulated
including asynchronous and synchronous demodulation
techniques, etc. is given, for example, by the article L.
Atlas, "Joint Acoustic And Modulation Frequency", EURASIP
Journal on Applied Signal Processing 7, pp. 668-675, 2003.
A disadvantage of the above schemes for audio coding using
a modulation transform is the following. As long as no
further processing steps are performed on the modulation
coefficients together with the phases, the modulation
coefficients form a spectral/modulation spectral
representation of the audio signal that is reversible and
perfectly reconstructing, i.e. it is re-convertible without
changes back into the original audio signal in the time
domain. However, in these methods the modulation
coefficients are filtered to reduce and/or quantize the
modulation coefficients to values as small as possible
according to psychoacoustic criteria, so that a maximum
compression rate is achieved. However, this generally does
not accomplish the desired goal to remove the respective
modulation components from the resulting signal or to
deliberately introduce quantization noise in this
component. This is due to the fact that, after the back-
transform of the changed modulation coefficients, the
phases of the subbands are no longer consistent with the
changed magnitudes of these subbands and continue to
contain strong components of the modulation component of
the original signal. If the phases of the subbands are now
recombined with the changed magnitudes, these modulation
components are reintroduced into the filtered or quantized
signal by the phase. In other words, a modulation transform
followed by a modification of the modulation coefficients
in the above manner, i.e. by filtering the modulation
coefficients, together with a subsequent synthesis of the
phase and magnitude components provides a signal that, in
another analysis and/or modulation transform, still
contains significant modulation components at those places
in the spectral/modulation spectral range representation
that should have been filtered out. Effective filtering is
thus not possible based on the above-mentioned modulation
transform-based signal processing schemes.
Therefore, there is a need for an information signal
processing scheme allowing to process modulated signals
with a carrier component and a modulation component
separated according to modulation and carrier component in
a more controlled way.
It is thus the object of the present invention to provide a
processing scheme for information signals allowing
processing of information signals that is separated
according to modulation and carrier components in a more
controlled way.
This object is achieved by a device according to claim 1
and a method according to claim 17.
An inventive device for processing an information signal
includes means for converting the information signal into a
time/spectral representation by block-wise transforming the
information signal and means for converting the information
signal from the time/spectral representation to a
spectral/modulation spectral representation, wherein the
means for converting is designed such that the
spectral/modulation spectral representation depends on both
a magnitude component and a phase component of the
time/spectral representation of the information signal. A
means then performs a manipulation and/or modification of
the information signal in the spectral/modulation spectral
representation to obtain a modified spectral/modulation
spectral representation. A further means finally forms a
processed information signal representing a processed
version of the information signal based on the modified
spectral/modulation spectral representation.
The core idea of the present invention is that processing
of information signals that is separated more rigorously
according to modulation and carrier components may be
achieved if the conversion of the information signal from
the time/spectral representation and/or the time/frequency
representation into the spectral/modulation spectral
representation and/or the frequency/modulation frequency
representation is performed depending on both a magnitude
component and a phase component of the time/spectral
representation of the information signal. This eliminates a
recombination between phase and magnitude and thus the
reintroduction of undesired modulation components into the
time representation of the processed information signal on
the synthesis side.
The conversion of the information signal from the
time/spectral representation to the spectral/modulation
spectral representation considering both the magnitude and
the phase involves the problem that the time/spectral
representation of the information signal actually depends
not only on the information signal, but also on the phase
offset of the time blocks with respect to the carrier
spectral component of the information signal. In other
words, the block-wise transform of the information signal
from the time representation to the time/spectral
representation causes the sequences of spectral values
obtained in the time/spectral representation of the
information signal per spectral component to comprise an
up-modulated complex carrier depending only on the
asynchronism of the block repeating frequency with respect
to the carrier frequency component of the information
signal. According to the embodiments of the present
invention, a demodulation of the sequence of spectral
values in the time/spectral representation of the
information signal is thus performed per spectral component
to obtain a demodulated sequence of spectral values per
spectral component. The subsequent conversion of the thus
obtained demodulated sequences of spectral values is
performed by block-wise transform of the time/spectral
representation into the spectral/modulation spectral
representation and/or by their block-wise spectral
decomposition, thereby obtaining blocks of modulation
values. These are manipulated and/or modified, for example
weighted with a corresponding weighting function for
bandpass filtering for the removal of the modulation
component from the original information signal. The result
is a modified demodulated sequence of spectral values
and/or a modified demodulated time/spectral representation.
The complex carrier is again modulated upon the thus
obtained modified demodulated sequences of spectral values,
thus obtaining a modified sequence of spectral values
representing a part of a time/spectral representation of
the processed information signal. A back-conversion of this
representation into the time representation yields a
processed information signal in the time representation
and/or time domain, which may be changed in a highly
accurate way with respect to the original information
signal regarding modulation and carrier components.
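The chain described above for a single subband can be
sketched as follows. This is only an illustrative sketch and
not the implementation of the embodiment: the function names,
the naive DFT, the assumption that the carrier advances by a
known constant phase step per spectral value, and the omission
of the windowing of the spectral value blocks are
simplifications introduced here.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform of a list of complex numbers."""
    size = len(values)
    return [sum(values[k] * cmath.exp(-2j * math.pi * m * k / size)
                for k in range(size))
            for m in range(size)]


def idft(values):
    """Inverse of dft()."""
    size = len(values)
    return [sum(values[m] * cmath.exp(2j * math.pi * m * k / size)
                for m in range(size)) / size
            for k in range(size)]


def process_subband_block(values, carrier_step, weights):
    """Demodulate, transform, weight, transform back, modulate again.

    values       -- complex spectral values of one subband block
    carrier_step -- assumed known phase advance (radians) per value
    weights      -- one weight per modulation value (DFT bin order)
    """
    carrier = [cmath.exp(1j * carrier_step * n) for n in range(len(values))]
    # Demodulation: multiply by the complex conjugate of the carrier.
    demodulated = [v * c.conjugate() for v, c in zip(values, carrier)]
    # Spectral decomposition into modulation values, then weighting.
    modulation = [w * m for w, m in zip(weights, dft(demodulated))]
    # Back-transform and modulate the same carrier onto the result.
    return [b * c for b, c in zip(idft(modulation), carrier)]


# Hypothetical test data: a rotating carrier with a slow real envelope.
theta = 0.7
values = [cmath.exp(1j * theta * n) * (1 + 0.5 * math.cos(2 * math.pi * n / 8))
          for n in range(8)]
keep_dc = [1.0] + [0.0] * 7          # keep only modulation frequency zero
carrier_only = process_subband_block(values, theta, keep_dc)
unchanged = process_subband_block(values, theta, [1.0] * 8)
```

With all weights equal to one the block is returned unchanged.
With only the weight at modulation frequency zero kept, the
envelope is removed and the rotating carrier of constant
magnitude remains.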
Preferred embodiments of the present invention will be
explained below in more detail referring to the
accompanying drawings, in which:
Fig. 1 shows a block circuit diagram of a device for
processing an information signal according to an
embodiment of the present invention; and
Fig. 2 shows a schematic for illustrating the operation
of the device of Fig. 1.
Fig. 1 shows a device for processing an information signal
according to an embodiment of the present invention. The
device of Fig. 1, generally indicated at 10, includes an
input 12, at which it receives the information signal 14 to
be processed. The device of Fig. 1 is exemplarily provided
to process the information signal 14 such that the
modulation component is removed from the information signal
14, and to thus obtain a processed information signal with
only the carrier component. Furthermore, the device 10
includes an output 16 to output the carrier component as
the processing result and/or the processed information
signal 18.
Internally, the device 10 is essentially divided into a
portion 20 for converting the information signal 14 from a
time representation to a time/frequency representation,
means 22 for converting the information signal from the
time/frequency representation to the frequency/modulation
frequency representation, a portion 24 in which the actual
processing is performed, i.e. the modification of the
information signal, and a portion 26 for the back-
conversion of the information signal processed in the
frequency/modulation frequency representation from this
representation to the time representation. The mentioned
four portions are connected in series between the input 12
and the output 16 in this order, wherein their more
detailed structure and their more detailed operation will
be described below.
Portion 20 of the device 10 includes a windowing means 28
and a transform means 30 that follow at the input 12 in
this order. In particular, an input of the windowing means
28 is connected to input 12 to receive the information
signal 14 as a sequence of information values. If the
information signal is still present as an analog signal, it
may, for example, be converted to a sequence of information
and/or sample values by an A/D converter and/or discrete
sampling. The windowing means 28 forms blocks of the same
number of information values each from the sequence of
information values and additionally performs a weighting
with a weighting function on each block of information
values which may, for example but not exclusively,
correspond to a sine window or a KBD window. The blocks may
overlap, such as by 50%, or not. Merely as an example, a
50% overlap is assumed in the following. The preferred
window functions have the property that they allow good
subband separation in the time/spectral representation and
that the squares of their weighting values, which
correspond to each other as they are applied to one and the
same information value, add to one in the overlap area.
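The stated window property can be checked numerically for the
sine window with a 50% overlap. The helper name `sine_window`
and the half-sample offset in its definition are assumptions
of this sketch, chosen as one common form of the sine window.

```python
import math


def sine_window(length):
    """Sine window with a half-sample offset (one common definition)."""
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]


N = 8                                  # hop size; blocks hold 2N values
g = sine_window(2 * N)
# A value at position i of one block sits at position i + N of the
# preceding block, so these two weights meet at the same value.
overlap_sums = [g[i] ** 2 + g[i + N] ** 2 for i in range(N)]
```

Every entry of `overlap_sums` equals one up to rounding,
because the two weights are the sine and the cosine of the same
angle.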
An output of the windowing means 28 is connected to an
input of the transform means 30. The blocks of information
values output by the windowing means 28 are received by the
transform means 30. The transform means 30 then subjects
them block-wise to a spectrally decomposing transform, such
as a DFT or another complex transform. The transform means
30 thus block-wise achieves a decomposition of the
information signal 14 into spectral components and thus
particularly generates a block of spectral values including
one spectral value per spectral component per time block,
as it is received from the windowing means 28. Several
spectral values may be combined to subbands. In the
following, however, the terms subband and spectral
component are used as synonyms. For each spectral component
and/or each subband, the result is thus one spectral value
or several ones, if there is a subband combination, which,
however, is not assumed in the following, per time block.
Accordingly, the transform means 30 outputs a sequence of
spectral values per spectral component and/or subband that
represent the course in time of this spectral component
and/or this subband. The spectral values output by the
transform means 30 represent a time/frequency
representation of the information signal 14.
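A minimal sketch of this block-wise decomposition is given
below. It assumes a sine window, a 50% overlap and a naive DFT.
For a real input it keeps only the first half of the DFT bins,
dropping the Nyquist bin. All function names are hypothetical.

```python
import cmath
import math


def sine_window(length):
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]


def dft(block):
    size = len(block)
    return [sum(block[k] * cmath.exp(-2j * math.pi * b * k / size)
                for k in range(size))
            for b in range(size)]


def time_frequency(samples, hop):
    """Return one row of complex spectral values per time block."""
    length = 2 * hop                       # 50% overlap
    window = sine_window(length)
    rows = []
    for start in range(0, len(samples) - length + 1, hop):
        block = [samples[start + i] * window[i] for i in range(length)]
        rows.append(dft(block)[:hop])      # real input: keep half the bins
    return rows


def subband_sequence(rows, b):
    """Course in time of spectral component b (one value per block)."""
    return [row[b] for row in rows]


# Hypothetical input: a cosine centred on spectral component 2.
samples = [math.cos(2 * math.pi * 2 * i / 16) for i in range(64)]
rows = time_frequency(samples, hop=8)
sequence = subband_sequence(rows, 2)
```

Each row holds the spectral values of one time block. Reading
one column across the rows gives the sequence of spectral
values of one subband.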
Portion 22 includes a carrier frequency determination means
32, a mixer 34 serving as demodulation means, a windowing
means 36 and a second transform means 38.
The windowing means 36 includes an input connected to the
output of the transform means 30. There it receives the
spectral value sequences for the individual subbands and
divides the spectral value sequences per subband into blocks,
similarly to the windowing means 28 with respect to the
information signal 14, and weights the
spectral values of each block with an appropriate weighting
function. The weighting function may be one of the
weighting functions already exemplarily mentioned above
with respect to means 28. The consecutive blocks in a
subband may or may not overlap, wherein the following again
exemplarily assumes a mutual overlap of 50%. The following
assumes that the blocks of different subbands are aligned
with respect to each other, as it will be explained in more
detail below with respect to Fig. 2. However, another
procedure with block sequences offset between the subbands
would also be conceivable. At the output, the windowing
means outputs sequences of windowed spectral value blocks
per subband.
The carrier frequency determination means 32 also includes
an input connected to the output of the transform means 30
to obtain the spectral values of the subbands and/or
spectral components as sequences of spectral values per
subband. It is provided to find out, in each subband, the
carrier component caused by the individual time blocks,
from which the individual spectral values of the subbands
have been derived, comprising a phase offset varying in
time with respect to the carrier frequency component of the
information signal 14. The carrier frequency determination
means 32 outputs the carrier component determined per
subband at its output to an input of the mixer 34 which, in
turn, has another input connected to the output of the
windowing means 36.
The mixer 34 is designed such that it multiplies, per
subband, the blocks of windowed spectral values, as they
are output by the windowing means 36, by the complex conjugate
of the respective carrier component, as it has been
determined by the carrier frequency determination means 32
for the respective subband, thus demodulating the subbands
and/or blocks of windowed spectral values.
At the output of the mixer 34, the result is thus
demodulated subbands and/or the result is a sequence of
demodulated blocks of windowed spectral values per subband.
The output of the mixer 34 is connected to an input of the
transform means 38, so that the latter receives blocks of
windowed and demodulated spectral values overlapping each
other - here by exemplary 50% - per subband and transforms
and/or spectrally decomposes them block-wise into the
spectral/modulation spectral representation to generate a
frequency/modulation frequency representation of the
information signal 14 up to now only modified with respect
to the demodulation of the subband spectral value sequences
by processing all subbands and/or spectral components. The
transform on which the transform means 38 is based per
subband may be, for example, a DFT, an MDCT, an MDST or the
like, and particularly also the same transform as that of
transform means 30. Fig. 1 exemplarily assumes that the
transform of each of the transform means 30, 38 is a DFT.
Accordingly, the transform means 38 successively outputs
blocks of values, referred to as modulation values in the
following and representing a spectral decomposition of the
blocks of windowed and demodulated spectral values, at its
output for each subband and/or each spectral component. The
blocks of spectral values per subband, with respect to
which the transform means 38 performs the transforms, are
time-aligned with each other, so that the result per time
period is always immediately a matrix of modulation values
composed of a modulation value block per subband. The
transform means 38 passes the modulation values on to the
portion 24, which only comprises a signal processing means
40.
The signal processing means 40 is connected to the output
of the transform means 38 and thus receives the blocks of
modulation values. In the present exemplary case, because
the device 10 serves for modulation component suppression,
the signal processing means 40 performs an effective low-
pass filtering in the frequency domain on the incoming
blocks of modulation values, i.e. a weighting of the
modulation values with a function dropping to higher and/or
lower modulation frequencies starting from the modulation
frequency zero. The thus modified blocks of modulation
values are passed to the back-conversion portion 26 by the
signal processing means 40. The modified blocks of
modulation values output by the signal processing means 40
represent a modified frequency/modulation frequency
representation of the information signal 14, or in other
words a frequency/modulation frequency representation still
differing from the frequency/modulation frequency
representation of the modified information signal 18 by the
demodulation by the mixer 34.
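One possible form of such a weighting is sketched below,
assuming the modulation values of a block are in DFT order, so
that index m and index size - m belong to the positive and
negative modulation frequency of the same magnitude. The
linear taper and the `cutoff` parameter are assumptions of this
sketch. The text only requires a function that drops away from
modulation frequency zero.

```python
def lowpass_weights(size, cutoff):
    """Weights falling linearly from 1 at modulation frequency zero
    to 0 at a distance of `cutoff` bins, symmetric in +/- frequency."""
    weights = []
    for m in range(size):
        distance = min(m, size - m)    # |modulation frequency| in bins
        weights.append(max(0.0, 1.0 - distance / cutoff))
    return weights


def apply_weights(modulation_values, weights):
    """Weight one block of modulation values element by element."""
    return [w * v for w, v in zip(weights, modulation_values)]


weights = lowpass_weights(8, 2)
filtered = apply_weights([1 + 1j] * 8, weights)
```

For a block of eight modulation values and a cutoff of two
bins, only the value at modulation frequency zero and, at half
weight, its two neighbours at plus and minus one bin survive.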
The back-conversion portion 26, in turn, is divided into
two portions, i.e. a portion for the conversion of the
processed information signal 18 from the
frequency/modulation frequency representation, as output by
the signal processing means 40, to the time/frequency
representation, and a portion for the back-conversion of
the processed information signal from the time/frequency
representation to the time representation. The former of
the two portions includes transform means 42 for performing
a block-wise transform inverse to the transform according
to the transform means 38, a mixer 46 and a combination
means 44. The latter portion of the back-conversion portion
26 includes transform means 48 for performing a block-wise
transform inverse to the transform of the transform means
30 and a combination means 50.
With its input, the inverse transform means 42 is connected
to the output of the signal processing means 40 and
transforms the modified blocks of modulation values
subband-wise from the spectral representation back to the
time/frequency representation and thus reverses the
spectral decomposition to obtain a sequence of modified
blocks of spectral values per subband. These modified
spectral value blocks output by the inverse transform means
42 differ from the spectral value blocks as output by the
windowing means 36, but not only by the processing by the
signal processing means 40, but also by the demodulation
effected by the mixer 34. Therefore, the mixer 46 receives
the sequences of modified spectral value blocks output by
the inverse transform means 42 per subband and mixes them
with a complex carrier, which is complex conjugate with
respect to that used at the corresponding place and/or for
the corresponding block for the demodulation of the
information signal at the mixer 34, to modulate the
spectral value blocks again with the carrier caused by the
phase offsets of the time blocks. The result yielded at the
output of the mixer 46 is a sequence of modified, non-demodulated
spectral value blocks per subband.
The output of the mixer 46 is connected to an input of the
combination means 44. It combines, per subband, the
sequence of modified blocks of spectral values again up-
modulated with the complex carrier to form a uniform stream
and/or a uniform sequence of spectral values by
appropriately linking mutually corresponding spectral
values of adjacent and/or consecutive blocks of spectral
values for a subband, as they are received from the mixer
46. In the case of the use of weighting functions
exemplarily mentioned above with the positive property that
the squares of mutually corresponding weighting values are
summed to one in the case of overlapping, the combination
consists in a simple addition of spectral values associated
with each other. The result output at the output of the
combination means 44 (OLA = overlap add) is composed of a
modified sequence of spectral values per subband. The
result output at the output of the OLA 44 thus consists of
modified subbands and/or modified sequences of spectral
values for all spectral components and represents a
modified time/frequency representation of the information
signal 14 and/or a time/frequency representation of the
modified information signal 18.
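The overlap-add combination can be sketched as follows. The
sketch assumes that each block is weighted a second time with
the same window before the addition, which is what makes the
squares of the weights, rather than the weights themselves, sum
to one. This second weighting is an assumption here and is not
stated in the text. The function names are hypothetical.

```python
import math


def sine_window(length):
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]


def split_blocks(values, hop):
    """Windowed blocks of 2 * hop values with a 50% overlap."""
    length = 2 * hop
    window = sine_window(length)
    return [[values[start + i] * window[i] for i in range(length)]
            for start in range(0, len(values) - length + 1, hop)]


def overlap_add(blocks, hop):
    """Weight each block again (assumed) and add overlapping values."""
    length = 2 * hop
    window = sine_window(length)
    out = [0.0] * (hop * (len(blocks) + 1))
    for n, block in enumerate(blocks):
        for i in range(length):
            out[n * hop + i] += block[i] * window[i]
    return out


# Hypothetical subband sequence of complex spectral values.
values = [complex(i, -i) for i in range(32)]
restored = overlap_add(split_blocks(values, hop=4), hop=4)
```

If the blocks are not modified in between, every value covered
by two blocks is restored exactly. Only the first and the last
half block, which are covered by a single block, are not.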
The transform means 48 receives the spectral value
sequences and thus particularly one after the other always
one spectral value for all subbands and/or spectral
components and/or one after the other one spectral
decomposition of a portion of the modified information
signal 18. By reversing the spectral decomposition, it
generates a sequence of modified time blocks from the
sequence of spectral decompositions. These modified time
blocks are, in turn, received by the combination means 50.
The combination means 50 operates similarly to the
combination means 44. It combines the modified time blocks
exemplarily overlapping by 50% by adding mutually
corresponding information values from adjacent and/or
consecutive modified time blocks. The result at the output
of the combination means 50 is thus a sequence of
information values representing the processed information
signal 18.
The structure of the device 10 and the operation of the
individual components having been described above, the
following will discuss their operation in more detail with
respect to Figs. 1 and 2.
The processing of the information signal by the device 10
starts with the reception of the audio signal 14 at the
input 12. The information signal 14 is present in a sampled
form. The sampling has been done, for example, by means of
an analog/digital converter. The sampling has been done
with a certain sampling frequency ωs. The information
signal 14 consequently reaches the input 12 as a sequence
of sample and/or information values $s_i = s(2\pi/\omega_s \cdot i)$, wherein
$s$ is the analog information signal, $s_i$ are the information
values, and the index $i$ is an index for the information
values. Among the incoming samples $s_i$, the windowing means
28 always combines $2N$ consecutive samples to form time
blocks, in the present example with a 50% overlap. For
example, it combines the samples $s_0$ to $s_{2N-1}$ to form a time
block with the index $n = 0$, the samples $s_N$ to $s_{3N-1}$ to form
a second time block with the index $n = 1$, the samples $s_{2N}$
to $s_{4N-1}$ to form a third time block of information values
with the index $n = 2$, etc. The windowing means 28 weights
each of these blocks with a window and/or weighting
function, as described above. Let $s^n_0$ to $s^n_{2N-1}$ be, for
example, the $2N$ information values of the time block $n$,
then the block output by the means 28 is finally yielded as
$s^n_0 \to s^n_0 \cdot g_0$ to $s^n_{2N-1} \to s^n_{2N-1} \cdot g_{2N-1}$, wherein $g_i$ with $i = 0$ to
$2N-1$ is the weighting function.
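Written compactly, and assuming the 50% overlap of the
example, the block formation and weighting described in this
paragraph amount to the following. The tilde marking the
weighted value is notation introduced here for illustration
only.

```latex
s^{n}_{i} = s_{nN+i}, \qquad
\tilde{s}^{n}_{i} = s^{n}_{i} \, g_{i}, \qquad
i = 0, \dots, 2N-1, \quad n = 0, 1, 2, \dots
```

For n = 1 this gives the samples with indices N to 3N-1, in
agreement with the second time block named in the text.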
Fig. 2 shows the windowing functions applied to the
information values si exemplarily for four consecutive time
blocks n = 0, 1, 2, 3 in a diagram 70, in which the time t
is plotted along the x-axis in arbitrary units, and the
amplitude of the windowing functions is plotted along the
y-axis in arbitrary units. In this way, the windowing means
28 passes a new windowed time block of 2N information
values each to the transform means 30 after always N
information values. The repetition frequency of the time
blocks is thus ωs/N.
The transform means 30 transforms the windowed time blocks
to a spectral representation. The transform means 30
performs a spectral decomposition of the time blocks of
windowed information values into a plurality of
predetermined subbands and/or spectral components. The
present case exemplarily assumes that the transform is a
DFT and/or discrete Fourier transform. For each time block
of 2N information values, the transform means 30 generates
N complex-valued spectral values for N spectral components,
if the information signal is real, in this exemplary case.
The complex spectral values output by the transform means
30 represent the time/frequency representation 74 of the
information signal. The complex spectral values are
illustrated by boxes 76 in Fig. 2. As the transform means
30 generates at least one spectral value per consecutive
time block of information values per subband and/or
spectral component, the transform means 30 thus outputs a
sequence of spectral values 76 per subband and/or spectral
component at the frequency ωs/N. The spectral values output
for a time block are illustrated horizontally located along
the frequency axis 78 at 74 in Fig. 2. The spectral values
output for a subsequent time block follow directly below in
a vertical direction along the axis 80. The axes 78 and 80
thus represent the frequency and/or time axis of the
time/frequency representation of the information signal 14.
Exemplarily, Fig. 2 only shows four subbands. The sequences
of spectral values per subband run along the columns in the
exemplary representation of Fig. 2 and are illustrated by
82a, 82b, 82c and 82d.
Reference is briefly made to Fig. 1 again, where the
information signal 14 is exemplarily illustrated as a
function representable by $\sin(\beta t) \cdot (1 + \mu \sin(\alpha t))$, wherein α
is, for example, the modulation frequency of the envelope
of the information signal 14 indicated by the dashed line
84, while β represents the carrier frequency of the
information signal 14, t is the time, and µ is the
modulation depth. With a sufficiently high sampling
frequency ωs, the result for this exemplary information
signal by the transform 72 per time block is a block of
spectral values 76, i.e. a row at 74, in which mainly the
spectral component and/or the pertinent spectral value has
a distinct maximum at the carrier frequency β. However, the
spectral values for this spectral component f = β vary in
time for consecutive time blocks due to the variation of
the envelope 84. Accordingly, the magnitude of the spectral
values of the spectral component β varies with the
modulation frequency α.
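This behaviour can be reproduced with a small numerical
sketch. The parameter values are chosen here purely for
illustration: the carrier lies exactly on spectral component 4
of a 16-point DFT, the hop of 8 samples is a whole number of
carrier periods, so that no phase offset between blocks arises
yet, and the envelope has a period of 128 samples.

```python
import cmath
import math


def carrier_bin_magnitudes(depth, hop=8, carrier_bin=4,
                           envelope_period=128, total=256):
    """Magnitude of the carrier spectral component for each time block."""
    length = 2 * hop
    window = [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]
    # Sampled version of sin(beta t) * (1 + mu sin(alpha t)).
    signal = [math.sin(2 * math.pi * carrier_bin * i / length)
              * (1 + depth * math.sin(2 * math.pi * i / envelope_period))
              for i in range(total)]
    magnitudes = []
    for start in range(0, total - length + 1, hop):
        # Single DFT coefficient of the windowed block at the carrier bin.
        value = sum(signal[start + k] * window[k]
                    * cmath.exp(-2j * math.pi * carrier_bin * k / length)
                    for k in range(length))
        magnitudes.append(abs(value))
    return magnitudes


modulated = carrier_bin_magnitudes(depth=0.5)
unmodulated = carrier_bin_magnitudes(depth=0.0)
```

Without modulation the magnitude is the same for every block.
With a modulation depth of 0.5 it rises and falls with the
envelope from block to block.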
Up to here, the discussion has not taken into account that
the various time blocks may each have a different phase
offset with respect to the carrier frequency β due to a
frequency mismatch between the time block repeating
frequency ωs/N and the carrier frequency of the information
signal 14. Depending on the phase offset, the spectral
values of the spectral blocks resulting from the time
blocks in transform 72 are modulated with a carrier $e^{jf\Delta\varphi}$,
wherein j represents the imaginary unit, f represents the
frequency, and ∆φ represents the phase offset of the
respective time block. For an essentially equal carrier
frequency, as is the case in the present exemplary case,
the phase offset ∆φ increases linearly. Therefore, the
spectral values of a subband experience, due to a frequency
mismatch between the time block repeating frequency and the
carrier frequency, a modulation with a carrier component
depending on the mismatch of the two frequencies.
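The linear growth of the phase offset can be illustrated with
the following sketch. As simplifying assumptions it uses a
complex exponential instead of a real carrier and no window, so
that the phase step between consecutive blocks is exact. The
carrier frequency of 3.3 spectral components is chosen
arbitrarily so that it does not match the block repetition
frequency.

```python
import cmath
import math


def block_bin(samples, start, length, b):
    """DFT coefficient b of the block beginning at sample `start`."""
    return sum(samples[start + k] * cmath.exp(-2j * math.pi * b * k / length)
               for k in range(length))


N = 8                                         # hop; blocks hold 2N samples
omega = 2 * math.pi * 3.3 / (2 * N)           # carrier, radians per sample
samples = [cmath.exp(1j * omega * i) for i in range(10 * N)]

# Spectral values of component 3 for eight consecutive time blocks.
values = [block_bin(samples, n * N, 2 * N, 3) for n in range(8)]
# Phase step from one block to the next, wrapped to (-pi, pi].
steps = [cmath.phase(values[n + 1] / values[n]) for n in range(7)]
# Phase advance of the carrier over one hop of N samples, wrapped.
expected = cmath.phase(cmath.exp(1j * omega * N))
```

All entries of `steps` agree with `expected`. The sequence of
spectral values of this subband therefore rotates like a
complex carrier whose frequency is set by the mismatch between
the carrier frequency and the block repetition frequency.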
Taking this into account, the carrier frequency
determination means 32 now derives the carrier component in
the subbands resulting by the phase offset of the time
blocks and/or effected by the time block phase offset from
the spectral values a(ωb,n), wherein ωb is the angular
frequency ω and/or frequency f (ω=2πf) of the respective
subband 0≤b