Abstract: The invention provides an information encoder for encoding an in formation signal (IS), the information encoder (1) comprising: an analyzer (2) for analyzing the information signal (IS) in order to obtain linear prediction coefficients of a predictive polynomial A(z); a converter (3) for converting the linear prediction coefficients of the predictive polynomial A(z) to frequency values fi...f„ of a spectral frequency representation of the predictive polynomial A(z), wherein the converter (3) is configured to determine the frequency values fi...f„ by analyzing a pair of polynomials P(z) and Q(z) being defined as P(z) = A(z) + z "A 1) and Q(z) = A(z) - z "A 1), wherein m is an order of the predictive polynomial A(z) and I is greater or equal to zero, wherein the converter (3) is configured to obtain the frequency val ues (fi...f„) by establishing a strictly real spectrum (RES) derived from P(z) and a strictly imaginary spectrum (IES) from Q(z) and by identifying zeros of the strictly real spectrum (RES) derived from P(z) and the strictly imaginary spectrum (IES)derived from Q(z); a quantizer (4) for obtaining quantized frequency (f..i..f)val.ues from the frequency values (fi...£,); and a bitstream producer (5) for producing a bitstream comprising the quantized frequency values (fqi...fq„).
The most frequently used paradigm in speech coding is Algebraic Code Ex
cited Linear Prediction (ACELP), which is used in standards such as the
AMR-family, G.71 8 and MPEG USAC [1-3]. It is based on modelling speech
using a source model, consisting of a linear predictor (LP) to model the spec
tral envelope, a long time predictor (LTP) to model the fundamental frequen
cy and an algebraic codebook for the residual.
The coefficients of the linear predictive model are very sensitive to quantiza
tion, whereby usually, they are first transformed to Line Spectral Frequencies
(LSFs) or Imittance Spectral Frequencies (ISFs) before quantization. The
LSF/ISF domains are robust to quantization errors and in these domains; the
stability of the predictor can be readily preserved, whereby it offers a suitable
domain for quantization [4].
The LSFs/ISFs, in the following referred to as frequency values, can be ob
tained from a linear predictive polynomial A(z) of order m as follows. The Line
Spectrum Pair polynomials are defined as
P(z) = A(z) + z~ ~ A(z~ 1 )
Q(z) = A(z) - ~ m - A(z~ 1) (1)
where i = 1 for the Line Spectrum Pair and = 0 for the Imittance Spectrum
Pair representation, but any I > 0 is in principle valid. In the following it thus
will be assumed only that > 0 .
Note that the original predictor can always be reconstructed using A(z) = 1/2
[P(z)+Q(z)]. The polynomials P(z) and Q(z) thus contain all the information of
A(z)
The central property of LSP/ISP polynomials is that if and only if A(z) has all
its roots inside the unit circle, then the roots of P(z) and Q(z) are interlaced
on the unit circle. Since the roots of P(z) and Q(z) are on the unit circle, they
can be represented by their angles only. These angles correspond to fre¬
quencies and since the spectra of P(z) and Q(z) have vertical lines in their
logarithmic magnitude spectra at frequencies corresponding to the roots, the
roots are referred to as frequency values.
It follows that the frequency values, encode all information of the predictor
A(z). Moreover, it has been found that frequency values are robust to quanti¬
zation errors such that a small error in one of the frequency values produces
a small error in spectrum of the reconstructed predictor which is localized, in
the spectrum, near the corresponding frequency. Due to these favorable
properties, quantization in the LSF or ISF domains is used in all main-stream
speech codecs [1-3].
One of the challenges in using frequency values is, however, finding their
locations efficiently from the coefficients of the polynomials P(z) and Q(z).
After all, finding the roots of polynomials is a classic and difficult problem.
The previously proposed methods for this task include the following ap¬
proaches:
One of the early approaches uses the fact that zeros reside on the unit
circle, whereby they appear as zeros in the magnitude spectrum [5]. By
taking the discrete Fourier transform of the coefficients of P(z) and Q(z),
one can thus search for valleys in the magnitude spectrum. Each valley
indicates the location of a root and if the spectrum is upsampled sufficiently,
one can find a l roots. This method however yields on y an ap
proximate position, since it is difficult to determine the exact position
from the valley location.
The most frequently used approach is based on Chebyshev polynomials
and was presented in [6]. It relies on the realization that the polynomials
P (z) and Q(z) are symmetric and antisymmetric, respectively, whereby
they contain plenty of redundant information. By removing trivial zeros
at z = ± 1 and with the substitution x = z + z 1 (which is known as the
Chebyshev transform), the polynomials can be transformed to an alter
native representation FP (x) and FQ(x). These polynomials are half the
order of P(z) and Q(z) and they have only real roots on the range -2 to
+2. Note that the polynomials FP(x) and FQ(x) are real-valued when x is
real. Moreover, since the roots are simple, FP(x) and FQ(x) will have a
zero-crossing at each of their roots.
In speech codecs such as the AMR-WB, this approach is applied such
that the polynomials FP(x) and FQ(x) are evaluated on a fixed grid on
the real axis to find all zero-crossings. The root locations are further re
fined by linear interpolation around the zero-crossing. The advantage of
this approach is the reduced complexity due to omission of redundant
coefficients.
While the above described methods work sufficiently in existing codecs, they
do have a number of problems.
The problem to be solved is to provide an improved concept for encoding of
information.
In a first aspect the problem is solved by an information encoder for encoding
an information signal. The information encoder comprises:
an analyzer for analyzing the information signal in order to obtain linear pre
diction coefficients of a predictive poiynomiai A(z);
a converter for converting the linear prediction coefficients of the predictive
polynomial A(z) to frequency values of a spectral frequency representation of
the predictive polynomial A(z), wherein the converter is configured to deter
mine the frequency values by analyzing a pair of polynomials P(z) and Q(z)
being defined as
P(z) = A(z) + z_m_i A(z _ ) and
Q(z) = A(z) - z- m - A(z- ) ,
wherein m is an order of the predictive poiynomiai A(z) and is greater or
equal to zero, wherein the converter is configured to obtain the frequency
values by establishing a strictly real spectrum derived from P(z) and a strictly
imaginary spectrum from Q(z) and by identifying zeros of the strictly real
spectrum derived from P(z) and the strictly imaginary spectrum derived from
Q{z);
a quantizer for obtaining quantized frequency values from the frequency values;
and
a bitstream producer for producing a bitstream comprising the quantized f re
quency values.
The information encoder according to the invention uses a zero crossing
search, whereas the spectral approach for finding the roots according to prior
art relies on finding valleys in the magnitude spectrum. However, when
searching for valleys, the accuracy is poorer than when searching for zerocrossings.
Consider, for example, the sequence [4, 2 , 1, 2, 3]. Clearly, the
smallest value is the third element, whereby the zero would lie somewhere
between the second and the fourth element. In other words, one cannot determine
whether the zero is on the right or left side of the third element. How
ever, if one considers the sequence [4, 2 , 1, -2, -3], one can immediately
see that the zero crossing is between the third and fourth elements whereby
our margin of error is reduced in half. It follows that with the magnitudespectrum
approach, one need double the number of analysis points to obtain
the sa e accuracy as with the zero-crossing search.
In comparison to evaluating the magnitudes |P (z)j and |Q(z)j, the zerocrossing
approach has a significant advantage in accuracy. Consider, for ex
ample, the sequence 3 , 2 , - 1, -2. With the zero-crossing approach it is obvi
ous that the zero lies between 2 and - . However, by studying the corre¬
sponding magnitude sequence 3, 2 , 1, 2, one can only conclude that the zero
lies somewhere between the second and the last elements. In other words,
with the zero-crossing approach the accuracy is double in comparison to the
magnitude-based approach.
Furthermore, the information encoder according to the invention may use
long predictors such as m = 128. In contrast to that, the Chebyshev transform
performs sufficiently only when the length of A(z) is relatively small, for ex¬
ample < 20. For long predictors, the Chebyshev transform is numerically
unstable, whereby practical implementation of the algorithm is impossible.
The main properties of the proposed information encoder are thus that one
may obtain as high or better accuracy as the Chebyshev-based method since
zero crossings are searched and because a time domain to frequency do
main conversion is done, so that the zeros may be found with very low com¬
putational complexity.
As a result the information encoder according to the invention determines the
zeros (roots) both more accurately, but a so with low computational complexi
ty.
The information encoder according to the invention can be used in any signal
processing application which needs to determine the line spectrum of a se
quence. Herein, the information encoder is exemplary discussed in the con
text speech coding. The invention is applicable in a speech, audio and/or
video encoding device or application, which employs a linear predictor for
modelling the spectral magnitude envelope, perceptual frequency masking
threshold, temporal magnitude envelope, perceptual temporal masking
threshold, or other envelope shapes, or other representations equivalent to
an envelope shape such as an autocorrelation signal, which uses a line spec
trum to represent the information of the envelope, for encoding, analysis or
processing, which needs a method for determining the ine spectrum from an
input signal, such as a speech or general audio signal, and where the input
signal is represented as a digital filter or other sequence of numbers.
The information signal may be for instance an audio signal or a video signal.
The frequency values may be line spectral frequencies or Imittance spectral
frequencies. The quantized frequency values transmitted within the bitstream
will enable a decoder to decode the bitstream in order to re-create the audio
signal or the video signal.
According to a preferred embodiment of the invention the converter compris
es a determining device to determine the poiynomials P(z) and Q(z) from the
predictive polynomial A(z).
According to preferred embodiment of the invention the converter comprises
a zero identifier for identifying the zeros of the strictly real spectrum derived
from P(z) and the strictly imaginary spectrum derived from Q(z).
According to a preferred embodiment of the invention the zero identifier is
configured for identifying the zeros by
a) starting with the rea spectrum at nu frequency;
b) increasing frequency until a change of sign at the real spectrum is
found;
c) increasing frequency until a further change of sign at the imaginary
spectrum is found; and
d) repeating steps b) and c) until all zeros are found.
Note that Q(z) and thus the imaginary part of the spectrum always has a zero
at the null frequency. Since the roots are overlapping, P(z) and thus the real
part of the spectrum will then always be non-zero at the null frequency. One
can therefore start with the real part at the null frequency and increase the
frequency until the first change of sign is found, which indicates the first zerocrossing
and thus the first frequency value.
Since the roots are interlaced, the spectrum of Q(z) will have the next change
in sign. One can thus increase the frequency until a change of sign for the
spectrum of Q(z) is found. This process then may be repeated, alternating
between the spectraP(z) and Q(z), until all frequency values have been
found. The approach used for locating the zero-crossing in the spectra is thus
similar to the approach applied in the Chebyshev-domain [6, 7].
Since the zeros of P (z) and Q(z) are interlaced, one can alternate between
searching for zeros on the real and complex parts, such that one finds all ze
ros in one pass, and reduce complexity by half in comparison to a full search.
According to a preferred embodiment of the invention the zero identifier is
configured for identifying the zeros b interpolation.
In addition to the zero-crossing approach one can readily apply interpolation
such that one can estimate the position of the zero with even higher accura
cy, for example, as it is done in conventional methods, e.g. [7].
According to a preferred embodiment of the invention the converter compris¬
es a zero-padding device for adding one or more coefficients having a value
"0" to the polynomials P(z) and Q(z) so as to produce a pair of elongated po l
ynomials Pe(z) and Q (z). Accuracy can be further improved by extending the
length of the evaluated spectrum. Based on information about the system, it
is actually possible in some cases to determine a minimum distance between
the frequency values, and thus determine the minimum length of the spec
trum with which all frequency values can be found [8].
According to a preferred embodiment of the invention the converter is config
ured in such way that during converting the linear prediction coefficients to
frequency values of a spectral frequency representation of the predictive po l
ynomial A(z) at least a part of operations with coefficients known to be have
the value "0" of the elongated polynomials Pe(z) and Qe(z) are omitted.
Increasing the length of the spectrum does however also increase computa¬
tional complexity. The largest contributor to the complexity is the time domain
to frequency domain transform, such as a fast Fou er transform, of the coef
ficients of A(z). Since the coefficient vector has been zero-padded to the desired
length, it is however very sparse. This fact can readily be used to re
duce complexity. This is a rather simple problem in the sense that one knows
exactly which coefficients are zero, whereby on each iteration of the fast Fo u
rier transform one can simply omit those operations which involve zeros. A p
plication of such sparse fast Fourier transform s straightforward and any
programmer skilled in the art can implement it. The complexity of such an
implementation is 0(N log2( 1 + m + I)), where is the length of the spectrum
and and I are defined as before.
According to a preferred embodiment of the invention the converter comprises
a composite polynomial former configured to establish a composite poly
nomial Ce(Pe(z), Q e(z)) from the elongated polynomials Pe(z) and Qe(z)
According to a preferred embodiment of the invention the converter is conf ig
ured in such way that the strictly real spectrum derived from P(z) and the
strictly imaginary spectrum from Q(z) are established by a single Fourier
transform by transforming the composite polynomial Ce(Pe(z), Qe(z)).
According to a preferred embodiment invention the converter comprises a
Fourier transform device for Fourier transforming the pair of polynomials P(z)
and Q(z) or one or more polynomials derived from the pair of polynomials
P(z) and Q(z) into a frequency domain and an adjustment device for adjust¬
ing a phase of the spectrum derived from P(z) so that it is strictly real and for
adjusting a phase of the spectrum derived from Q(z) so that it is strictly imag¬
inary. The Fourier transform device may be based on the fast Fourier transform
or on the discrete Fourier transform.
According to a preferred embodiment of the invention the adjustment device
is configured as a coefficient shifter for circular shifting of coefficients of the
pair of polynomials P(z) and Q(z) or one or more polynomials derived from
the pair of polynomials P(z) and Q(z).
According to a preferred embodiment of the invention the coefficient shifter is
configured for circular shifting of coefficients in such way that an original mid
point of a sequence of coefficients is shifted to the first position of the sequence.
in theory, it is we known that the Fourier transform of a symmetric sequence
is real-va!ued and antisymmetric sequences have purely imaginary Fourier
spectra. In the present case, our input sequence is the coefficients of poly
nomial P(z) or Q(z) which is of length m + 1, whereas one would prefer to
have the discrete Fourier transform of a much greater length N » (m + ) The
conventional approach for creating longer Fourier spectra is zero-padding of
the input signal. However, zero-padding the sequence has to be carefully
implemented such that the symmetries are retained.
First a polynomial P(z) with coefficients
[ o, Pi, P2, Pi. Po]
is considered.
The way FFT algorithms are usually applied requires that the point of sym
metry is the first element, whereby when applied for example in MATLAB one
can write
fft([p 2, i , o, Po, P ])
to obtain a real-valued output. Specifically, a circular shift may be applied,
such that the point of symmetry corresponding to the mid-point element, that
is, coefficient p2 is shifted left such that it is at the first position. The coef fi
cients which were on the left side of p2 are then appended to the end of the
sequence.
For a zero-padded sequence
[Po. Pi , P2, Pi, Po, 0 , 0 . . . 0]
one can apply the same process. The sequence
[p?, P , Po, 0 , 0 . . . 0 , po, P ]
will thus have a real-valued discrete Fourier transform. Here the number of
zeros in the input sequences is N - m - ! if N is the desired length of the
spectrum.
Correspondingly, consider the coefficients
[q0, q1 0, - q , - q ]
corresponding to polynomial Q(z). By applying a circular shift such that the
former midpoint comes to the first position, one obtains
[0, -qi, - q0, qo, qi]
which has a purely imaginary discrete Fourier transform. The zero-padded
transform can then be taken for the sequence
[0, - , - q0, 0, 0 . . . 0, q0,
Note that the above applies only for cases where the length of the sequence
is odd, whereby m + is even. For cases where + is odd, one have two
options. Either one can implement the circular shift in the frequency domain
or apply a DFT with half-samples (see below).
According to a preferred embodiment of the invention the adjustment device
is configured as a phase shifter for shifting a phase of the output of the Fourier
transform device.
According to a preferred embodiment of the invention the phase shifter is
configured for shifting the phase of the output of the Fourier transform device
by multiplying a k-th frequency bin with exp(i2irkh/N), wherein N is the length
of the sample and h = (m+l)/2.
It is well-known that a circular shift in the time-domain is equivalent with a
phase-rotation in the frequency-domain. Specifically, a shift of h = (m + l)/2
steps n the time domain corresponds to multiplication of the k-th frequency
bin with exp(-i2Trkh/N ) , where N is the length of the spectrum. Instead of the
circular shift, one can thus apply a multiplication in the frequency-domain to
obtain exactly the same result. The cost of this approach is a slightly in
creased complexity. Note that h = (m + l)/2 is an integer number only when m
+ is even. When + I is odd, the circular shift would require a delay by ra
tional number of steps, which is difficult to implement directly. Instead, one
can apply the corresponding shift in the frequency domain by the phaserotation
described above.
According to preferred embodiment of the invention the converter comprises
a Fourier transform device for Fourier transforming the pair of polynomials
P(z) and Q(z) or one or more polynomials derived from the pair of polynomi¬
als P(z) and Q(z) into a frequency domain with half samples so that the spec
trum derived from P(z) is strictly real and so that the spectrum derived from
Q(z) is strictly imaginary.
An alternative is to implement a DFT with half-samples. Specifically, whereas
the conventional DFT is defined as
(2)
one can define the half-sample DFT as
,V -
Xk = exp - i2nk (n - - / N ) (3)
A fast implementation as FFT can readily be devised for this formulation.
The benefit of this formulation is that now the point of symmetry is at n = 1/2
instead of the usual n = 1. With this half-sample DFT one would then with a
sequence
[2, 1, 0, 0 , 1, 2]
obtain a real-valued Fourier spectrum.
In the case of odd m+l, for a polynomial P(z) with coefficients p0, , 2, P2.
Pi, po one can then with a half-sample DFT and zero padding obtain a real
valued spectrum when the input sequence is
[P2, P , Po, 0 , 0 . . . 0, po, p , p2]
Correspondingly, for a polynomial Q(z) one can apply the half-sample DFT
on the sequence
[- q , -, ~ qc 0 , 0 . . . 0, q0, qi, q2]
to obtain a purely imaginary spectrum
With these methods, for any combination of and I , one can obtain a real
valued spectrum for a polynomial P(z) and a purely imaginary spectrum for
any Q(z). n fact, since the spectra of P(z) and Q(z) are purely real and imag
inary, respectively, one can store them in a single complex spectrum, which
then corresponds to the spectrum of P(z) + Q(z) = 2A(z). Scaling by the fac
tor 2 does not change the location of roots, whereby it can be ignored. One
can thus obtain the spectra of P(z) and Q(z) by evaluating only the spectrum
of A(z) using a single FFT One only need to apply the circular shift, as ex¬
plained above, to the coefficients of A(z).
For example, with m = 4 and I = 0, the coefficients of A(z) are
which one can zero-pad to an arbitrary length N by
[a0, ai, a2, a3, a , 0 , 0 . . . 0].
If one then applies a circular shift of (m + l)/2 = 2 steps, one obtains
[a2, a3, a4, 0, 0 . . . 0, a0,
By taking the DFT of this sequence, one has the spectrum of P(z) and Q(z) in
the real and complex parts of the spectrum.
According to a preferred embodiment of the invention the converter comprises
a composite polynomial former configured to establish a composite poly
nomial C(P(z), Q(z)) from the polynomials P(z) and Q(z).
According to a preferred embodiment of the invention the converter is conf ig
ured in such way that the strictly real spectrum derived from P(z) and the
strictly imaginary spectrum from Q(z) are established by a single Fourier
transform, for example a fast Fourier transform (FFT), by transforming a
composite polynomial C(P(z), Q(z))
The polynomials P (z) and Q(z) are symmetric and antisymmetric, respective
ly, with the axis of symmetry at z + 2. It follows that the spectra of
z m+l P z) and z + 2Q(z), respectively, evaluated on the unit circle
z = qcr q ) , are real and complex valued, respectively. Since the zeros are on
the unit circle, one can find them by searching for zero-crossings. Moreover,
the evaluation on the unit-circle can be implemented simply by an fast Fouri
er transform.
As the spectra corresponding to z
+ 2P (z) and z~ + Q(z) are real and
complex, respectively, 2 is one can implement them with a single fast Fourier
transform. Specifically, if one take the sum z m+ 2(P (z) + Q(z)) then the real
and complex parts of the spectra correspond to z~ + 2 P(z) and z +l 2 Q(z),
respectively. Moreover, since
(m+ )/2 ( p z ) + Q(z)) = 2z m+l 2 A(z), (4)
one can directly take the FFT of 2z m+ 2 A(z) to obtain the spectra corre
sponding to z m+ 2 P(z) and z~ m+i 2 Q(z), without explicitly determining P(z)
and Q(z). Since one is interested only in the locations of zeros, 1 can omit
multiplication by the scalar 2 and evaluate z
+ 2 A(z) by FFT instead. Ob¬
serve that since A(z) has only m + 1 non-zero coefficients, one can use FFT
pruning to reduce complexity [ 1 1]. To ensure that all roots are found, one
must use an FFT of sufficiently high length N that the spectrum is evaluated
on at least one frequency between every two zeros.
According to a preferred embodiment of the invention the converter compris¬
es a limiting device for limiting the numerical range of the spectra of the poly
nomials P(z) and Q(z) by multiplying the polynomials P(z) and Q(z) or one or
more polynomials derived from the polynomials P(z) and Q(z) with a filter
polynomial B(z), wherein the filter polynomial B(z) is symmetric and does not
have any roots on a unit circle.
Speech codecs are often implemented on mobile device with limited re
sources, whereby numerical operations must be implemented with fixed-point
representations. It is therefore essential that algorithms implemented operate
with numerical representations whose range is limited. For common speech
spectral envelopes, the numerical range of the Fourier spectrum is, however,
so large that one needs a 32-bit implementation of the FFT to ensure that the
iocation of zero-crossings are retained.
A 16-bit FFT can, on the other hand, often be implemented with lower com
plexity, whereby it would be beneficial to limit the range of spectral values to
fit within that 16-bit range. From the equations |P (e' ) | <2|A(e ) | and
|Q(e'^)| < 2|A(e'^)| it is known that by limiting the numerical range of B(z)A(z)
one also limits the numerical range of B(z)P (z) and B(z)Q(z). If B(z) does not
have zeros on the unit circle, then B(z)P (z) and B(z)Q(z) will have the same
zero-crossing on the unit circle as P (z) and Q(z). Moreover, B(z) has to be
symmetric such that z_ +i+ P (z)B(z) and z~ m+ +n Q(z)B(z) remain symmet¬
ric and antisymmetric and their spectra are purely real and imaginary, re
spectively. Instead of evaluating the spectrum of z + A(z) one can thus
evaluate z n+ + 2A(z)B(z), where B(z) is an order n symmetric polynomial
without roots on the unit circle. In other words, one can apply the same appreach
as described above, but first multiplying A(z) with filter B(z) and apply
ing a modified phase-shift z- + 2.
The remaining task is to design a filter B(z) such that the numerical range of
A(z)B(z) is limited, with the restriction that B(z) must be symmetric and without
roots on the unit circle. The simplest filter which fulfills the requirements is
an order 2 linear-phase filter
B1(z) = b0 + biz 1 + b2z 2 (5)
where e R are the parameters and |b2| > 2 |b | . By adjusting one can
modify the spectral tilt and thus reduce the numerical range of the product
A(z)B (z) A computationally very efficient approach is to choose b such that
the magnitude at 0-frequency and Nyquist is equal, |A(1)Bi(1 ) j =
|A(-1 )Bi(-1 ) whereby one can choose for example
b0 = A(1) - A(-1) and b, = 2 (A(1) + A(-1)). (6)
This approach provides an approximately flat spectrum.
One observes (see also Fig. 5) that whereas A(z) has a high-pass character,
Bi(z) is low-pass, whereby the product A(z)Bi(z) has, as expected, equal
magnitude at 0- and Nyquist-frequency and it is more or less flat. Since Bi(z)
has only one degree of freedom, one obviously cannot expect that the prod
uct would be completely flat. Still, observe that the ratio between the highest
peak and lowest valley of B (z)L(z) maybe much smaller than that of A(z).
This means that one have obtained the desired effect; the numerical range of
Bi(z)A(z) is much smaller than that of A(z).
A second, slightly more complex method is to calculate the autocorrelation G
of the impulse response of A(0.5z). Here multiplication by 0.5 moves the zeros
of A(z) in the direction of origo, whereby the spectral magnitude is re¬
duced approximately by half. By applying the Levinson- Durbin on the auto
correlation rk, one obtains a filter H(z) of order n which is minimum-phase.
One can then define B2(z) = z H(z)H(z ) to obtain a |B2(z)A(z)| which is a p
proximately constant. One will note that the range of |B2(z)A(z)| is smaller
than that of IBi(z)A(z)|. Further approaches for the design of B(z) can be
readily found in classical literature of FIR design [18].
According to a preferred embodiment of the invention the converter compris
es a limiting device for limiting the numerical range of the spectra of the elon
gated polynomials Pe(z) and Qe(z) or one or more polynomials derived from
the elongated polynomials Pe(z) and Qe(z) by multiplying the elongated poly
nomials Pe(z) and Qe(z) with a filter polynomial B(z), wherein the filter poly
nomial B(z) is symmetric and does not have any roots on a unit circle B(z)
can be found as explained above.
In a further aspect the problem is solved by a method for operating an infor
mation encoder for encoding an information signal. The method comprises
the steps of:
analyzing the information signal in order to obtain linear prediction coeff i
cients of a predictive polynomial A(z);
converting the linear prediction coefficients of the predictive polynomial A(z)
to frequency vaiues fi...f n of a spectral frequency representation of the pre¬
dictive polynomial A(z), wherein the frequency values f . ..f are determined
by analyzing a pair of polynomials P(z) and Q(z) being defined as
P(z) = A(z) + z_m_l A(z _ ) and
Q(z) = A(z) - z~ m A(z- 1) ,
wherein m is an order of the predictive polynomial A(z) and is greater or
equal to zero, wherein the frequency values f ...fn are obtained by establish¬
ing a strictly real spectrum derived from P(z) and a strictly imaginary spec¬
trum from Q(z) and by identifying zeros of the strictly real spectrum derived
from P(z) and the strictly imaginary spectrum derived from Q(z);
obtaining quantized frequency fq . .fq values from the frequency values
f --- ; and
prod ucing a bitstream comprising the quantized frequency values fq ...fqn.
yoreover, the program is noticed by a computer program for, when running
on a processor, executing the method according to the invention.
Preferred embodiments of the invention are subsequently discussed with re
spect to the accompanying drawings, in which:
Fig. 1 illustrates an embodiment of an information encoder according
to the invention in a schematic view;
Fig. 2 illustrates an exemplary relation of A(z), P (z) and Q(z);
Fig. 3 illustrates a first embodiment of the converter of the information
encoder according to the invention in a schematic view;
Fig. 4 illustrates a second embodiment of the converter of the infor
mation encoder according to the invention in a schematic view;
Fig. 5 illustrates an exemplary magnitude spectrum of a predictor
A(z), the corresponding flattening filters Bi(z) and B2(z) and the
products A(z)Bi(z) and A(z)B 2(z);
Fig. 8 illustrates a third embodiment of the converter of the information
encoder according to the invention in a schematic view;
Fig. 7 illustrates a fourth embodiment of the converter of the infor¬
mation encoder according to the invention in a schematic view;
and
Fig. 8 illustrates a fifth embodiment of the converter of the information
encoder according to the invention in a schematic view.
Fig. 1 illustrates an embodiment of an information encoder 1 according to the
invention in a schematic view.
The information encoder 1 for encoding an information signal IS, comprises:
an analyzer 2 for analyzing the information signal IS in order to obtain linear
prediction coefficients of a predictive polynomial A(z);
a converter 3 for converting the linear prediction coefficients of the predictive
polynomial A(z) to frequency values f ...f n of a spectral frequency representa¬
tion RES, IES of the predictive polynomial A(z), wherein the converter 3 is
configured to determine the frequency values ...f by analyzing a pair of
polynomials P(z) and Q(z) being defined as
P(z) = A(z) + z- - A(z " 1) and
Q(z) = A(z) - z-m-' A(z ~ 1 ) ,
wherein m is an order of the predictive polynomial A(z) and is greater or
equal to zero, wherein the converter 3 is configured to obtain the frequency
values fi...f n by establishing a strictly real spectrum RES derived from P(z)
and a strictly imaginary spectrum IES from Q(z) and by identifying zeros of
the strictly real spectrum RES derived from P(z) and the strictly imaginary
spectrum IES derived from Q(z);
a quantizer 4 for obtaining quantized frequency fqi . . .fq values from the fre¬
quency values f ...f ; and
a bitstream producer 5 for producing a bitstream BS comprising the quan¬
tized frequency values fq ...f q .
The information encoder 1 according to the invention uses a zero crossing
search, whereas the spectra! approach for finding the roots according to prior
art relies on finding valleys in the magnitude spectrum. However, when
searching for valleys, the accuracy is poorer than when searching for zerocrossings.
Consider, for example, the sequence [4, 2 , 1, 2 , 3]. Clearly, the
smallest value is the third element, whereby the zero would ie somewhere
between the second and the fourth element. In other words, one cannot de
termine whether the zero is on the right or left side of the third element. How
ever, if one considers the sequence [4, 2, 1, -2, -3], one can immediately
see that the zero crossing is between the third and fourth elements, whereby
our margin of error is reduced in half. It follows that with the magnitudespectrum
approach, one need double the number of analysis points to obtain
the same accuracy as with the zero-crossing search.
n comparison to evaluating the magnitudes |P (z)| and |Q(z)|, the zerocrossing
approach has a significant advantage in accuracy. Consider, for ex¬
ample, the sequence 3, 2 , - 1, -2. With the zero-crossing approach it is obvi
ous that the zero lies between 2 and - 1. However, by studying the corre
sponding magnitude sequence 3, 2 , 1, 2, one can only conclude that the zero
lies somewhere between the second and the last elements. In other words,
with the zero-crossing approach the accuracy is double in comparison to the
magnitude-based approach.
Furthermore, the information encoder according to the invention may use
long predictors such as m = 128. In contrast to that, the Chebyshev transform
performs sufficiently only when the length of A(z) is relatively small, for ex
ample m < 20. For long predictors, the Chebyshev transform is numerically
unstable, whereby practical implementation of the algorithm is impossible.
The main properties of the proposed information encoder 1 are thus that one
may obtain as high or better accuracy as the Chebyshev-based method since
zero crossings are searched and because a time domain o frequency o
main conversion is done, so that the zeros may be found with very low com
putational complexity.
As a result the information encoder 1 according to the invention determines
the zeros (roots) both more accurately, but a so with low computational com¬
plexity.
The information encoder 1 according to the invention can be used in any signal
processing application which needs to determine the line spectrum of a
sequence. Herein, the information encoder 1 is exemplary discussed in the
context speech coding. The invention is applicable in a speech, audio and/or
video encoding device or application, which employs a linear predictor for
modelling the spectral magnitude envelope, perceptual frequency masking
threshold, temporal magnitude envelope, perceptual temporal masking
threshold, or other envelope shapes, or other representations equivalent to
an envelope shape such as an autocorrelation signal, which uses a line spec
trum to represent the information of the envelope, for encoding, analysis or
processing, which needs a method for determining the line spectrum from an
input signal, such as a speech or general audio signal, and where the input
signal is represented as a digital filter or other sequence of numbers.
The information signal IS may be for instance an audio signal or a video sig
nal.
Fig. 2 illustrates an exemplary relation of A(z), P (z) and Q(z). The vertical
dashed lines depict the frequency values f . . ,f . Note that the magnitude is
expressed on a linear axis instead of the decibel scale in order to keep ze
ro-crossings visible. We can see that the line spectral frequencies occur at the
zeros crossings of P (z) and Q(z). Moreover, the magnitudes of P (z) and
Q z) are smaller or equal than 2|A(z)j everywhere:|P (e ) | £ 2jA(e i ) and
lQ(e i ) | £ 2|A(e i ) |
Fig. 3 illustrates a first embodiment of the converter of the information e n
coder according to the invention in a schematic view.
According to a preferred embodiment of the invention the converter 3 com
prises a determining device 6 to determine the polynomials P(z) and Q(z)
from the predictive polynomial A(z).
According to a preferred embodiment invention the converter comprises a
Fourier transform device 8 for Fourier transforming the pair of polynomials
P(z) and Q(z) or one or more polynomials derived from the pair of polynomi
als P(z) and Q(z) into a frequency domain and an adjustment device 7 for
adjusting a phase of the spectrum RES derived from P(z) so that it is strictly
real and for adjusting a phase of the spectrum IES derived from Q(z) so that
it is strictly imaginary. The Fourier transform device may 8 be based on the
fast Fourier transform or on the discrete Fourier transform.
According to a preferred embodiment of the invention the adjustment device
7 is configured as a coefficient shifter 7 for circular shifting of coefficients of
the pair of polynomials P(z) and Q(z) or one or more polynomials derived
from the pair of polynomials P(z) and Q(z).
According to a preferred embodiment of the invention the coefficient shifter 7
is configured for circular shifting of coefficients in such way that an original
midpoint of a sequence of coefficients is shifted to the first position of the se
quence.
n theory, it is well known that the Fourier transform of a symmetric sequence
is real -valued and antisymmetric sequences have purely imaginary Fourier
spectra. n the present case, our input sequence is the coefficients of polynomial
P(z) or Q(z) which is of length m + I , whereas one would prefer to
have the discrete Fourier transform of a much greater length N » (m + I). The
conventional approach for creating longer Fourier spectra is zero-padding of
the input signal. However, zero-padding the sequence has to be carefully
implemented such that the symmetries are retained.
First a polynomial P(z) with coefficients
is considered.
The way fast Fourier transform algorithms are usually applied requires that
the point of symmetry is the first element, whereby when applied for example
in MATLAB one can write
fft([P2, Pi , Po, Po, Pi])
to obtain a real-valued output. Specifically, a circular shift may be applied,
such that the point of symmetry corresponding to the mid-point element, that
is, coefficient p2 is shifted left such that it is at the first position. The coeffi
cients which were on the left side of p2 are then appended to the end of the
sequence.
For a zero-padded sequence
[Po, P , R 2, Pi, Po, 0, 0 . . . 0]
one can apply the same process. The sequence
[P2, P , Po, 0 , 0 . . . 0 , po, pi]
will thus have a real-valued discrete Fourier transform. He e the number of
zeros in the input sequences is N - m I if N is the desired length of the
spectrum.
Correspondingly, consider the coefficients
corresponding to polynomial Q(z). By applying a circular shift such that the
former midpoint comes to the first position, one obtains
which has a purely imaginary discrete Fourier transform. The zero-padded
transform can then be taken for the sequence
[0, -qi, - q0 , 0, 0 . . . 0, q0 , q
Note that the above applies only for cases where the length of the sequence
is odd, whereby m + is even. For cases where m + is odd, one have two
options. Either one can implement the circular shift in the frequency domain
or apply a DFT with half-samples.
According to preferred embodiment of the invention the converter 3 compris¬
es a zero identifier 9 for identifying the zeros of the strictly real spectrum RES
derived from P(z) and the strictly imaginary spectrum IES derived from Q(z).
According to a preferred embodiment of the invention the zero identifier 9 is
configured for identifying the zeros by
a) starting with the real spectrum RES at nu i frequency;
b) increasing frequency until a change of sign at the real spectrum RES
is found;
c) increasing frequency until a further change of sign at the imaginary
spectrum IES is found; and
d) repeating steps b) and c) until a!i zeros are found.
Note that Q(z) and thus the imaginary part ES of the spectrum always has a
zero at the null frequency. Since the roots are overlapping, P(z) and thus the
rea part RES of the spectrum will then always be non-zero at the null fre¬
quency. One can therefore start with the real part RES at the null frequency
and increase the frequency until the first change of sign is found, which indi¬
cates the first zero-crossing and thus the first frequency value f .
Since the roots are interlaced, the spectrum ES of Q(z) will have the next
change in sign. One can thus increase the frequency until a change of sign
for the spectrum IES of Q(z) is found. This process then may be repeated,
alternating between the spectra of P(z) and Q(z), until all frequency values
f ..fn, have been found. The approach used for locating the zero-crossing in
the spectra RES and IES is thus similar to the approach applied in the Chebyshev-
domain [6, 7].
Since the zeros of P (z) and Q(z) are interlaced, one can alternate between
searching for zeros on the real parts RES and complex parts IES, such that
one finds all zeros in one pass, and reduce complexity by half in comparison
to a full search.
According to a preferred embodiment of the invention the zero identifier 9 is
configured for identifying the zeros by interpolation.
In addition to the zero-crossing approach one can readily apply interpolation
such that one can estimate the position of the zero with even higher acc ra
cy, for example, as it is done in conventional methods, e.g. [7].
Fig. 4 illustrates a second embodiment of the converter 3 of the information
encoder 1 according to the invention in a schematic view.
According to a preferred embodiment of the invention the converter 3 com¬
prises a zero-padding device 0 for adding one or more coefficients having a
value "0" to the polynomials P(z) and Q(z) so as to produce a pair of elongat
ed polynomials Pe(z) and Qe(z). Accuracy can be further improved by extending
the length of the evaluated spectrum RES, IES. Based on information
about the system, it is actually possible in some cases to determine a mini
mum distance between the frequency values f . . .f n , and thus determine the
minimum length of the spectrum RES, IES with which all frequency values
f ... , can be found [8].
According to a preferred embodiment of the invention the converter 3 is con¬
figured in such way that during converting the linear prediction coefficients to
frequency values . . , of a spectral frequency representation RES, IES of
the predictive polynomial A(z) at least a part of operations with coefficients
known to be have the value "0" of the elongated polynomials Pe(z) and Q (z)
are omitted.
Increasing the length of the spectrum does however also increase computa¬
tional complexity. The largest contributor to the complexity is the time domain
to frequency domain transform, such as a fast Fourier transform, of the coef
ficients of A(z). Since the coefficient vector has been zero-padded to the desired
length, it is however very sparse. This fact can readily be used to re
duce complexity. This is a rather simple problem in the sense that one knows
exactly which coefficients are zero, whereby on each iteration of the fast Fou
rier transform one can simply omit those operations which involve zeros. Application
of such sparse fast Fourier transform is straightforward and any
programmer skilled in the art can implement it. The complexity of such an
implementation is 0(N og ( 1 + m + I)), where N is the length of the spectrum
and m and I are defined as before.
According to a preferred embodiment of the invention the converter compris
es a limiting device for limiting the numerical range of the spectra of the
elongated polynomials Pe(z) and Qe(z) or one or more polynomials derived
from the elongated polynomials Pe(z) and Qe(z) by multiplying the elongated
polynomials Pe(z) and Qe(z) with a filter polynomial B(z), wherein the filter
polynomial B(z) is symmetric and does not have any roots on a unit circle.
B(z) can be found as explained above.
Fig. 5 illustrates an exemplary magnitude spectrum of a predictor A(z), the
corresponding flattening filters Bi (z) and B2(z) and the products A(z)Bi(z) and
A(z )B (z). The horizontal dotted line shows the level of A(z)Bi(z) at the 0-
and Nyquist-frequencies.
According to a preferred embodiment (not shown) of the invention the con
verter 3 comprises a limiting device 11 for limiting the numerical range of the
spectra RES, IES of the polynomials P(z) and Q(z) by multiplying the poly¬
nomials P(z) and Q(z) or one or more polynomials derived from the polyno
mials P(z) and Q(z) with a filter polynomial B(z), wherein the filter polynomial
B(z) is symmetric and does not have any roots on a unit circle.
Speech codecs are often implemented on mobile device with limited re¬
sources, whereby numerical operations must be implemented with fixed- point
representations. t is therefore essentia! that algorithms implemented operate
with numericai representations whose range is limited. For common speech
spectral envelopes, the numerical range of the Fourier spectrum is, however,
so large that one needs a 32-bit implementation of the FFT to ensure that the
location of zero-crossings are retained.
A 16-bit FFT can, on the other hand, often be implemented with lower com
plexity, whereby it would be beneficial to limit the range of spectral values to
fit within that 16-bit range. From the equations |P (e ) | 2|A(e' ) | and
|Q(e ) | <2|A(e' ) | it is known that by limiting the numerical range of B(z)A(z)
one also limits the numerical range of B(z)P (z) and B(z)Q(z). If B(z) does not
have zeros on the unit circle, then B(z)P (z) and B(z)Q(z) will have the same
zero-crossing on the unit circle as P (z) and Q(z). Moreover, B(z) has to be
symmetric such that z m+ +n 2P (z)B(z) and z_ m+ + Q(z)B(z) remain symmet
ric and antisymmetric and their spectra are purely real and imaginary, re¬
spectively. Instead of evaluating the spectrum of z( +1 A(z) one can thus
evaluate z + + A(z)B(z), where B(z) is an order n symmetric polynomial
without roots on the unit circle. In other words, one can apply the same ap¬
proach as described above, but first multiplying A(z) with filter B(z) and applying
a modified phase-shift z- m+ + 2.
The remaining task is to design a filter B(z) such that the numerical range of
A(z)B(z) is limited, with the restriction that B(z) must be symmetric and with
out roots on the unit circle. The simplest filter which fulfills the requirements is
an order 2 linear-phase filter B (z) = b0 + b z - 1 + b z 2, where b 6 R are the
parameters and |b21> 2 |b i | . By adjusting k one can modify the spectral tilt
and thus reduce the numerical range of the product A(z)B (z). A computa
tionally very efficient approach is to choose b such that the magnitude at 0-
frequency and Nyquist is equal, |A(1)Bi(1)[ = |A(-1 )Bi(-1)j, whereby one can
choose for example b0 = A(1 ) - A(-1 ) and b = 2 (A(1) + A(-1)).
This approach provides an approximately flat spectrum.
One observes from Fig. 5 that whereas A(z) has a high-pass character, Bi(z)
is low-pass, whereby the product A(z)Bi(z) has, as expected, equal magnitude
at 0- and Nyquist-frequency and it is more or less flat. Since z has
only one degree of freedom, one obviously cannot expect that the product
would be completely flat. Still, observe that the ratio between the highest
peak and lowest valley of B1(z)A(z) maybe much smaller than that of A(z).
This means that one have obtained the desired effect; the numerical range of
Bi(z)A(z) is much smaller than that of A(z).
A second, slightly more complex method is to calculate the autocorrelation r
of the impulse response of A(0.5z). Here multiplication by 0.5 moves the ze¬
ros of A(z) in the direction of origo, whereby the spectral magnitude is reduced
approximately by half. By applying the Levinson- Durbin on the auto
correlation rk, one obtains a filter H(z) of order n which is minimum-phase.
One can then define B2(z) = z nH(z)H(z _ ) to obtain a |B2(z)A(z)| which is a p
proximately constant. One will note that the range of |B2(z)A(z)| is smaller
than that of |B1(z)A(z)|. Further approaches for the design of B(z) can be
readily found in classical literature of FIR design [18].
Fig. 6 illustrates a third embodiment of the converter 3 of the information e n
coder 1 according to the invention in a schematic view.
According to a preferred embodiment of the invention the adjustment device
12 is configured as a phase shifter 12 for shifting a phase of the output of the
Fourier transform device 8.
According to a preferred embodiment of the invention the phase shifter 2 is
configured for shifting the phase of the output of the Fourier transform device
8 by multiplying a k-th frequency bin with exp(i2nkh/N), wherein N is the
length of the sample and h = (m+l)/2.
It is well-known that a circular shift in the time-domain is equivalent with a
phase-rotation in the frequency-domain. Specifically, a shift of h = ( + l)/2
steps in the time domain corresponds to multiplication of the k-th frequency
bin with cr(- 2p /N ) , where N is the length of the spectrum. Instead of the
circular shift, one can thus apply a multiplication in the frequency-domain to
obtain exactly the same result. The cost of this approach is a slightly increased
complexity. Note that h = (m + l)/2 is an integer number only when m
+ is even. When + is odd, the circular shift would require a delay by ra¬
tional number of steps, which is difficult to implement directly. Instead, one
can apply the corresponding shift in the frequency domain by the phaserotation
described above.
Fig. 7 illustrates a fourth embodiment of the converter 3 of the information
encoder 1 according to the invention in a schematic view.
According to a preferred embodiment of the invention the converter 3 comprises
a composite polynomial former 13 configured to establish a composite
polynomial C(P(z), Q(z)) from the polynomials P(z) and Q(z).
According to a preferred embodiment of the invention the converter 3 is con
figured in such way that the strictly real spectrum derived from P(z) and the
strictly imaginary spectrum from Q(z) are established by a single Fourier
transform, for example a fast Fourier transform (FFT), by transforming a
composite polynomial C(P(z), Q(z)).
The polynomials P (z) and Q(z) are symmetric and antisymmetric, respectivey,
with the axis of symmetry at z + 2. It follows that the spectra of
z + ) 2P(z) and z + Q(z), respectively, evaluated on the unit circle
z = exp(iO), are rea and complex valued, respectively. Since the zeros are on
the unit circle, one can find them by searching for zero-crossings. Moreover,
the evaluation on the unit-circle can be implemented simply by an fast Fouri
er transform.
As the spectra corresponding to z m+l 2P (z) and z + Q(z) are real and
complex, respectively, 2 is one can implement them with a single fast Fourier
transform. Specifically, if one take the sum z , ,+l (P (z) + Q(z)) then the real
and complex parts of the spectra correspond to m+ 2 P(z) and z- + 2 Q(z),
respectively. Moreover, since z (m 2 (P (z) + Q(z)) = 2z +l 2 A(z), one can
directly take the FFT of 2z + 2 A(z) to obtain the spectra corresponding to
z m+l 2 P(z) and z m+ 2 Q(z), without explicitly determining P(z) and Q(z).
Since one is interested only in the locations of zeros, 1 can omit multiplication
by the scalar 2 and evaluate z + 2 A(z) by FFT instead. Observe that since
A(z) has only + 1 non-zero coefficients, one can use FFT pruning to re
duce complexity [ 1 1]. To ensure that all roots are found, one must use an
FFT of sufficiently high length N that the spectrum is evaluated on at least
one frequency between every two zeros.
According to a preferred embodiment (not shown) of the invention the con¬
verter 3 comprises a composite polynomial former configured to establish a
composite polynomial Ce(Pe(z), Qe(z)) from the elongated polynomials Pe(z)
According to a preferred embodiment (not shown) of the invention the con
verter is configured in such way that the strictly real spectrum derived from
P(z) and the strictly imaginary spectrum from Q(z) are established by a single
Fourier transform by transforming the composite polynomial Ce(Pe(z), Qe(z)).
Fig 8 illustrates a fifth embodiment of the converter 3 of the information e n
coder 1 according to the invention in a schematic view.
According to preferred embodiment of the invention the converter 3 compris
es a Fourier transform device 14 for Fourier transforming the pair of polyno
mials P(z) and Q(z) or one or more polynomials derived from the pair of polynomials
P(z) and Q(z) into a frequency domain with half samples so that the
spectrum derived from P(z) is strictly real and so that the spectrum derived
from Q(z) is strictly imaginary.
An alternative is to implement a DFT with half-samples. Specifically, whereas
the conventional DFT is defined as
one can define the half-sample DFT as
A fast implementation as FFT can readily be devised for this formulation.
The benefit of this formulation is that now the point of symmetry is at n = 1/2
instead of the usual n = 1. With this half-sample DFT one would then with a
sequence
[2, 1, 0 , 0 , 1, 2]
obtain a real-valued Fourier spectrum RES.
In the case of odd m+ , for a polynomial P z) with coefficients p , i , 2 R2,
, o one can then with a half-sample DFT and zero padding obtain a real
valued spectrum RES when the input sequence is
[P2, P , Po, 0, 0 . . . 0 , , P , p2] .
Correspondingly, for a polynomial Q(z) one can apply the half-sample DFT
on the sequence
~ 2, -qi, -qo, 0, 0 . . . 0, q0, i , q2]
to obtain a purely imaginary spectrum IES.
With these methods, for any combination of m and , one can obtain a real
valued spectrum for a polynomial P(z) and a purely imaginary spectrum for
any Q(z). In fact, since the spectra of P(z) and Q(z) are purely real and imag
inary, respectively, one can store them in a single complex spectrum, which
then corresponds to the spectrum of P(z) + Q(z) = 2A(z). Scaling by the fac¬
tor 2 does not change the location of roots, whereby it can be ignored. One
can thus obtain the spectra of P(z) and Q(z) by evaluating only the spectrum
of A(z) using a single FFT. One only need to apply the circular shift, as ex
plained above, to the coefficients of A(z).
For example, with = 4 and I = 0 , the coefficients of A(z) are
which one can zero-pad to an arbitrary length N by
[a , , , a2, a3, a4 0 , 0 . . . 0]
If one then applies a circular shift of ( + l)/2 = 2 steps, one obtains
[a?, a3, a4, 0 , 0 . . . 0 a0, a,].
By taking the DFT of this sequence, one has the spectrum of P(z) and Q(z) in
the rea parts RES and complex parts IES of the spectrum.
The overall algorithm in the case where m + is even can be stated as fol
lows. Let the coefficients of A(z), denoted by a , reside in a buffer of length
N.
. Apply a circular shift on ak of (m + l)/2 steps to the left.
2. Calculate the fast Fourier transform of the sequence ak and denote it
by Ak.
3. Until all frequency values have been found, start with k = 0 and alte r
nate between
(a) While sign(real(A k )) = sign(real(A k+1)) increase k := k + 1. Once the
zero-crossing has been found, store k in the list of frequency values.
(b) While sign(imag(A k )) = sign(imag(A +1)) increase k := k + 1. Once the
zero-crossing has been found, store k in the list of frequency values.
4 . For each frequency vaiue, interpolate between Ak and A +i to deter
mine the accurate position.
Here the functions sign(x), real(x) and imag(x) refer to the sign of x , the real
part of x and the imaginary part of x , respectively.
For the case of + I odd, the circular shift is reduced to on y (m + - 1)/2
steps left and the regular fast Fourier transform is replaced by the halfsample
fast Fourier transform.
Alternatively, we can always replace the combination of circular shift and st
Fourier transform, with fast Fourier transform and a phase-shift in frequency
domain.
For more accurate locations of roots, it is possible to use the above proposed
method to provide a first guess and then apply a second step which refines
the root loci. For the refinement, we can apply any classical polynomial root
finding method such as Durand-Kerner, Aberth-Ehrlich's, Laguerre's the
Gauss-Newton method or others [ 11-17].
In one formulation, the presented method consists of the following steps:
(a) For a sequence of length + i + 1 zero-padded to length N , where m
+ I is even, apply a circular shift of (m + l)/2 steps to the left, such that the
buffer length is N and corresponds to the desired length of the output spectrum,
or
for a sequence of length m + + 1 zero-padded to length N , where
m + is odd, apply a circular shift of ( + - 1)/2 steps to the left, such that
the buffer length is N and corresponds to the desired length of the output
spectrum.
(b) If + i is even, apply a regular DFT on the sequence. If m+ s odd,
apply a half-sampled DFT on the sequence as described by Eq. 3 or an
equivalent representation.
(c) If the input signal was symmetric or antisymmetric, search for zerocrossings
of the frequency domain representation and store the locations in a
list.
If the input signal was a composite sequence B(z) = P (z) + Q(z),
search for zero-crossings in both the rea and the imaginary part of the f re
quency domain representation and store the locations in a list. If the input
signal was a composite sequence B(z) = P (z)+Q(z), and the roots of P (z)
and Q(z) alternate or have similar structure, search for zero-crossings by a l
ternating between the real and the imaginary part of the frequency domain
representation and store the locations in a list.
In another formulation, the presented method consists of the following steps
(a) For an input signal which is of the same form as in the previous point,
apply the DFT on the input sequence.
(b) Apply a phase-rotation to the frequency-domain values, which is
equivalent to a circular shift of the input signal by (m + l)/2 steps to the left.
(c) Apply a zero-crossing search as was done in the previous point.
With respect to the encoder 1 and the methods of the described embodi
ments the following is mentioned:
Although some aspects have been described in the context of an apparatus,
it is clear that these aspects also represent a description of the correspond
ing method, where a block or device corresponds to a method step or a fea¬
ture of a method step. Analogously, aspects described in the context of a
method step also represent a description of a corresponding block or item or
feature of a corresponding apparatus.
Depending on certain implementation requirements, embodiments of the in¬
vention can be implemented in hardware or in software. The implementation
can be performed using a digital storage medium, for example a floppy disk,
a DVD, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH
memory, having electronically readable control signals stored thereon, which
cooperate (or are capable of cooperating) with a programmable computer
system such that the respective method is performed.
Some embodiments according to the invention comprise a data carrier having
electronically readable control signals, which are capable of cooperating
with a programmable computer system, such that one of the methods de
scribed herein is performed.
Generally, embodiments of the present invention can be implemented as a
computer program product with a program code, the program code being
operative for performing one of the methods when the computer program
product runs on a computer. The program code may for example be stored
on a machine readable carrier.
Other embodiments comprise the computer program for performing one of
the methods described herein, stored on a machine readable carrier or a
non-transitory storage medium.
In other words, an embodiment of the inventive method is, therefore, a computer
program having a program code for performing one of the methods de¬
scribed herein, when the computer program runs on a computer.
A further embodiment of the inventive methods is, therefore, a data carrier (or
a digital storage medium, or a computer-readable medium) comprising, recorded
thereon, the computer program for performing one of the methods de
scribed herein.
A further embodiment of the inventive method is, therefore, a data stream or
a sequence of signals representing the computer program for performing one
of the methods described herein. The data stream or the sequence of signals
may for example be configured to be transferred via a data communication
connection, for example via the Internet.
A further embodiment comprises a processing means, for example a com¬
puter, or a programmable logic device, configured to or adapted to perform
one of the methods described herein.
A further embodiment comprises a computer having installed thereon the
computer program for performing one of the methods described herein.
In some embodiments, a programmable logic device (for example a field pro¬
grammable gate array) may be used to perform some or all of the functionali
ties of the methods described herein. In some embodiments, a field pro¬
grammable gate array may cooperate with a microprocessor in order to per¬
form one of the methods described herein. Generally, the methods are advantageously
performed by any hardware apparatus.
While this invention has been described i terms of several embodiments,
there are alterations, permutations, and equivalents which fall within the
scope of this invention. It should also be noted that there are many alternative
ways of implementing the methods and compositions of the present in¬
vention. It is therefore intended that the following appended claims be inter¬
preted as including all such alterations, permutations and equivalents as fall
within the true spirit and scope of the present invention.
Reference signs:
1 information encoder
2 analyzer
3 converter
4 quantizer
5 bitstream producer
6 determining device
7 coefficient shifter
8 Fourier transform device
9 zero identifier
10 zero-padding device
11 limiting device
12 phase shifter
13 composite polynomial former
14 half sample Fourier transforming device
S information signal
RES re a spectrum
IES imaginary spectrum
f ... frequency values
fqi . . ,fq quantized frequency values
BS bitstream
References:
[1] B. Bessette, R. Salami, R . Lefebvre, M. Jelinek, J. Rotola-Pukkila, J .
Vainio, H. Mikko!a, and K. Jarvinen, "The adaptive multirate wideband
speech codec (AMR-WB)", Speech and Audio Processing, IEEE
Transac- tions on, vol. 10 , no. 8, pp. 620-636, 2002.
ITU-T G.71 , "Frame error robust narrow-band and wideband embed¬
ded variable bit-rate coding of speech and audio from 8-32 kbit/s",
2008.
M. Neuendorf, P. Gournay, M. Mu!trus, J. Lecomte, B Bessette, R.
Geiger, S Bayer, G. Fuchs, J . Hilpert, N. Rettelbach, R. Salami, G.
Schulier, R. Lefebvre, and B. Grill, "Unified speech and audio coding
scheme for high quality at low bitrates", in Acoustics, Speech and Sig
nal Processing. ICASSP 2009. IEEE nt Conf, 2009, pp. 1-4.
T. Backstrom and C. Magi, "Properties of line spectrum pair polynomi
als - a review", Signal Processing, vo . 86, no. 11, pp. 3286-3298,
November 2006.
G. Kang and L. Fransen, "Application of line-spectrum pairs to low-bitrate
speech encoders", in Acoustics, Speech, and Signal Processing,
IEEE international Conference on ICASSP'85., vol. 10. IEEE, 1985,
pp. 244-247.
P. Kabal and R. P. Ramachandran, "The computation of line spectral
frequencies using Chebyshev polynomials", Acoustics, Speech and
Signal Processing, IEEE Transactions on, vo . 34, no. 6 , pp. 141 9-
1426, 1986.
3GPP TS 26. 190 V7.0.0, "Adaptive multi-rate (AMR-WB) speech co
dec", 2007.
T . Backstrom, C. Magi, and P. AIku, "Minimum separation of line spec
tral frequencies", IEEE Signal Process Lett., vol. 14, no. 2 , pp. 145-
147, February 2007.
T. Backstrom, "Vandermonde factorization of Toeplitz matrices and
applications in filtering and warping," IEEE Trans Signal Process., vol.
6 1. no 24, pp. 6257-6263, 201 3 .
V. F. Pisarenko, "The retrieval of harmonics from a covariance func
tion", Geophysical Journal of the Royal Astronomical Society, vol. 33,
no. 3 , pp. 347-366, 1973.
E. Durand, Solutions Numeriques des Equations Algebriques. Paris:
Masson, 1960.
I . Kerner, "Ein Gesamtschrittverfahren zur Berechnung der Nullstellen
von Polynomen", Numerische Mathematik, vol. 8 , no. 3, pp. 290-294,
May 1966.
O. Aberth, "Iteration methods for finding all zeros of a polynomial sim
ultaneously", Mathematics of Computation, vol. 27, no. 122, pp. 339-
344, April 1973.
L . Ehrlich, "A modified newton method for polynomials", Communica¬
tions of the ACM, vol. 10 , no. 2 , pp. 107-1 08, February 1967.
D. Starer and A . Nehorai, "Polynomial factorization algorithms for
adaptive root estimation", in Int. Conf. on Acoustics, Speech, and Sig
nal Processing, vol 2 . Glasgow, UK: IEEE, May 1989, pp. 5 8—
1161 .
, "Adaptive polynomial factorization by coefficient matching", IEEE
Transactions on Signal Processing, vol. 39, no. 2 , pp. 527-530, Feb
ruary 1991 .
[ 1 ] G. H. Golub and C. F. van Loan, Matrix Computations, 3rd ed. John
Hopkins University Press, 998.
[ 8] T. Saramaki, "Finite impulse response filter design", Handbook for Digital
Signal Processing, pp. 155-277, 1993.
Ciaims
1 Information encoder for encoding an information signal (IS), the infor
mation encoder ( 1) comprising:
an analyzer (2) for analyzing the information signal (IS) in order to obtain
linear prediction coefficients of a predictive polynomial A(z);
a converter (3) for converting the linear prediction coefficients of the pre
dictive polynomial A(z) to frequency values f-|.. .f of a spectral frequency
representation of the predictive polynomial A(z), wherein the converter (3)
is configured to determine the frequency values f i ...f by analyzing a pair
of polynomials P(z) and Q(z) being defined as
P(z) = A(z) + z m " A(z " 1 ) and
Q(z) = A(z) - z- m - A(z- 1) ,
wherein m is an order of the predictive polynomial A(z) and is greater or
equal to zero, wherein the converter (3) is configured to obtain the fre¬
quency values (f ...f ) by establishing a strictly real spectrum (RES) de¬
rived from P(z) and a strictly imaginary spectrum (IES) from Q(z) and by
identifying zeros of the strictly real spectrum (RES) derived from P(z) and
the strictly imaginary spectrum (IES) derived from Q(z);
a quantizer (4) for obtaining quantized frequency (fq ...f qn) values from the
frequency values (fi . . .fn) ; and
a bitstream producer (5) for producing a bitstream comprising the quan¬
tized frequency values (fq ...fqn) .
2 . Information encoder according to the preceding claim, wherein the con¬
verter (3) comprises a determining device (6) to determine the polynomi¬
als P(z) and Q(z) from the predictive polynomial A(z).
3 . Information encoder according to one of the preceding claims, wherein
the converter (3) comprises a zero identifier (9) for identifying the zeros of
the strictly real spectrum (RES) derived from P(z) and the strictly imagi¬
nary spectrum (IES) derived from Q(z).
4 . Information encoder according to the preceding claim, wherein the zero
identifier (9) is configured for identifying the zeros by
a) starting with the real spectrum (RES) at null frequency;
b) increasing frequency until a change of sign at the real spectrum (RES)
is found;
c) increasing frequency until a further change of sign at the imaginary
spectrum (IES) is found; and
d) repeating steps b) and c) until all zeros are found.
5 . Information encoder according to claim 3 or claim 4 , wherein the zero
identifier is configured for identifying the zeros by interpolation.
6 . Information encoder according to one of the preceding claims, wherein
the converter (3) comprises a zero-padding device (10) for adding one or
more coefficients having a value "0" to the polynomials P(z) and Q(z) so
as to produce a pair of elongated polynomials Pe(z) and Q (z).
7 . Information encoder according to claim 5 or claim 6 , wherein the convert¬
er (3) is configured in such way that during converting the linear prediction
coefficients to frequency values (f ...fn) of the spectral frequency repre
sentation (RES, IES) of the predictive polynomial A(z) at least a part of
operations with coefficients known to be have the value "0" of the elon
gated polynomials Pe(z) and Qe(z) are omitted.
8 . Information encoder according to one of the claims 5 to 7 , wherein the
converter (3) comprises a composite polynomial former ( 3) configured to
establish a composite polynomial C (P (z), Q (z)) from the elongated po l
ynomials Pe(z) and Qe(z).
9 . Information encoder according to the preceding claim, wherein the converier
(3) is configured in such way that the strictly real spectrum (RES)
derived from P(z) and the strictly imaginary spectrum (IES) from Q(z) are
established by a single Fourier transform by transforming the composite
polynomial Ce(Pe(z), Qe(z)).
0 . Information encoder according to one of the preceding claims, wherein
the converter (3) comprises a Fourier transform device (8) for Fourier
transforming the pair of polynomials P(z) and Q(z) or one or more poly
nomials derived from the pair of polynomials P(z) and Q(z) into a f requen
cy domain and an adjustment device (7, 12) for adjusting a phase of the
spectrum (RES) derived from P(z) so that it is strictly real and for adjust
ing a phase of the spectrum (IES) derived from Q(z) so that it is strictly
imaginary.
11. Information encoder according to the preceding claim, wherein the adjustment
device (7, 12) is configured as a coefficient shifter (7) for circular
shifting of coefficients of the pair of polynomials P(z) and Q(z) or the one
or more polynomials derived from the pair of polynomials P(z) and Q(z).
12 . Information encoder according to the preceding claim, wherein the coefficient
shifter (7) is configured for circular shifting of coefficients in such
way that an original midpoint of a sequence of coefficients is shifted to the
first position of the sequence.
13. Information encoder according to claim 10 , wherein the adjustment device
(7, 12) is configured as a phase shifter (12) for shifting a phase of the
output of the Fourier transform device (8).
14. Information encoder according to the preceding claim, wherein the phase
shifter (12) is configured for shifting the phase of the output of the Fourier
transform device (8) by multiplying a k-th frequency bin with exp(i2irkh/N) ,
wherein N is the length of the sample and h = (m+l)/2.
15. Information encoder according to one of the claims 1 to 9 , wherein the
converter (3) comprises a Fourier transform device ( ) for Fourier transforming
the pair of polynomials P(z) and Q(z) or one or more polynomials
derived from the pair of polynomials P(z) and Q(z) into a frequency do¬
main with half samples so that the spectrum (RES) de ved from P(z) is
strictly real and so that the spectrum (IES) derived from Q(z) is strictly im
aginary.
16. Information encoder according to one of the preceding claims, wherein
the converter (3) comprises a composite polynomial former ( 13) config¬
ured to establish a composite polynomial C(P(z), Q(z)) from the polyno
mials P(z) and Q(z).
7 . Information encoder according to the preceding claim, wherein the con
verter (3) is configured in such way that the strictly real spectrum (RES)
derived from P(z) and the strictly imaginary spectrum (lES )from Q(z) are
established by a single Fourier transform by transforming the composite
polynomial C(P(z), Q(z)).
. Information encoder according to one of the preceding claims, wherein
the converter (3) comprises a limiting device ( 1 1) for limiting the numerical
range of the spectra (RES, IES) of the polynomials P(z) and O(z) by mul
tiplying the polynomials P(z) and Q(z) or one or more polynomials derived
from the polynomials P(z) and Q(z) with a filter polynomial B(z), wherein
the filter polynomial B(z) is symmetric and does not have any roots on a
unit circle.
19 . Information encoder according to one of the claims 6 to 18 , wherein the
converter (3) comprises a limiting device ( 1 ) for limiting the numerical
range of the spectra (RES, IES) of the elongated polynomials P (z) and
Qe(z) or one or more polynomials derived from the elongated polynomials
Pe(z) and Qe(z) by multiplying the elongated polynomials Pe(z) and Q (z)
with a filter polynomial B(z), wherein the filter polynomial B(z) is symmet
ric and does not have any roots on a unit circle.
20. Method for operating an information encoder (1) for encoding an infor
mation signal (IS), the method comprises the steps of:
analyzing the information signal (IS) in order to obtain linear prediction
coefficients of a predictive polynomial A(z);
converting the linear prediction coefficients of the predictive polynomial
A(z) to frequency values (fi. ..f ) of a spectral frequency representation
(RES, IES) of the predictive polynomial A(z), wherein the frequency val¬
ues (f ..f n) are determined by analyzing a pair of polynomials P(z) and
Q(z) being defined as
P(z) = A(z) + z~ ~ A(z " 1 ) and
Q(z) = A(z) - - m- A(z " 1) ,
wherein m is an order of the predictive polynomial A(z) and is greater or
equal to zero, wherein the frequency values (fi ...fn) e obtained by establishing
a strictly real spectrum (RES) derived from P(z) and a strictly
imaginary spectrum (!ES) from Q(z) and by identifying zeros of the strictly
rea spectrum (RES) derived from P(z) and the strictly imaginary spectrum
(IES) derived from Q (z);
obtaining quantized frequency (fq i . . .f n) values from the frequency values
producing a bitstream (BS) comprising the quantized frequency values
2 1.Computer program for, when running on a processor, executing the
method according to the preceding claim.
| # | Name | Date |
|---|---|---|
| 1 | Form 5 [30-08-2016(online)].pdf | 2016-08-30 |
| 2 | Form 3 [30-08-2016(online)].pdf | 2016-08-30 |
| 3 | Form 18 [30-08-2016(online)].pdf_14.pdf | 2016-08-30 |
| 4 | Form 18 [30-08-2016(online)].pdf | 2016-08-30 |
| 5 | Drawing [30-08-2016(online)].pdf | 2016-08-30 |
| 6 | Description(Complete) [30-08-2016(online)].pdf | 2016-08-30 |
| 7 | 201617029566.pdf | 2016-09-21 |
| 8 | Form 26 [04-10-2016(online)].pdf | 2016-10-04 |
| 9 | 201617029566-Power of Attorney-061016.pdf | 2016-10-09 |
| 10 | 201617029566-Correspondence-061016.pdf | 2016-10-09 |
| 11 | Other Patent Document [20-02-2017(online)].pdf | 2017-02-20 |
| 12 | 201617029566-OTHERS-230217.pdf | 2017-02-25 |
| 13 | 201617029566-Correspondence-230217.pdf | 2017-02-25 |
| 14 | Form 3 [01-03-2017(online)].pdf | 2017-03-01 |
| 15 | Form 3 [07-07-2017(online)].pdf | 2017-07-07 |
| 16 | 201617029566-FORM 3 [08-01-2018(online)].pdf | 2018-01-08 |
| 17 | 201617029566-FORM 3 [21-01-2019(online)].pdf | 2019-01-21 |
| 18 | 201617029566-FORM 3 [09-07-2019(online)].pdf | 2019-07-09 |
| 19 | 201617029566-FER.pdf | 2019-11-28 |
| 20 | 201617029566-PETITION UNDER RULE 137 [19-03-2020(online)].pdf | 2020-03-19 |
| 21 | 201617029566-FORM 3 [19-03-2020(online)].pdf | 2020-03-19 |
| 22 | 201617029566-FORM 4(ii) [19-05-2020(online)].pdf | 2020-05-19 |
| 23 | 201617029566-OTHERS [28-08-2020(online)].pdf | 2020-08-28 |
| 24 | 201617029566-FER_SER_REPLY [28-08-2020(online)].pdf | 2020-08-28 |
| 25 | 201617029566-DRAWING [28-08-2020(online)].pdf | 2020-08-28 |
| 26 | 201617029566-COMPLETE SPECIFICATION [28-08-2020(online)].pdf | 2020-08-28 |
| 27 | 201617029566-CLAIMS [28-08-2020(online)].pdf | 2020-08-28 |
| 28 | 201617029566-ABSTRACT [28-08-2020(online)].pdf | 2020-08-28 |
| 29 | 201617029566-FORM 3 [07-09-2020(online)].pdf | 2020-09-07 |
| 30 | 201617029566-PETITION UNDER RULE 137 [28-10-2020(online)].pdf | 2020-10-28 |
| 31 | 201617029566-FORM 3 [28-10-2020(online)].pdf | 2020-10-28 |
| 32 | 201617029566-Correspondence to notify the Controller [28-10-2020(online)].pdf | 2020-10-28 |
| 33 | 201617029566-FORM-26 [17-11-2020(online)].pdf | 2020-11-17 |
| 34 | 201617029566-Written submissions and relevant documents [02-12-2020(online)].pdf | 2020-12-02 |
| 35 | 201617029566-FORM 3 [03-03-2021(online)].pdf | 2021-03-03 |
| 36 | 201617029566-PatentCertificate15-04-2021.pdf | 2021-04-15 |
| 37 | 201617029566-IntimationOfGrant15-04-2021.pdf | 2021-04-15 |
| 38 | 201617029566-US(14)-HearingNotice-(HearingDate-18-11-2020).pdf | 2021-10-17 |
| 39 | 201617029566-RELEVANT DOCUMENTS [20-09-2023(online)].pdf | 2023-09-20 |
| 1 | SEARCH201617029566_22-11-2019.pdf |
| 2 | NPL201617029566_22-11-2019.pdf |