Specification
DESCRIPTION
MOVING PICTURE CODING METHOD, AND MOVING PICTURE DECODING METHOD
TECHNICAL FIELD
The present invention relates to a moving picture coding
method and a moving picture decoding method and, more
particularly, to a method for coding or decoding pictures
constituting a moving picture with reference to other pictures of
the moving picture.
BACKGROUND ART
Generally, in coding of pictures constituting a moving
picture, each picture is divided into plural blocks, and
compressive coding (hereinafter, also referred to simply as
"coding") of image information possessed by each picture is
carried out for every block, utilizing redundancies in the space
direction and time direction of the moving picture. As a coding
process utilizing redundancy in the space direction, there is
intra-picture coding utilizing correlation of pixel values in a
picture. As a coding process utilizing redundancy in the time
direction, there is inter-picture predictive coding utilizing
correlation of pixel values between pictures. The inter-picture
predictive coding is a process of coding a target picture to be
coded, with reference to a picture that is positioned timewise
forward the target picture (forward picture), or a picture that
is positioned timewise backward the target picture (backward
picture).
The forward picture is a picture whose display time is
earlier than that of the target picture, and it is positioned
forward the target picture on a time axis indicating the display
times of the respective pictures (hereinafter, referred to as
"display time axis"). The backward picture is a picture whose
display time is later than that of the target picture, and it is
positioned backward the target picture on the display time axis.
Further, in the following description, a picture to be referred
to in coding the target picture is called a reference picture.
In the inter-picture predictive coding, specifically, a
motion vector of the target picture with respect to the reference
picture is detected, and prediction data for image data of the
target picture is obtained by motion compensation based on the
motion vector. Then, redundancy of difference data between the
prediction data and the image data of the target picture in the
space direction of the picture is removed, thereby to perform
compressive coding for the amount of data of the target picture.
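As a minimal sketch of the motion-compensated prediction described above, the following Python fragment copies the area of a reference picture indicated by a motion vector and forms the difference data. The function names, the (dy, dx) vector layout, and the list-of-lists picture representation are assumptions made for illustration only. No bounds checking or sub-pixel interpolation is attempted.

```python
def predict_block(reference, top, left, size, mv):
    """Return the size x size area of `reference` displaced by mv = (dy, dx)."""
    dy, dx = mv
    return [row[left + dx:left + dx + size]
            for row in reference[top + dy:top + dy + size]]


def difference_data(target_block, prediction):
    """Per-pixel difference between the target block and its prediction."""
    return [[t - p for t, p in zip(t_row, p_row)]
            for t_row, p_row in zip(target_block, prediction)]


# Hypothetical 6x6 reference picture whose pixel value encodes its position.
reference = [[10 * r + c for c in range(6)] for r in range(6)]

# Target block at (top=2, left=2) that happens to match the reference
# area displaced by one row up and one column right.
target_block = [[13, 14], [23, 24]]
prediction = predict_block(reference, top=2, left=2, size=2, mv=(-1, 1))
diff = difference_data(target_block, prediction)
```

When the motion vector matches the actual motion, the difference data is all zero and only the motion vector needs to be coded for the block.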
On the other hand, as a process for decoding a coded picture,
there are intra-picture decoding corresponding to the intra-
picture coding, and inter-picture decoding corresponding to the
inter-picture coding. In the inter-picture decoding, the same
picture as a picture that is referred to in the inter-picture
coding is referred to. That is, a picture Xtg that is coded with
reference to pictures Xra and Xrb is decoded with reference to
the pictures Xra and Xrb.
Figures 43(a)-43(c) are diagrams illustrating plural
pictures constituting a moving picture.
In figure 43(a), part of plural pictures constituting one
moving picture Mpt, i.e., pictures F(k)~F(k+2n-1) [k,n:
integers], are shown. Display times t(k)~t(k+2n-1) are set on
the respective pictures F(k)~F(k+2n-1). As shown in figure
43(a), the respective pictures are successively arranged from one
having earlier display time on a display time axis X indicating
display times Tdis of the respective pictures, and these pictures
are grouped for every predetermined number (n) of pictures. Each
of these picture groups is called a GOP (Group of Pictures), and
this is a minimum unit of random access to coded data of a moving
picture. In the following description, a picture group is
sometimes abbreviated as a GOP.
For example, an (i)th picture group Gp(i) is constituted by
pictures F(k)~F(k+n-1). An (i+1)th picture group Gp(i+1) is
constituted by pictures F(n+k)~F(k+2n-1).
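The grouping of display-ordered pictures into picture groups of n pictures each can be sketched as follows. The helper name and the string labels are hypothetical, and the values of k and n are chosen only for the example.

```python
def group_pictures(pictures, n):
    """Split a display-ordered list of pictures into groups of n pictures."""
    return [pictures[i:i + n] for i in range(0, len(pictures), n)]


k, n = 4, 3
# Pictures F(k) .. F(k+2n-1) in display order.
pictures = [f"F({k + j})" for j in range(2 * n)]
gops = group_pictures(pictures, n)
# gops[0] plays the role of Gp(i), gops[1] that of Gp(i+1).
```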
Each picture is divided into plural slices each comprising
plural macroblocks. For example, a macroblock is a rectangle
area having 16 pixels in the vertical direction and 16 pixels in
the horizontal direction. Further, as shown in figure 43(b), a
picture F(k+1) is divided into plural slices SL1~SLm [m: natural
number]. A slice SL2 is constituted by plural macroblocks MB1~
MBr [r: natural number] as shown in figure 43(c).
Figure 44 is a diagram for explaining coded data of a moving
picture, illustrating a structure of a stream obtained by coding
the respective pictures constituting the moving picture.
A stream Smp is coded data corresponding to one image
sequence (e.g., one moving picture). The stream Smp is composed
of an area (common information area) Cstr wherein bit streams
corresponding to common information such as a header are arranged,
and an area (GOP area) Dgop wherein bit streams corresponding to
the respective GOPs are arranged. The common information area
Cstr includes sync data Sstr and a header Hstr corresponding to
the stream. The GOP area Dgop includes bit streams Bg(1)~Bg(i-
1), Bg(i), Bg(i+1)~Bg(I) corresponding to picture groups (GOP)
Gp(1)~Gp(i-1), Gp(i), Gp(i+1)~Gp(I) [i,I: integers].
Each bit stream corresponding to each GOP is composed of an
area (common information area) Cgop wherein bit streams
corresponding to common information such as a header are arranged,
and an area (picture area) Dpct wherein bit streams corresponding
to the respective pictures are arranged. The common information
area Cgop includes sync data Sgop and a header Hgop corresponding
to the GOP. A picture area Dpct of the bit stream Bg(i)
corresponding to the picture group Gp(i) includes bit streams
Bf(k'), Bf(k'+1), Bf(k'+2), Bf(k'+3), ..., Bf(k'+s) corresponding
to pictures F(k'), F(k'+1), F(k'+2), F(k'+3), ..., F(k'+s) [k',s:
integers]. The pictures F(k'), F(k'+1), F(k'+2), F(k'+3), ...,
F(k'+s) are obtained by rearranging, in coding order, the
pictures F(k)~F(k+n-1) arranged in order of display times.
Each bit stream corresponding to each picture is composed of
an area (common information area) Cpct wherein bit streams
corresponding to common information such as a header are arranged,
and an area (slice area) Dslc wherein bit streams corresponding
to the respective slices are arranged. The common information
area Cpct includes sync data Spct and a header Hpct corresponding
to the picture. For example, when the picture F(k'+1) in the
arrangement in order of coding times (coding order arrangement)
is the picture F(k+1) in the arrangement in order of display
times (display order arrangement), the slice area Dslc in the bit
stream Bf(k'+1) corresponding to the picture F(k'+1) includes bit
streams Bs1~Bsm corresponding to the respective slices SL1~SLm.
Each bit stream corresponding to each slice is composed of
an area (common information area) Cslc wherein bit streams
corresponding to common information such as a header are arranged,
and an area (macroblock area) Dmb wherein bit streams
corresponding to the respective macroblocks are arranged. The
common information area Cslc includes sync data Sslc and a header
Hslc corresponding to the slice. For example, when the picture
F(k'+1) in the coding order arrangement is the picture F(k+1) in
the display order arrangement, the macroblock area Dmb in the bit
stream Bs2 corresponding to the slice SL2 includes bit streams
Bm1~Bmr corresponding to the respective macroblocks MB1~MBr.
As described above, coded data corresponding to one moving
picture (i.e., one image sequence) has a hierarchical structure
comprising a stream layer corresponding to a stream Smp as the
coded data, GOP layers corresponding to GOPs constituting the
stream, picture layers corresponding to pictures constituting
each of the GOPs, and slice layers corresponding to slices
constituting each of the pictures.
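The layered structure described above (stream, GOP, picture, slice, each with sync data, a header, and a payload area) can be modelled with nested records. The following is only an illustrative sketch: the class names, the field names, and the use of raw bytes for headers and macroblock bit streams are assumptions and do not reflect any actual bit stream syntax.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SliceData:
    sync: bytes      # Sslc
    header: bytes    # Hslc
    macroblocks: List[bytes] = field(default_factory=list)  # Dmb: Bm1..Bmr


@dataclass
class PictureData:
    sync: bytes      # Spct
    header: bytes    # Hpct
    slices: List[SliceData] = field(default_factory=list)   # Dslc: Bs1..Bsm


@dataclass
class GopData:
    sync: bytes      # Sgop
    header: bytes    # Hgop
    pictures: List[PictureData] = field(default_factory=list)  # Dpct


@dataclass
class StreamData:
    sync: bytes      # Sstr
    header: bytes    # Hstr
    gops: List[GopData] = field(default_factory=list)       # Dgop


def count_macroblocks(stream: StreamData) -> int:
    """Walk all layers and count the macroblock bit streams."""
    return sum(len(s.macroblocks)
               for g in stream.gops
               for p in g.pictures
               for s in p.slices)


slice1 = SliceData(b"\x01", b"h", [b"m1", b"m2", b"m3"])
slice2 = SliceData(b"\x01", b"h", [b"m1", b"m2"])
picture = PictureData(b"\x02", b"h", [slice1, slice2])
stream = StreamData(b"\x04", b"h", [GopData(b"\x03", b"h", [picture])])
total = count_macroblocks(stream)
```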
By the way, in moving picture coding methods such as MPEG
(Moving Picture Experts Group)-1, MPEG-2, MPEG-4, ITU-T
recommendation H.263, H.26L, and the like, a picture to be
subjected to intra-picture coding is called an I picture, and a
picture to be subjected to inter-picture predictive coding is
called a P picture or a B picture.
Hereinafter, definitions of an I picture, a P picture, and a
B picture will be described.
An I picture is a picture to be coded without referring to
another picture. A P picture or B picture is a picture to be
coded with reference to another picture. To be exact, a P
picture is a picture for which either I mode coding or P mode
coding can be selected when coding each block in the picture. A
B picture is a picture for which one of I mode coding, P mode
coding, and B mode coding can be selected when coding each block
in the picture.
The I mode coding is a process of performing intra-picture
coding for a target block in a target picture without referring
to another picture. The P mode coding is a process of performing
inter-picture predictive coding for a target block in a target
picture with reference to an already-coded picture. The B mode
coding is a process of performing inter-picture predictive coding
for a target block in a target picture with reference to two
already-coded pictures.
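The relationship between picture types and the block coding modes selectable within them can be summarised in a small lookup table. This is a sketch for illustration; the table and function names are hypothetical.

```python
# Block coding modes that may be selected for each picture type.
ALLOWED_BLOCK_MODES = {
    "I": {"I"},
    "P": {"I", "P"},
    "B": {"I", "P", "B"},
}

# Number of already-coded pictures referred to by each block coding mode.
REFERENCE_COUNT = {"I": 0, "P": 1, "B": 2}


def check_block_mode(picture_type, block_mode):
    """Return the number of reference pictures, or raise if the mode is not allowed."""
    if block_mode not in ALLOWED_BLOCK_MODES[picture_type]:
        raise ValueError(
            f"{block_mode} mode coding is not selectable in a {picture_type} picture")
    return REFERENCE_COUNT[block_mode]
```

For example, `check_block_mode("B", "B")` yields two reference pictures, whereas `check_block_mode("P", "B")` is rejected.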
A picture to be referred to during the P mode coding or B
mode coding is an I picture or a P picture other than the target
picture, and it may be either a forward picture positioned
forward the target picture or a backward picture positioned
backward the target picture.
However, there are three ways of combining two pictures to
be referred to during the B mode coding. That is, there are
three cases of B mode coding as follows: a case where two forward
pictures are referred to, a case where two backward pictures are
referred to, and a case where one forward picture and one
backward picture are referred to.
Figure 45 is a diagram for explaining a moving picture
coding method such as MPEG described above. Figure 45
illustrates relationships between target pictures and the
corresponding reference pictures (pictures to be referred to when
coding the respective target pictures).
Coding of the respective pictures F(k)~F(k+7), ..., F(k+17)
~F(k+21) constituting the moving picture is carried out with
reference to other pictures as shown by arrows Z. To be specific,
a picture at the end of one arrow Z is coded by inter-picture
predictive coding with reference to a picture at the beginning of
the same arrow Z. In figure 45, the pictures F(k)~F(k+7), ...,
F(k+17)~F(k+21) are identical to the pictures F(k)~F(k+4), ...,
F(k+n-2)~F(k+n+4), ..., F(k+2n-2), F(k+2n-1) shown in figure
43(a). These pictures are successively arranged from one having
earlier display time on the display time axis X. The display
times of the pictures F(k)~F(k+7), ..., F(k+17)~F(k+21) are
times t(k)~t(k+7), ..., t(k+17)~t(k+21). The picture types of
the pictures F(k)~F(k+7) are I, B, B, P, B, B, P, B, and the
picture types of the pictures F(k+17)~F(k+21) are B, P, B, B, P.
For example, when performing B mode coding for the second B
picture F(k+1) shown in figure 45, the first I picture F(k) and
the fourth P picture F(k+3) are referred to. Further, when
performing P mode coding for the fourth P picture F(k+3) shown in
figure 45, the first I picture F(k) is referred to.
Although a forward picture is referred to in P mode coding
of a P picture in figure 45, a backward picture may be referred
to. Further, although a forward picture and a backward picture
are referred to in B mode coding of a B picture in figure 45, two
forward pictures or two backward pictures may be referred to.
Furthermore, in a moving picture coding method such as MPEG-
4 or H.26L, a coding mode called "direct mode" may be selected
when coding a B picture.
Figures 46(a) and 46(b) are diagrams for explaining inter-
picture predictive coding to be performed with the direct mode.
Figure 46(a) shows motion vectors to be used in the direct mode.
In figure 46(a), pictures P1, B2, B3, and P4 correspond to
the pictures F(k+3)~F(k+6) [k=-2] shown in figure 45, and times
t(1), t(2), t(3), and t(4) (t(1)
Since the bit stream analysis unit 201, the mode decoding
unit 223, and the prediction error decoding unit 202 operate in
the same way as described for decoding of the picture P13,
repeated description is not necessary.
The motion compensation decoding unit 205 generates motion
compensation data from the inputted information such as the
motion vector. The bit stream analysis unit 201 outputs the
motion vector and the reference picture index to the motion
compensation decoding unit 205. The picture B11 is obtained by
predictive coding using the pictures P7, B9 and P10 as candidate
pictures for forward reference, and the picture P13 as a
candidate picture for backward reference. When the target
picture is decoded, these reference candidate pictures have already
been decoded, and are stored in the reference picture memory 207.
Hereinafter, a description will be given of how the pictures
stored in the reference picture memory 207 change with time, and
a method for determining a reference picture, with reference to
figure 3.
The reference picture memory 207 is controlled by the memory
control unit 204 on the basis of information Ih indicating what
kind of reference has been carried out in coding P pictures and B
pictures, which information is extracted from the header
information of the bit stream.
When decoding of the picture B11 is started, pictures P13,
P4, P7, P10, and B9 are stored in the reference picture memory
207 as shown in figure 3. The picture B11 is decoded using the
pictures P7, B9, and P10 as candidate pictures for forward
reference, and the picture P13 as a backward reference picture.
The decoded picture B11 is stored in the memory area where the
picture P4 had been stored, because the picture P4 is not used as
a candidate for a reference picture when decoding the picture B11
and the following pictures.
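The replacement of a no-longer-needed picture in the reference picture memory can be sketched as follows. The list-based memory model and the function name are assumptions for illustration. In the described apparatus, the decision as to which picture is no longer needed is made by the memory control unit 204 from the header information.

```python
def store_decoded_picture(memory, decoded, unused):
    """Overwrite the memory area holding `unused` with the newly decoded picture."""
    slot = memory.index(unused)
    memory[slot] = decoded
    return slot


# Contents of the reference picture memory when decoding of B11 starts.
memory = ["P13", "P4", "P7", "P10", "B9"]

# P4 is not a reference candidate for B11 or any following picture,
# so its memory area is reused for the decoded picture B11.
slot = store_decoded_picture(memory, decoded="B11", unused="P4")
```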
In this case, which candidate picture has been referred to
in detecting the forward motion vector can be determined from the
reference picture information added to the motion vector.
To be specific, when the picture P10 has been referred to in
coding the target block of the picture B11, information
indicating that the candidate picture (picture P10) just previous
to the target picture has been used as a reference picture (e.g.,
reference picture index [0]) is described in the bit stream of
the target block. Further, when the picture B9 has been referred
to in coding the target block, information indicating that the
candidate picture which is two-pictures previous to the target
picture has been used as a reference picture (e.g., reference
picture index [1]) is described in the bit stream of the target
block. Furthermore, when the picture P7 has been referred to in
coding the target block of the picture B11, information
indicating that the candidate picture which is three-pictures
previous to the target picture has been used as a reference
picture (e.g., reference picture index [2]) is described in the
bit stream of the target block.
Accordingly, it is possible to know which one of the
candidate pictures has been used as a reference picture in coding
the target block, from the reference picture index.
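The mapping from a reference picture index to a forward reference candidate can be illustrated as follows. The sketch assumes that the candidates are kept in a list ordered from the picture closest to the target picture to the farthest one, so that index [0] denotes the just-previous candidate. The function name is hypothetical.

```python
def resolve_forward_reference(candidates_nearest_first, reference_picture_index):
    """Return the candidate picture designated by the reference picture index."""
    return candidates_nearest_first[reference_picture_index]


# Forward reference candidates for the target picture B11, nearest first.
candidates_for_b11 = ["P10", "B9", "P7"]

just_previous = resolve_forward_reference(candidates_for_b11, 0)
two_previous = resolve_forward_reference(candidates_for_b11, 1)
three_previous = resolve_forward_reference(candidates_for_b11, 2)
```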
When the selected mode is bidirectional predictive coding,
the motion compensation decoding unit 205 determines which one of
the pictures P7, B9 and P10 has been used for forward reference,
from the reference picture index. Then, the motion compensation
decoding unit 205 obtains a forward motion compensation image
from the reference picture memory 207 on the basis of the forward
motion vector, and further, it obtains a backward motion
compensation image from the reference picture memory 207 on the
basis of the backward motion vector.
Then, the motion compensation decoding unit 205 performs
addition and averaging of the forward motion compensation image
and the backward motion compensation image to generate a motion
compensation image.
Next, a process of generating a motion compensation image
using forward and backward motion vectors will be described.
(Bidirectional Prediction Mode)
Figure 17 illustrates a case where the target picture to be
decoded is the picture B11, and bidirectional predictive decoding
is performed on a block (target block) BLa01 to be decoded, in
the picture B11.
Initially, a description will be given of a case where the
forward reference picture is the picture P10, and the backward
reference picture is the picture P13.
In this case, the forward motion vector is a motion vector
MVe01 indicating an area CRe01 in the picture P10, which area
corresponds to the block BLa01. The backward motion vector is a
motion vector MVg01 indicating an area CRg01 in the picture P13,
which area corresponds to the block BLa01.
Accordingly, the motion compensation decoding unit 205
obtains an image in the area CRe01 in the picture P10 as a
forward reference image, and an image in the area CRg01 in the
picture P13 as a backward reference image, from the reference
picture memory 207, and performs addition and averaging of image
data on the images in the both areas CRe01 and CRg01 to obtain a
motion compensation image corresponding to the target block BLa01.
Next, a description will be given of a case where the
forward reference picture is the picture B9, and the backward
reference picture is the picture P13.
In this case, the forward motion vector is a motion vector
MVf01 indicating an area CRf01 in the picture B9, which area
corresponds to the block BLa01. The backward motion vector is a
motion vector MVg01 indicating an area CRg01 in the picture P13,
which area corresponds to the block BLa01.
Accordingly, the motion compensation decoding unit 205
obtains an image in the area CRf01 in the picture B9 as a forward
reference image, and an image in the area CRg01 in the picture
P13 as a backward reference image, from the reference picture
memory 207, and performs addition and averaging of image data for
the images in the both areas CRf01 and CRg01 to obtain a motion
compensation image corresponding to the target block BLa01.
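The addition and averaging of a forward reference image and a backward reference image can be sketched as below. The text does not state how an odd sum is rounded, so the round-half-up behaviour used here is an assumption, as are the function name and the list-of-lists image representation.

```python
def average_reference_images(forward_image, backward_image):
    """Add the two reference images pixel by pixel and halve the sum.

    Rounding half up is an assumption; the description only says
    "addition and averaging".
    """
    return [[(f + b + 1) // 2 for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(forward_image, backward_image)]


# Hypothetical 2x2 images taken from the areas CRe01 (in P10) and CRg01 (in P13).
forward_image = [[10, 20], [30, 40]]
backward_image = [[12, 21], [30, 44]]
motion_compensation_image = average_reference_images(forward_image, backward_image)
```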
(Direct Mode)
Further, when the coding mode is the direct mode, the motion
compensation decoding unit 205 obtains a motion vector (base
motion vector) of a block that is included in the backward
reference picture P13 for the target picture B11 and is placed
relatively in the same position as the target block, which motion
vector is stored in the motion vector storage unit 226. The
motion compensation decoding unit 205 obtains a forward reference
image and a backward reference image from the reference picture
memory 207 by using the base motion vector. Then, the motion
compensation decoding unit 205 performs addition and averaging of
image data for the forward reference image and the backward
reference image, thereby generating a motion compensation image
corresponding to the target block. In the following description,
a block in a picture, whose relative position with respect to a
picture is equal to that of a specific block in another picture
is simply referred to as a block which is located in the same
position as a specific block in a picture.
Figure 18(a) shows a case where the block BLa10 in the
picture B11 is decoded in the direct mode with reference to the
picture P10 that is just previous to the picture B11 (first
example of direct mode decoding).
A base motion vector to be used for direct mode decoding of
the block BLa10 is a forward motion vector (base motion vector)
MVh10 of a block (base block) BLg10 located in the same position
as the block BLa10, which block BLg10 is included in the picture
(base picture) P13 that is backward referred to when decoding the
block BLa10. The forward motion vector MVh10 indicates an area
CRh10 corresponding to the base block BLg10, in the picture P10
that is just previous to the picture B11.
In this case, as a forward motion vector MVk10 of the target
block BLa10 to be decoded, a motion vector which is parallel to
the base motion vector MVh10 and indicates an area CRk10 included
in the picture P10 and corresponding to the target block BLa10,
is employed. Further, as a backward motion vector MVi10 of the
target block BLa10 to be decoded, a motion vector which is
parallel to the base motion vector MVh10 and indicates an area
CRi10 included in the picture P13 and corresponding to the target
block BLa10, is employed.
Accordingly, the motion compensation decoding unit 205
obtains an image in the area CRk10 of the forward reference
picture P10 as a forward reference image, and an image in the
area CRi10 of the backward reference picture P13 as a backward
reference image, from the reference picture memory 207, and
performs addition and averaging of image data of the both images
to obtain a motion compensation image (prediction image)
corresponding to the target block BLa10.
In this case, the magnitude (MVF) of the forward motion
vector MVk10 and the magnitude (MVB) of the backward motion
vector MVi10 are obtained by the above-described formulae (1) and
(2), using the magnitude (MVR) of the base motion vector MVh10.
The magnitudes MVF and MVB of the respective motion vectors
show the horizontal component and vertical component of the
motion vector, respectively.
Further, TRD indicates a time-basis distance between the
backward reference picture P13 for the target block BLa10 in the
picture B11, and the picture P10 which is forward referred to
when decoding the block (base block) BLg10 in the backward
reference picture (base picture) P13. Further, TRF is the time-
basis distance between the target picture B11 and the just-
previous reference picture P10, and TRB is the time-basis
distance between the target picture B11 and the picture P10 which
is referred to when decoding the block BLg10 in the backward
reference picture P13.
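The derivation of the forward and backward motion vectors from the base motion vector can be sketched as a scaling by time-basis distances. Formulae (1) and (2) are not reproduced in this part of the description, so the scaling used below, MVF = MVR × TRF / TRD and MVB = (TRB − TRD) × MVR / TRD, is an assumption based on the statement that both vectors are parallel to the base motion vector. The function name, the (horizontal, vertical) tuple layout, and the numeric distances are likewise illustrative, and rounding to the motion vector precision is not addressed.

```python
def direct_mode_vectors(mvr, trd, trf, trb):
    """Scale the base motion vector mvr = (horizontal, vertical).

    Assumed scaling (not confirmed by this section):
        MVF = MVR * TRF / TRD
        MVB = (TRB - TRD) * MVR / TRD
    """
    mvf = tuple(component * trf / trd for component in mvr)
    mvb = tuple(component * (trb - trd) / trd for component in mvr)
    return mvf, mvb


# Illustrative distances for the first example, counting one unit per picture
# in display order P10, B11, B12, P13:
#   TRD = distance P13 -> P10 = 3
#   TRF = distance B11 -> P10 = 1
#   TRB = distance B11 -> P10 = 1
mvf, mvb = direct_mode_vectors(mvr=(6, -3), trd=3, trf=1, trb=1)
```

Under these assumptions the forward vector points the same way as the base motion vector and the backward vector points the opposite way, which is consistent with both being parallel to it.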
Figure 18(b) shows a case where a block BLa20 in the picture B11 is decoded in the direct mode with reference to the picture
P10 that is just previous to the picture B11 (second example of
direct mode decoding).
In this second example of direct mode decoding, in contrast
with the first example of direct mode decoding shown in figure
18(a), a picture which is forward referred to in decoding the
base block (i.e., a block placed in the same position as the
target block, in the backward reference picture for the target
block) is the picture P7.
That is, a base motion vector to be used for direct mode
decoding of the block BLa20 is a forward motion vector MVh20 of a
block BLg20 located in the same position as the block BLa20,
which block BLg20 is included in the picture P13 that is backward
referred to when decoding the block BLa20. The forward motion
vector MVh20 indicates an area CRh20 corresponding to the base
block BLg20, in the picture P7 that is positioned forward the
target picture B11.
In this case, as a forward motion vector MVk20 of the target
block BLa20 to be decoded, a motion vector, which is parallel to
the base motion vector MVh20 and indicates an area CRk20 included
in the picture P10 and corresponding to the target block BLa20,
is employed. Further, as a backward motion vector MVi20 of the
target block BLa20 to be decoded, a motion vector, which is
parallel to the base motion vector MVh20 and indicates an area
CRi20 included in the picture P13 and corresponding to the target
block BLa20, is employed.
Accordingly, the motion compensation decoding unit 205
obtains an image in the area CRk20 of the forward reference
picture P10 as a forward reference image, and an image in the
area CRi20 of the backward reference picture P13 as a backward
reference image, from the reference picture memory 207, and
performs addition and averaging of image data of the both images
to obtain a motion compensation image (prediction image)
corresponding to the target block BLa20.
In this case, the magnitude (MVF) of the forward motion
vector MVk20 and the magnitude (MVB) of the backward motion
vector MVi20 are obtained by the above-described formulae (1) and
(2), using the magnitude (MVR) of the base motion vector MVh20,
as described for the first example of direct mode decoding.
Figure 19(a) shows a case where a block BLa30 in the picture B11 is decoded in the direct mode with reference to the picture
P7 which is positioned forward the picture P10 that is positioned
just previous to the picture B11 (third example of direct mode
decoding).
In this third example of direct mode decoding, in contrast
with the first and second examples of direct mode decoding shown in
figures 18(a) and 18(b), a picture to be forward referred to in
decoding the target block is not a picture just previous to the
target picture, but a picture that is forward referred to in
decoding the base block (a block in the same position as the
target block) in the base picture. The base picture is a picture
that is backward referred to in decoding the target block.
That is, a base motion vector to be used in direct mode
decoding of the block BLa30 is a forward motion vector MVh30 of a
block BLg30 located in the same position as the block BLa30,
which block BLg30 is included in the picture P13 that is backward
referred to in decoding the block BLa30. The forward motion
vector MVh30 indicates an area CRh30 corresponding to the base
block BLg30, in the picture P7 that is positioned forward the
target picture B11.
In this case, as a forward motion vector MVk30 of the target
block BLa30 to be decoded, a motion vector, which is parallel to
the base motion vector MVh30 and indicates an area CRk30 included
in the picture P7 and corresponding to the target block BLa30, is
employed. Further, as a backward motion vector MVi30 of the
target block BLa30 to be decoded, a motion vector, which is
parallel to the base motion vector MVh30 and indicates an area
CRi30 included in the picture P13 and corresponding to the target
block BLa30, is employed.
Accordingly, the motion compensation decoding unit 205
obtains an image in the area CRk30 of the forward reference
picture P7 as a forward reference image, and an image in the area
CRi30 of the backward reference picture P13 as a backward
reference image, from the reference picture memory 207, and
performs addition and averaging of image data of the both images
to obtain a motion compensation image (prediction image)
corresponding to the target block BLa30.
In this case, the magnitude (MVF) of the forward motion
vector MVk30 and the magnitude (MVB) of the backward motion
vector MVi30 are obtained by the above-described formulae (2) and
(3), using the magnitude (MVR) of the base motion vector MVh30.
When the picture to be referred to in decoding the block
BLg30 has already been deleted from the reference picture memory
207, the forward reference picture P10 that is timewise closest
to the target picture is used as a forward reference picture in
the third example of direct mode decoding. In this case, the
third example of direct mode decoding is identical to the first
example of direct mode decoding.
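The choice of the forward reference picture in the third example, including the fallback when the picture referred to by the base block is no longer held in the reference picture memory, can be sketched as follows. The function and parameter names are hypothetical, and the memory is modelled simply as a set of picture labels.

```python
def select_forward_reference(base_block_reference, stored_pictures, nearest_forward):
    """Use the base block's reference picture if it is still stored.

    Otherwise fall back to the forward reference picture that is
    timewise closest to the target picture.
    """
    if base_block_reference in stored_pictures:
        return base_block_reference
    return nearest_forward


# P7 is still in the memory: it is used as the forward reference picture.
kept = select_forward_reference("P7", {"P13", "P7", "P10", "B9"}, "P10")

# P7 has been deleted: P10 is used, as in the first example.
fallback = select_forward_reference("P7", {"P13", "P10", "B9"}, "P10")
```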
Figure 19(b) shows a case where a block BLa40 in the picture B11 is decoded in the direct mode by using a motion vector whose
magnitude is zero (fourth example of direct mode decoding).
In this fourth example of direct mode decoding, the
magnitude of the base motion vector employed in the first
and second examples shown in figures 18(a) and 18(b) is zero.
In this case, as a forward motion vector MVk40 and a
backward motion vector MVi40 of the block BLa40 to be decoded, a
motion vector whose magnitude is zero is employed.
That is, the forward motion vector MVk40 indicates an area
(block) CRk40 of the same size as the target block, which area is
included in the picture P10 and placed at the same position as
the target block BLa40. Further, the backward motion vector
MVi40 indicates an area (block) CRi40 of the same size as the
target block, which area is included in the picture P13 and
placed at the same position as the target block BLa40.
Accordingly, the motion compensation decoding unit 205
obtains an image in the area (block) CRk40 of the forward
reference picture P10 as a forward reference image, and an image
in the area (block) CRi40 of the backward reference picture P13
as a backward reference image, from the reference picture memory
207, and performs addition and averaging of image data of the
both images to obtain a motion compensation image (prediction
image) corresponding to the target block BLa40. This method is
applicable to, for example, a case where a block which is
included in the picture P13 as a backward reference picture of
the picture B11 and is located in the same position as the block
BLa40 is a block having no motion vector like an intra-frame-
coded block.
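With motion vectors of zero magnitude, the two reference images are simply the blocks located in the same position as the target block. A minimal sketch follows, in which the function names, the picture contents, and the round-half-up averaging are assumptions for illustration.

```python
def colocated_block(picture, top, left, size):
    """Return the size x size block at the same position as the target block."""
    return [row[left:left + size] for row in picture[top:top + size]]


def zero_motion_prediction(forward_picture, backward_picture, top, left, size):
    """Average the co-located blocks of the forward and backward reference pictures."""
    fwd = colocated_block(forward_picture, top, left, size)
    bwd = colocated_block(backward_picture, top, left, size)
    return [[(f + b + 1) // 2 for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(fwd, bwd)]


# Hypothetical 4x4 pictures standing in for P10 and P13.
p10 = [[4 * r + c for c in range(4)] for r in range(4)]
p13 = [[value + 2 for value in row] for row in p10]

prediction = zero_motion_prediction(p10, p13, top=1, left=2, size=2)
```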
The data of the motion compensation image thus generated is
output to the addition unit 208. The addition unit 208 adds the
inputted prediction error data and the motion compensation image
data to generate decoded image data. The decoded image data so
generated is output through the switch 210 to the reference
picture memory 207, and the decoded image is stored in the
reference picture memory 207.
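The operation of the addition unit can be illustrated by a pixel-wise sum of the prediction error data and the motion compensation image data. The function name and sample values are hypothetical. Clipping to a valid pixel range is not mentioned in the description and is therefore omitted.

```python
def add_prediction_error(prediction_error, motion_compensation_image):
    """Add prediction error data to motion compensation image data."""
    return [[e + m for e, m in zip(e_row, m_row)]
            for e_row, m_row in zip(prediction_error, motion_compensation_image)]


prediction_error = [[1, -2], [0, 3]]
motion_compensation_image = [[100, 100], [50, 50]]
decoded_block = add_prediction_error(prediction_error, motion_compensation_image)
```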
The memory control unit 204 controls the reference picture
memory 207 on the basis of the header information Ih indicating
what kind of reference has been carried out in coding the P
pictures and B pictures extracted from the header information of
the bit stream.
As described above, the blocks in the picture B11 are
successively decoded. When all of the blocks in the picture B11
have been decoded, decoding of the picture B12 takes place.
In the B picture decoding described above, a specific block
is sometimes treated as a skip block. Hereinafter, decoding of a
skip block will be briefly described.
When it is found, from a skip identifier or block number
information described in the bit stream, that a specific block is
treated as a skip block during decoding of an inputted bit stream,
motion compensation, i.e., acquisition of a prediction image
corresponding to a target block, is carried out in the direct mode.
For example, as shown in figure 6(b), when the blocks
MB(r+1) and MB(r+2) between the block MB(r) and the block MB(r+3)
in the picture B11 are treated as skip blocks, the bit stream
analysis unit 201 detects the skip identifier Sf from the bit
stream Bs. When the skip identifier Sf is input to the mode
decoding unit 223, the mode decoding unit 223 instructs the
motion compensation decoding unit 205 to perform motion
compensation in the direct mode.
Then, the motion compensation decoding unit 205 obtains the
prediction images of the blocks MB(r+1) and MB(r+2), on the basis
of an image (forward reference image) of a block which is
included in the forward reference picture P10 and placed in the
same position as the block treated as a skip block, and an image
(backward reference image) of a block in the same position as the
block treated as a skip block, and then outputs the data of the
prediction images to the addition unit 208. The prediction error
decoding unit 202 outputs data whose value is zero, as difference
data of the blocks treated as skip blocks. In the addition unit
208, since the difference data of the blocks treated as skip
blocks is zero, the data of the prediction images of the blocks
MB(r+1) and MB(r+2) are output to the reference picture memory
207 as decoded images of the blocks MB(r+1) and MB(r+2).
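Decoding of skip blocks can be sketched in two steps: identifying which blocks were skipped, and outputting the prediction image unchanged because the difference data is zero. Inferring the skipped blocks from the block numbers of the neighbouring coded blocks is an assumption about how the block number information might be used; the function names are hypothetical.

```python
def skipped_block_numbers(previous_coded, next_coded):
    """Block numbers lying strictly between two consecutively coded blocks."""
    return list(range(previous_coded + 1, next_coded))


def decode_skip_block(prediction_image):
    """Add zero-valued difference data to the prediction image."""
    zero_difference = [[0] * len(row) for row in prediction_image]
    return [[p + d for p, d in zip(p_row, d_row)]
            for p_row, d_row in zip(prediction_image, zero_difference)]


r = 7
# MB(r) and MB(r+3) are coded, so MB(r+1) and MB(r+2) are skip blocks.
skipped = skipped_block_numbers(r, r + 3)
decoded = decode_skip_block([[5, 6], [7, 8]])
```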
Furthermore, in the direct mode processing shown in figure
18(a) (first example), the direct mode processing shown in figure
18(b) (second example), and the direct mode processing shown
in figure 19(a) (third example), not all blocks whose difference
data become zero are necessarily treated as skip blocks.
That is, a target block is subjected to bidirectional prediction
using a picture that is positioned just previous to the target
picture as a forward reference picture, and a motion vector whose
magnitude is zero, and only when the difference data of the
target block becomes zero, this target block may be treated as a
skip block.
In this case, when it is found, from the skip identifier or
the like in the bit stream Bs, that a specific block is treated
as a skip block, motion compensation should be carried out by
bidirectional prediction whose motion is zero, using a just-
previous reference picture as a forward reference picture.
(Decoding Process for Picture B12)
Since the bit stream analysis unit 201, the mode decoding
unit 223, and the prediction error decoding unit 202 operate in
the same way as described for decoding of the picture P10,
repeated description is not necessary.
The motion compensation decoding unit 205 generates motion
compensation image data from the inputted information such as the
motion vector. The motion vector MV and the reference picture
index Rp are input to the motion compensation decoding unit 205.
The picture B12 has been coded using the pictures P7, P10 and B11
as candidate pictures for forward reference, and the picture P13
as a candidate picture for backward reference. When the target
picture is decoded, these candidate pictures have already been
decoded, and are stored in the reference picture memory 207.
The timewise change of the pictures stored in the reference
picture memory 207, and the method for determining a reference
picture are identical to those in the case of decoding the
picture B11 described with respect to figure 3.
When the coding mode is bidirectional predictive coding, the
motion compensation decoding unit 205 determines which one of the
pictures P7, P10 and B11 has been used for forward reference,
from the reference picture index. Then, the motion compensation
decoding unit 205 obtains a forward reference image from the
reference picture memory 207 on the basis of the forward motion
vector, and further, it obtains a backward reference image from
the reference picture memory 207 on the basis of the backward
motion vector. Then, the motion compensation decoding unit 205
performs addition and averaging of image data of the forward
reference image and the backward reference image to generate a
motion compensation image corresponding to the target block.
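The addition and averaging described above can be illustrated by the following minimal sketch in Python. The function name, the representation of a block as a list of rows of integer pixel values, and the round-half-up rule are assumptions made for illustration; the specification does not state a rounding rule.

```python
def average_blocks(forward, backward):
    """Average two equally sized blocks of pixel values, pixel by pixel."""
    if len(forward) != len(backward):
        raise ValueError("blocks must have the same height")
    result = []
    for f_row, b_row in zip(forward, backward):
        if len(f_row) != len(b_row):
            raise ValueError("blocks must have the same width")
        # (f + b + 1) // 2 rounds halves upward; this rounding rule is an
        # assumption of the sketch, not taken from the specification.
        result.append([(f + b + 1) // 2 for f, b in zip(f_row, b_row)])
    return result
```

Here `forward` would hold the forward reference image and `backward` the backward reference image obtained from the reference picture memory 207.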
(Bidirectional Prediction Mode)
Figure 20 illustrates a case where the target picture to be
decoded is the picture B12, and bidirectional predictive decoding
is performed for a block (target block) BLa02 to be decoded, in
the picture B12.
Initially, a description will be given of a case where the
forward reference picture is the picture B11, and the backward
reference picture is the picture P13.
In this case, the forward motion vector is a motion vector
MVe02 indicating an area CRe02 in the picture B11 corresponding
to the block BLa02. The backward motion vector is a motion
vector MVg02 indicating an area CRg02 in the picture P13
corresponding to the block BLa02.
Accordingly, the motion compensation decoding unit 205
obtains an image in the area CRe02 in the picture B11 as a
forward reference image, and an image in the area CRg02 in the
picture P13 as a backward reference image, from the reference
picture memory 207, and performs addition and averaging of image
data of the images in both areas CRe02 and CRg02 to obtain a
motion compensation image corresponding to the target block BLa02.
Next, a description will be given of a case where the
forward reference picture is the picture P10, and the backward
reference picture is the picture P13.
In this case, the forward motion vector is a motion vector
MVf02 indicating an area CRf02 in the picture P10, corresponding
to the block BLa02. The backward motion vector is a motion
vector MVg02 indicating an area CRg02 in the picture P13,
corresponding to the block BLa02.
Accordingly, the motion compensation decoding unit 205
obtains an image in the area CRf02 in the picture P10 as a
forward reference image and an image in the area CRg02 in the
picture P13 as a backward reference image from the reference
picture memory 207, and performs addition and averaging of image
data of the images in both areas CRf02 and CRg02 to obtain a
motion compensation image corresponding to the target block BLa02.
(Direct Mode)
Further, when the coding mode is the direct mode, the motion
compensation decoding unit 205 obtains a motion vector (base
motion vector) of a reference block (a block whose relative
position is the same as that of the target block) in the
backward reference picture P13 for the target picture B12, which
motion vector is stored in the motion vector storage unit 226.
The motion compensation decoding unit 205 obtains a forward
reference image and a backward reference image from the reference
picture memory 207 by using the base motion vector. Then, the
motion compensation decoding unit 205 performs addition and
averaging of image data of the forward reference image and the
backward reference image, thereby generating a motion
compensation image corresponding to the target block.
Figure 21(a) shows a case where the block BLa50 in the
picture B12 is decoded in the direct mode with reference to the
picture B11 that is just previous to the picture B12 (first
example of direct mode decoding).
A base motion vector to be used for direct mode decoding of
the block BLa50 is a forward motion vector MVj50 of the base
block (i.e., the block BLg50 placed in the same position as the
block BLa50) in the picture P13 that is backward referred to when
decoding the block BLa50. The forward motion vector MVj50
indicates an area CRj50 corresponding to the base block BLg50 in
the picture P10 that is positioned forward and close to the
picture B11.
In this case, as a forward motion vector MVk50 of the target
block BLa50 to be decoded, a motion vector which is parallel to
the base motion vector MVj50 and indicates an area CRk50 included
in the picture B11 and corresponding to the target block BLa50,
is employed. Further, as a backward motion vector MVi50 of the
target block BLa50 to be decoded, a motion vector which is
parallel to the base motion vector MVj50 and indicates an area
CRi50 included in the picture P13 and corresponding to the target
block BLa50, is employed.
Accordingly, the motion compensation decoding unit 205
obtains an image in the area CRk50 of the forward reference
picture B11 as a forward reference image and an image in the area
CRi50 of the backward reference picture P13 as a backward
reference image from the reference picture memory 207, and
performs addition and averaging of image data of both images
to obtain a motion compensation image (prediction image)
corresponding to the target block BLa50.
In this case, the magnitude (MVF) of the forward motion
vector MVk50 and the magnitude (MVB) of the backward motion
vector MVi50 are obtained by the above-described formulae (1) and
(2) using the magnitude (MVR) of the base motion vector MVj50.
The magnitudes MVF and MVB of the respective motion vectors
show the horizontal component and vertical component of the
motion vector, respectively.
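As an illustration of motion vectors that are parallel to the base motion vector, the following sketch scales the base motion vector by a ratio of display-time differences. Formulae (1) and (2) are given elsewhere in the specification and are not reproduced in this passage, so the scaling rule, the function name, the use of picture numbers as display times, and the sample vector are assumptions of this sketch rather than a restatement of those formulae.

```python
from fractions import Fraction


def scale_parallel_vector(base_mv, t_base, t_base_ref, t_target, t_ref):
    """Scale base_mv so that it spans the interval from t_target to t_ref.

    base_mv is the (horizontal, vertical) vector of the base block, which
    spans the interval from t_base (the base picture) to t_base_ref (the
    picture referred to by the base block). All times are display times.
    """
    if t_base_ref == t_base:
        raise ValueError("base vector must span a non-zero time interval")
    ratio = Fraction(t_ref - t_target, t_base_ref - t_base)
    # Each component is scaled by the same ratio, so the result is parallel
    # to the base motion vector.
    return tuple(component * ratio for component in base_mv)


# First example: target picture B12, base block in P13 referring to P10.
base_mv = (6, -3)  # hypothetical base motion vector
forward_mv = scale_parallel_vector(base_mv, 13, 10, 12, 11)   # toward B11
backward_mv = scale_parallel_vector(base_mv, 13, 10, 12, 13)  # toward P13
```

With these hypothetical values, the forward vector is one third of the base vector and the backward vector is its negative, because the picture B12 lies one picture interval from both B11 and P13 while the base vector spans three intervals.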
Figure 21(b) shows a case where a block BLa60 in the picture
B12 is decoded in the direct mode with reference to the picture
B11 that is positioned forward the picture B12 (second example of
direct mode decoding).
In this second example of direct mode decoding, in contrast
with the first example of direct mode decoding shown in figure
21(a), a picture which is forward referred to in decoding the
base block (i.e., a block placed in the same position as the
target block, in the backward reference picture for the target
block) is the picture P7.
That is, a base motion vector to be used for direct mode
decoding of the block BLa60 is a forward motion vector MVj60 of
the reference block (the block BLg60 in the same position as the
block BLa60) in the picture P13 that is backward referred to when
decoding the block BLa60. The forward motion vector MVj60
indicates an area CRj60 corresponding to the base block BLg60, in
the picture P7 that is positioned forward the target picture B12.
In this case, as a forward motion vector MVk60 of the target
block BLa60 to be decoded, a motion vector, which is parallel to
the base motion vector MVj60 and indicates an area CRk60 included
in the picture B11 and corresponding to the target block BLa60,
is employed. Further, as a backward motion vector MVi60 of the
target block BLa60 to be decoded, a motion vector, which is
parallel to the base motion vector MVj60 and indicates an area
CRi60 included in the picture P13 and corresponding to the target
block BLa60, is employed.
Accordingly, the motion compensation decoding unit 205
obtains an image in the area CRk60 of the forward reference
picture B11 as a forward reference image and an image in the area
CRi60 of the backward reference picture P13 as a backward
reference image from the reference picture memory 207, and
performs addition and averaging of image data of both images
to obtain a motion compensation image (prediction image)
corresponding to the target block BLa60.
In this case, the magnitude (MVF) of the forward motion
vector MVk60 and the magnitude (MVB) of the backward motion
vector MVi60 are obtained by the above-described formulae (1) and
(2), using the magnitude (MVR) of the base motion vector MVj60,
as described for the first example of direct mode decoding.
Figure 22(a) shows a case where a block BLa70 in the picture
B12 is decoded in the direct mode with reference to the picture
P7 which is positioned forward the forward picture P10 that is
closest to the picture B12 (third example of direct mode
decoding).
In this third example of direct mode decoding, in contrast
with the first and second examples of direct mode decoding shown in
figures 21(a) and 21(b), a picture to be forward referred to in
decoding the target block is not a picture just previous to the
target picture, but a picture that is forward referred to in
decoding the base block in the base picture. The base picture is
a picture that is backward referred to in decoding the target
block.
That is, a base motion vector to be used in direct mode
decoding of the block BLa70 is a forward motion vector MVj70 of a
base block BLg70 (a block in the same position as the block
BLa70) in the picture P13 that is backward referred to in
decoding the block BLa70. The forward motion vector MVj70
indicates an area CRj70 corresponding to the base block BLg70 in
the picture P7 that is positioned forward the target picture B12.
In this case, as a forward motion vector MVk70 of the target
block BLa70 to be decoded, a motion vector which is parallel to
the base motion vector MVj70 and indicates an area CRk70 included
in the picture P7 and corresponding to the target block BLa70, is
employed. Further, as a backward motion vector MVi70 of the
target block BLa70, a motion vector which is parallel to the base
motion vector MVj70 and indicates an area CRi70 included in the
picture P13 and corresponding to the target block BLa70, is
employed.
Accordingly, the motion compensation decoding unit 205
obtains an image in the area CRk70 of the forward reference
picture P7 as a forward reference image and an image in the area
CRi70 of the backward reference picture P13 as a backward
reference image from the reference picture memory 207, and
performs addition and averaging of image data of both images
to obtain a motion compensation image (prediction image)
corresponding to the target block BLa70.
In this case, the magnitude (MVF) of the forward motion
vector MVk70 and the magnitude (MVB) of the backward motion
vector MVi70 are obtained by the above-described formulae (2) and
(3), using the magnitude (MVR) of the base motion vector MVj70.
When the picture to be referred to in decoding the block
BLg70 has already been deleted from the reference picture memory
207, the forward reference picture P10 that is timewise closest
to the target picture is used as a forward reference picture in
the third example of direct mode decoding. In this case, the
third example of direct mode decoding is identical to the first
example of direct mode decoding.
Figure 22(b) shows a case where a block BLa80 in the picture
B12 is decoded in the direct mode by using a motion vector whose
magnitude is zero (fourth example of direct mode decoding).
In this fourth example of direct mode decoding, the
magnitude of the reference motion vector employed in the first
and second examples shown in figures 21(a) and 21(b) is zero.
In this case, as a forward motion vector MVk80 and a
backward motion vector MVi80 of the block BLa80 to be decoded, a
motion vector whose magnitude is zero is employed.
That is, the forward motion vector MVk80 indicates an area
(block) CRk80 of the same size as the target block, which area is
included in the picture B11 and placed at the same position as
the target block BLa80. Further, the backward motion vector
MVi80 indicates an area (block) CRi80 of the same size as the
target block, which area is included in the picture P13 and
placed at the same position as the target block BLa80.
Accordingly, the motion compensation decoding unit 205
obtains an image in the area (block) CRk80 of the forward
reference picture B11 as a forward reference image and an image
in the area (block) CRi80 of the backward reference picture P13
as a backward reference image from the reference picture memory
207, and performs addition and averaging of image data of
both images to obtain a motion compensation image (prediction
image) corresponding to the target block BLa80. This method is
applicable to, for example, a case where a block which is
included in the picture P13 as a backward reference picture of
the picture B12 and is located in the same position as the block
BLa80 is a block having no motion vector like an intra-frame-
coded block.
The data of the motion compensation image thus generated is
output to the addition unit 208. The addition unit 208 adds the
inputted prediction error data and the motion compensation image
data to generate decoded image data. The decoded image data so
generated is output through the switch 210 to the reference
picture memory 207.
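The operation of the addition unit 208 can be sketched as follows. The function name, the 8-bit pixel range, and the clipping step are assumptions of this sketch. For a block treated as a skip block the prediction error data is zero, so the output equals the motion compensation image.

```python
def add_prediction_error(prediction, error, max_value=255):
    """Add prediction error data to a motion compensation image, pixel by pixel."""
    if len(prediction) != len(error):
        raise ValueError("blocks must have the same height")
    decoded = []
    for p_row, e_row in zip(prediction, error):
        if len(p_row) != len(e_row):
            raise ValueError("blocks must have the same width")
        # Clip each sum to the pixel range, assumed here to be 0..max_value.
        decoded.append([min(max(p + e, 0), max_value) for p, e in zip(p_row, e_row)])
    return decoded
```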
As described above, the blocks in the picture B12 are
successively decoded. The image data of the respective pictures
stored in the reference picture memory 207 are rearranged in
order of time to be output as output image data Od.
Thereafter, the pictures following the picture B12, which
are arranged in order of decoding times as shown in figure 16(a),
are successively decoded according to the picture type, in like
manner as described for the pictures P13, B11, and B12. Figure
16(b) shows the pictures rearranged in order of display times.
During decoding of the inputted bit stream, if it is found,
from a skip identifier or block number information described in
the bit stream, that a specific block is treated as a skip block,
motion compensation, i.e., acquisition of a prediction image
corresponding to a target block, is carried out in the direct
mode as in the case of decoding the picture B11.
As described above, in the moving picture decoding apparatus
20 according to the second embodiment, when decoding a block in a
B picture, a prediction image corresponding to the target block
is generated using an already-decoded P picture and an already-
decoded B picture as candidate pictures for forward reference, on
the basis of information (reference picture index) indicating
candidate pictures which are forward referred to in coding the
target block, which information is included in the bit stream
corresponding to the target block to be decoded. Therefore, it
is possible to correctly decode a block in a target B picture
which has been coded using a B picture as a candidate picture for
forward reference.
Further, in the moving picture decoding apparatus 20, when a
target block to be decoded which is included in a B picture has
been coded in the direct mode, a motion vector of the target
block is calculated on the basis of a motion vector of a block
that is placed in the same position as the target block.
Therefore, it is not necessary for the decoding end to obtain the
information indicating the motion vector of the block coded in
the direct mode, from the coding end.
Furthermore, in the moving picture decoding apparatus 20,
the data of the already-decoded pictures which are stored in the
reference picture memory are managed on the basis of the
information indicating the candidate pictures which are used in
coding P pictures and B pictures, which information is included
as header information in the bit stream. For example, at the
completion of decoding one picture, data of pictures which are
not to be used as reference pictures in decoding the following
pictures are successively deleted, whereby the picture memory can
be used with efficiency.
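The deletion of pictures that are no longer needed can be sketched as follows. The dictionary representation of the reference picture memory and the set of picture names still required are hypothetical; the specification states only that the relevant information is included as header information in the bit stream.

```python
def delete_unused_pictures(memory, needed_later):
    """Return the memory contents restricted to pictures still needed.

    memory maps a picture name to its decoded data; needed_later is the set
    of picture names that following pictures may still refer to. Both
    structures are assumptions of this sketch.
    """
    return {name: data for name, data in memory.items() if name in needed_later}
```

For example, if the memory holds P7, P10 and B11 and only P10 and B11 may be referred to by the following pictures, the entry for P7 is dropped and the stored order of the remaining pictures is preserved.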
Further, when decoding a target block in a P picture, it is
possible to determine which one of plural candidate pictures is
used as a reference picture (i.e., which one of the candidate
pictures is referred to in detecting the motion vector of the
target block to be decoded) from the reference picture
information added to the motion vector information.
Likewise, when decoding a target block in a B picture, it is
possible to determine which one of plural candidate pictures for
forward reference is used as a reference picture (i.e., which one
of the candidate pictures is referred to in detecting the forward
motion vector of the target block to be decoded) from the
reference picture information added to the motion vector
information.
While in this second embodiment the direct mode is used as
one of the plural coding modes for B pictures, the direct mode is
not necessarily used as the coding mode for B pictures. In this
case, the motion vector storage unit 226 in the moving picture
decoding apparatus 20 is dispensed with.
Further, while in this second embodiment four specific
methods are described as examples of direct mode (i.e., the first
example shown in figure 18(a) or 21(a), the second example shown
in figure 18(b) or 21(b), the third example shown in figure 19(a)
or 22(a), and the fourth example shown in figure 19(b) or 22(b)),
the decoding apparatus performs decoding using a method suited to
a coding method which is used as direct mode by the coding
apparatus. More specifically, when plural methods are employed
as direct mode, the decoding apparatus performs decoding, using
information indicating which one of the plural methods is used as
specific direct mode, that is described in the bit stream.
In this case, the operation of the motion compensation
decoding unit 205 varies according to the information. For
example, when this information is added in block units for motion
compensation, the mode decoding unit 223 determines which one of
the four methods mentioned above is used as direct mode in coding,
and notifies the motion compensation decoding unit 205 of the
determined method. The motion compensation decoding unit 205
performs appropriate motion compensation predictive decoding
according to the determined method of direct mode.
Further, when the information (DM mode information)
indicating which one of the plural methods is used as direct mode
is described in the header of the entire sequence, the GOP header,
the picture header, or the slice header, the DM mode information
is transferred for every sequence, GOP, picture, or slice, from
the bit stream analysis unit 201 to the motion compensation
decoding unit 205, and the motion compensation decoding unit 205
changes the operation.
While in this second embodiment two B pictures are placed
between an I picture and a P picture or between adjacent P
pictures, the number of continuous B pictures may be three or
four.
Further, while in this second embodiment three pictures are
used as candidate pictures for a forward reference picture for a
P picture, the number of reference candidate pictures for a P
picture may be other than three.
Furthermore, while in this second embodiment two I or P
pictures and one B picture are used as candidate pictures for a
forward reference picture in decoding a B picture, forward
reference candidate pictures in decoding a B picture are not
restricted thereto.
Moreover, in this second embodiment, as a method for
managing the reference picture memory in decoding the picture P13,
picture B11, and picture B12, a method of collectively managing
the P pictures and B pictures to be used as candidates of a
reference picture, as shown in figure 3, is described. However,
the reference picture memory managing method may be any of the
four methods which are described for the first embodiment with
reference to figures 11 to 14, wherein all of the pictures to be
used as candidates for a reference picture are separated into P
pictures and B pictures to be managed.
In this case, the reference picture memory 207 has memory
areas for six pictures, i.e., P picture memory areas (#1)~(#4),
and B picture memory areas (#1) and (#2). Further, these six
memory areas are not necessarily formed in one reference picture
memory, but each of the six memory areas may be constituted by
one independent reference picture memory.
Further, when the coding end employs a reference picture
index assigning method wherein it is determined, for each picture
to be coded, which of the P picture memory area and the B picture
memory area is given priority in assigning reference picture
indices as shown in figure 14, the moving picture decoding
apparatus can easily identify a picture which is used as a
reference picture among plural candidate pictures, on the basis
of the reference picture indices, by using information described
in the bit stream, which indicates the memory area taking
priority.
For example, when the target picture to be decoded is the
picture B11, since the forward reference picture that is timewise
closest to the target picture is the picture P10, reference
picture indices are assigned to the pictures stored in the P
picture memory with priority. Accordingly, a reference picture
index [0] is added as header information to the bit stream of the
target block when the picture P10 is used as a reference picture
in coding the target block of the picture B11. Likewise, a
reference picture index [1] is added as header information when
the picture P7 is used as a reference picture, and a reference
picture index [2] is added as header information when the picture
B9 is used as a reference picture. Accordingly, the moving
picture decoding apparatus can know which candidate picture is
used as a reference picture in coding the target block, according
to the reference picture index.
In this case, since information indicating that reference
picture indices are assigned to the candidate pictures in the P
picture memory with priority is included as header information in
the bit stream, identification of the reference picture is
further facilitated by using this information.
Further, when the target picture to be decoded is the
picture B12, since the forward reference picture that is timewise
closest to the target picture is the picture B11, reference
picture indices are assigned to the pictures stored in the B
picture memory with priority. Accordingly, a reference picture
index [0] is added as header information to the bit stream of the
target block when the picture B11 is used as a reference picture
in coding the target block of the picture B12. Likewise, a
reference picture index [1] is added as header information when
the picture P10 is used as a reference picture, and a reference
picture index [2] is added as header information when the picture
P7 is used as a reference picture. Accordingly, the moving
picture decoding apparatus can know which candidate picture is
used as a reference picture in coding the target block, according
to the reference picture index.
In this case, since information indicating that reference
picture indices are assigned to the candidate pictures in the B
picture memory with priority is included as header information in
the bit stream, identification of the reference picture is
further facilitated by using this information.
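The two cases described above can be sketched as follows. The function name, the representation of each candidate picture as a (name, display time) pair, and the ordering of pictures within each memory area by descending display time are assumptions of this sketch; the ordering is chosen because it is consistent with the indices given in the two examples.

```python
def assign_reference_indices(p_area, b_area, priority):
    """Assign reference picture indices, serving the priority memory area first.

    p_area and b_area are lists of (name, display_time) pairs for the
    candidate pictures in the P and B picture memory areas; priority is
    "P" or "B". Returns a dict mapping picture name to reference index.
    """
    def newest_first(area):
        return sorted(area, key=lambda pic: pic[1], reverse=True)

    if priority == "P":
        ordered = newest_first(p_area) + newest_first(b_area)
    elif priority == "B":
        ordered = newest_first(b_area) + newest_first(p_area)
    else:
        raise ValueError("priority must be 'P' or 'B'")
    return {name: index for index, (name, _) in enumerate(ordered)}


# Target picture B11: the P picture memory takes priority.
indices_b11 = assign_reference_indices([("P7", 7), ("P10", 10)], [("B9", 9)], "P")
# Target picture B12: the B picture memory takes priority.
indices_b12 = assign_reference_indices([("P7", 7), ("P10", 10)], [("B11", 11)], "B")
```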
Furthermore, there are cases where, at the coding end, one
of the above-mentioned five methods for managing the reference
picture memory (refer to figures 3, 11 to 14) is previously
selected, or some of these five methods are used by switching
them. For example, when the coding end employs some of the
plural methods by switching them, the moving picture decoding
apparatus can determine the reference picture index, according to
information indicating which method is used for each picture,
that is described in the bit stream.
Furthermore, in this second embodiment, the five methods for
managing the reference picture memory (refer to figures 3, 11 to
14) are described for the case where there are three reference
candidate pictures for a P picture, and there are two P pictures
and one B picture as forward reference candidate pictures for a B
picture. However, the five methods for managing the reference
picture memory are also applicable to cases where the number of
reference candidate pictures is different from those mentioned
above. When the number of reference candidate pictures is
different from those mentioned for the second embodiment, the
capacity of the reference picture memory is also different from
that described for the second embodiment.
Moreover, in this second embodiment, in the method of
managing the reference picture memory wherein the stored
reference candidates are separated into P pictures and B pictures
(four examples shown in figures 11 to 14), the P pictures are
stored in the P picture memory area while the B pictures are
stored in the B picture memory area. However, a short-term
picture memory and a long-term picture memory which are defined
in H.263++ may be used as memory areas where pictures are stored.
For example, the short-term picture memory and the long-term
picture memory may be used as a P picture memory area and a B
picture memory area, respectively.
[Embodiment 3]
Figure 23 is a block diagram illustrating a moving picture
coding apparatus 30 according to a third embodiment of the
present invention.
The moving picture coding apparatus 30 can switch, according
to a control signal supplied from the outside, a method for
assigning reference picture indices to candidate pictures,
between a method of assigning reference picture indices to
candidate pictures according to an initialized rule (default
assignment method), and an adaptive assignment method of
assigning reference picture indices to candidate pictures by the
default assignment method and, further, adaptively changing the
assigned reference picture indices according to the coding status.
To be specific, one operation mode of the moving picture
coding apparatus 30 according to the third embodiment is the
operation of the moving picture coding apparatus 10 according to
the first embodiment. In other words, when the default
assignment method is selected as a reference picture index
assignment method of the moving picture coding apparatus 30, the
moving picture coding apparatus 30 performs the same processing
as that of the moving picture coding apparatus 10.
Hereinafter, the moving picture coding apparatus 30 will be
described in detail.
The moving picture coding apparatus 30 is provided with a
coding control unit 130, instead of the coding control unit 110
of the moving picture coding apparatus 10 according to the first
embodiment. The coding control unit 130 switches, according to
an external control signal Cont, a method for assigning reference
picture indices to candidate pictures, between a method of
assigning reference picture indices according to an initialized
rule (default assignment method), and a method including a first
step of assigning reference picture indices to candidate pictures
by the default assignment method, and a second step of adaptively
changing the reference picture indices which are assigned to the
candidate pictures by the default assignment method (adaptive
assignment method).
Further, the coding control unit 130 includes a detection
unit (not shown) which detects, for every target picture to be
coded, coding efficiency in a case where each of plural reference
candidate pictures is used as a reference picture. The coding
control unit 130 changes the reference picture index which is
assigned to each candidate picture by the default assignment
method, according to the coding efficiency detected by the
detection unit.
More specifically, the coding control unit 130 changes the
reference picture index which is assigned to each candidate
picture by the default assignment method, such that, among plural
candidate pictures for a target picture, a candidate picture
which provides a higher coding efficiency of the target picture
when it is used as a reference picture is given a smaller
reference picture index.
Then, the mode selection unit 139 selects, in the direct
mode, a picture that is assigned a reference picture index [0],
as a forward reference picture for a target block. In a
predictive coding mode other than the direct mode such as the
bidirectional predictive coding mode, the mode selection unit 139
selects a reference picture from among plural candidate pictures
according to the coding efficiency.
Other components of the moving picture coding apparatus 30
according to the third embodiment are identical to those of the
moving picture coding apparatus 10 according to the first
embodiment.
Hereinafter, the operation will be described.
In the moving picture coding apparatus 30, when the default
assignment method is selected as a method for assigning reference
picture indices to candidate pictures according to the external
control signal Cont, the operation of the moving picture coding
apparatus 30 is identical to the operation of the moving picture
coding apparatus 10 according to the first embodiment.
On the other hand, when the adaptive assignment method is
selected as a method for assigning reference picture indices to
candidate pictures according to the external control signal Cont,
the moving picture coding apparatus 30 performs, in the first
step, assignment of reference picture indices in like manner as
described for the moving picture coding apparatus 10.
When the adaptive assignment method is selected, the moving
picture coding apparatus 30 performs, in the second step,
adaptive change of the reference picture indices that are
assigned by the default assignment method.
Hereinafter, a description will be given of specific methods
of assigning reference picture indices in the case where the
adaptive assignment method is selected. In the following
description, it is assumed that a target picture is the picture
B12.
Initially, in the first step, as shown in figure 3,
reference picture indices are assigned to candidate pictures for
forward reference such that a smaller reference picture index is
assigned to a candidate picture that is closer to the target
picture. That is, a reference picture index [1] is assigned to
the reference picture P10, a reference picture index [0] is
assigned to the reference picture B11, and a reference picture
index [2] is assigned to the reference picture P7.
Next, in the second step, as shown in figure 24, the
reference picture index [1] of the reference picture P10 is
changed to [0], and the reference picture index [0] of the
reference picture B11 is changed to [1].
Such rewriting of reference picture indices is carried out
for every target picture, according to the coding efficiency.
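The two steps of the adaptive assignment method can be sketched as follows. The function names and the per-picture cost values are hypothetical, and the use of a single cost per candidate picture (lower meaning higher coding efficiency) is an assumption; the specification does not state how the detection unit measures coding efficiency.

```python
def default_assignment(candidates, target_time):
    """First step: order candidates so that the closest picture gets index 0.

    candidates is a list of (name, display_time) pairs for forward reference
    candidates; the position of a name in the returned list is its index.
    """
    ordered = sorted(candidates, key=lambda pic: target_time - pic[1])
    return [name for name, _ in ordered]


def adaptive_assignment(default_order, cost):
    """Second step: give smaller indices to pictures with lower coding cost.

    sorted() is stable, so pictures with equal cost keep their default order.
    """
    return sorted(default_order, key=lambda name: cost[name])


# Target picture B12 with forward reference candidates P7, P10 and B11.
step1 = default_assignment([("P7", 7), ("P10", 10), ("B11", 11)], 12)
hypothetical_cost = {"B11": 120, "P10": 95, "P7": 140}
step2 = adaptive_assignment(step1, hypothetical_cost)
```

With these hypothetical costs, the first step yields the order B11, P10, P7 and the second step exchanges the indices of P10 and B11, as in the example of figure 24.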
Further, the moving picture coding apparatus 30 inserts
information indicating which of the default assignment method and
the adaptive assignment method is set as an assignment method, as
header information, in the bit stream. Further, when the
adaptive assignment method is set, information indicating how the
assignment of reference picture indices is carried out is also
inserted as header information in the bit stream.
As described above, in this third embodiment, the reference
picture index of the candidate picture which is to be used as a
forward reference picture in the direct mode, can be changed
to [0].
That is, since, in the first embodiment, a smaller reference
picture index is given to a reference candidate picture that is
timewise closer to the target picture, only the picture B11 that
is timewise closest to the target picture B12 can be referred to
in the direct mode. In this third embodiment, however, any
picture other than the picture B11 closest to the target picture
B12 can be used as a forward reference picture, if the coding
efficiency is improved.
Further, in this case, since the picture to be referred to
in coding the picture B12 in the direct mode is not the picture
B11 but the picture P10, decoding of the picture B11 becomes
unnecessary. Accordingly, as shown in figure 25(a), a B picture
immediately after a P picture can be processed without decoding
it, whereby speedup of decoding is achieved when the picture B11
is not necessary. Further, since decoding can be carried out
even when the data of the picture B11 is lost due to transmission
error or the like, reliability of decoding is improved.
As described above, when a reference picture index can be
arbitrarily assigned to a candidate picture to intentionally
determine a picture to be referred to in the direct mode, a
predetermined picture can be processed without decoding it, as
shown in figure 25(a).
Furthermore, even when three B pictures are placed between P
pictures as shown in figure 25(b), a predetermined picture can be
processed without decoding it. Therefore, when a picture that is
not needed by the user is previously known at the coding end,
such picture can be omitted to reduce the processing time in
decoding.
In figure 25(b), even when the picture B3 is not decoded,
other pictures can be decoded.
That is, in the assignment method of the first embodiment,
since the picture B4 refers to the picture B3 in the direct mode,
the picture B3 must be decoded to decode the picture B4. In this
third embodiment, however, since a picture to be referred to in
the direct mode can be arbitrarily set, decoding of the picture
B3 can be dispensed with.
Furthermore, in this third embodiment, assignment of
reference picture indices is carried out such that a smaller
reference picture index is assigned to a candidate picture that
is timewise closer to the target picture, and a reference picture
to be used in the direct mode is determined according to the
reference picture indices. Therefore, the coding efficiency can
be improved by a reduction in the motion vector, and further, the
processing time can be reduced.
Furthermore, when the target block is processed in the
direct mode at the decoding end, since the forward reference
candidate picture to which the reference picture index [0] is
assigned is immediately used as a reference picture, decoding
time can be reduced.
Furthermore, while in this third embodiment a candidate
picture whose reference picture index should be changed to [0] is
determined according to the coding efficiency, a reference
picture index of a picture which is most likely to be referred to,
e.g., a P picture that is timewise closest to the target picture,
may be changed to [0].
Moreover, while in this third embodiment a picture to be
referred to in the direct mode is a picture whose reference
picture index is [0], the present invention is not restricted
thereto. For example, information indicating that a picture is
to be referred to in the direct mode may be coded, and decoding
may be carried out in the direct mode on the basis of this
information.
[Embodiment 4]
Figure 26 is a block diagram for explaining a moving picture
decoding apparatus 40 according to a fourth embodiment of the
present invention.
The moving picture decoding apparatus 40 receives the bit
stream outputted from the moving picture coding apparatus 30 of
the third embodiment, and performs decoding of each picture, on
the basis of information indicating which of the default
assignment method and the adaptive assignment method should be
used when assigning reference picture indices (assignment method
instruction information), which information is included in the
bit stream.
That is, one operation mode of the moving picture decoding
apparatus 40 according to the fourth embodiment is the operation
of the moving picture decoding apparatus 20 according to the second
embodiment. In other words, when the default assignment method
is used as a reference picture index assignment method in the
moving picture decoding apparatus 40, the operation of the moving
picture decoding apparatus 40 is identical to that of the moving
picture decoding apparatus 20.
Hereinafter, the moving picture decoding apparatus 40 will
be described in detail.
The moving picture decoding apparatus 40 is provided with a
memory control unit 244, instead of the memory control unit 204
of the moving picture decoding apparatus 20 according to the
second embodiment. The memory control unit 244 performs memory
management according to either the default assignment method or
the adaptive assignment method, on the basis of the assignment
method instruction information included in the bit stream as
header information.
Other components of the moving picture decoding apparatus 40
according to the fourth embodiment are identical to those of the
moving picture decoding apparatus 20 according to the second
embodiment.
Hereinafter, the operation will be described.
The moving picture decoding apparatus 40 operates in
accordance with the assignment method instruction information
that is included as header information in the bit stream supplied
from the moving picture coding apparatus 30.
That is, when the default assignment method is selected as a
reference picture index assignment method at the coding end, i.e.,
when information indicating that the default assignment method is
selected is included in the bit stream, the moving picture
decoding apparatus 40 operates in the same manner as the moving
picture decoding apparatus 20 of the second embodiment.
On the other hand, when the adaptive assignment method is
selected as a reference picture index assignment method at the
coding end, i.e., when information indicating that the adaptive
assignment method is selected is included in the bit stream, the
moving picture decoding apparatus 40 operates in accordance with
the adaptive assignment method. In this case, since information
indicating how the assignment of reference picture indices is
carried out is also included as header information in the bit
stream, assignment of reference picture indices is carried out
according to this information.
Hereinafter, a description will be given of the operation of
the moving picture decoding apparatus 40 in the case where the
adaptive assignment method is selected.
In the reference picture memory 207, as shown in figure 24,
reference candidate pictures stored in the respective memory areas
are rewritten every time a target picture is processed.
To be specific, when the target picture to be decoded is the
picture B12, decoding of a target block in the picture B12 is
carried out with reference to a reference picture that is
selected from candidate pictures according to the header
information of the target block.
For example, when the coding mode for the target block is
the bidirectional predictive mode, a candidate picture which is
given the same reference picture index as the reference picture
index that is included in the header information of the target
block, is selected as a forward reference picture from among the
candidate pictures P10, B11, and P7. When the reference picture
index included in the header information of the target block is
[1], the candidate picture B11 is selected as a forward reference
picture. Then, the target block is subjected to bidirectional
predictive decoding with reference to the candidate picture B11
as a forward reference picture, and the picture P13 as a backward
reference picture.
Further, when the decoding mode of the target block is the
direct mode, a candidate picture (picture P10) which is given the
reference picture index [0] is selected as a forward reference
picture from among the candidate pictures P10, B11, and P7. Then,
the target block is decoded with reference to the candidate
picture P10 as a forward reference picture, and the picture P13
as a backward reference picture.
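The selection of the forward reference picture at the decoding end can be sketched as follows. This is a simplified illustration under stated assumptions: the function name select_forward_reference and the mode strings are hypothetical, and the index table reflects the adaptive assignment of figure 24 as described above (P10 given [0], B11 given [1], P7 given [2]).

```python
def select_forward_reference(indices, coding_mode, header_index=None):
    """Return the name of the forward reference picture for a target block.

    `indices` maps a candidate picture name to its reference picture index.
    In the direct mode the candidate with index [0] is used; otherwise the
    index carried in the header information of the block selects the picture.
    """
    wanted = 0 if coding_mode == "direct" else header_index
    for name, idx in indices.items():
        if idx == wanted:
            return name
    raise ValueError(f"no candidate picture has reference picture index {wanted}")


# Indices after the adaptive change for the target picture B12.
adaptive_indices = {"P10": 0, "B11": 1, "P7": 2}
print(select_forward_reference(adaptive_indices, "bidirectional", header_index=1))  # B11
print(select_forward_reference(adaptive_indices, "direct"))                         # P10
```

Because the direct mode needs no per-block index, the lookup reduces to a fixed choice of the candidate holding [0].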
As described above, according to the fourth embodiment, the
reference picture memory 207 is managed as shown in figure 24,
that is, memory management is carried out using, as the reference
picture indices of the respective candidate pictures, those
obtained by changing the reference picture indices assigned by
the default assignment method, according to the coding status.
Therefore, it is possible to realize a decoding method adaptive
to a coding method in which the reference picture indices of the
candidate pictures are rewritten according to the coding
efficiency.
That is, since, in the second embodiment, a smaller
reference picture index is given to a reference candidate picture
that is timewise closer to the target picture, only the picture
B11 that is timewise closest to the target picture B12 can be
used as a reference picture in the direct mode. In this fourth
embodiment, however, a picture other than the picture B11 closest
to the target picture B12 can be used as a forward reference
picture.
Further, in this case, since the picture to be referred to
in decoding a block in the picture B12 in the direct mode is not
the picture B11 but the picture P10, decoding of the picture B11
becomes unnecessary. Accordingly, as shown in figure 25(a), a B
picture immediately after a P picture can be processed without
decoding it, whereby speedup of decoding is achieved when the
picture B11 is not necessary. Further, since decoding can be
carried out even when the data of the picture B11 is lost due to
transmission error or the like, reliability of decoding is
improved.
As described above, when a reference picture index to be
assigned to each reference candidate picture is arbitrarily
selected according to the coding status to intentionally
determine a picture to be referred to in the direct mode, a
predetermined picture can be processed without decoding it as
shown in figure 25(a).
Furthermore, even when three B pictures are placed between P
pictures as shown in figure 25(b), a predetermined picture can be
processed without decoding it. Therefore, if a picture that is
not needed by the user is previously known at the coding end,
such picture can be omitted to reduce the processing time for
decoding.
In figure 25(b), even when the picture B3 is not decoded,
other pictures can be decoded.
That is, since, in the second embodiment, the picture B4 is
decoded with reference to the picture B3 in the direct mode, the
picture B3 must be decoded. In this fourth embodiment, however,
since a picture to be referred to in the direct mode is
arbitrarily set at the coding end, decoding of the picture B3 can
be dispensed with.
Furthermore, when the target block is processed in the
direct mode at the decoding end, since the forward reference
candidate picture to which the reference picture index [0] is
assigned is immediately used as a reference picture, decoding
time can be reduced.
While in the first to fourth embodiments a B picture is not
referred to when coding or decoding a P picture, a B picture may
be referred to when coding or decoding a P picture.
Further, while in the first to fourth embodiments a time-
basis distance between pictures is calculated according to the
display times of the respective pictures, it may be calculated
according to information other than time information such as the
display times of pictures.
For example, a counter value that is incremented every time
a picture is processed is set, and a time-basis distance between
pictures may be calculated according to this count value.
To be specific, when time information is included in both of
a video stream and an audio stream corresponding to a single
piece of content, it is not easy to manage video data and audio data on
the basis of the time information so as to maintain
synchronization between these data, because a unit of time
information is small. However, management considering
synchronization between video data and audio data is facilitated
by managing arrangement of the respective pictures with the
counter value.
Furthermore, in the first to fourth embodiments, a header
section and a data section in a data processing unit, such as a
GOP or a picture, are not separated from each other, and they are
included in a bit stream corresponding to each data processing
unit to be transferred. However, the header section and the data
section may be separated from each other to be transferred in
different streams.
For example, when a stream is transferred in units of data
transfer such as packets into which the stream is divided, a
header section and a data section corresponding to a picture may
be transferred separately from each other. In this case, the
header section and the data section are not always included in
the same stream. However, in data transfer using packets, even
when the header section and the data section are not continuously
transferred, the corresponding header section and data section
are merely transferred in different packets, and the relationship
between the corresponding header section and data section is
stored in header information of each packet; therefore, this is
substantially the same as the case where the header section and
the data section are included in the same bit stream.
Furthermore, while in the first to fourth embodiments the
reference picture indices are used as information for identifying
which one of plural reference candidate pictures is referred to
in coding a target block, the reference picture indices may be
used as information indicating the positions of plural forward
reference candidate pictures for a target picture to be coded or
decoded. To be specific, in the reference picture index
assignment methods according to the first and second embodiments
or the default assignment methods according to the third and
fourth embodiments, reference picture indices are assigned to the
plural forward reference candidate pictures such that a smaller
reference picture index is assigned to a candidate picture closer
to the target picture, and therefore, the position of each
forward reference candidate picture (i.e., the ordinal rank of
each forward reference candidate picture in nearness to the
target picture, among all forward reference candidate pictures)
can be detected according to the reference picture index assigned
to the forward reference candidate picture.
Furthermore, position identification information indicating
the positions of the respective pictures constituting a moving
picture on the display time axis may be included in the bit
stream corresponding to the moving picture, separately from the
reference picture indices indicating the relative positions of
the forward reference candidate pictures. The position
identification information is different from the time information
indicating the display times of pictures, and it is information
specifying the relative positions of the respective pictures.
Moreover, in the first to fourth embodiments, a picture that
is to be backward referred to when coding a block in a target
picture to be coded or decoded (backward reference picture for a
target picture) is used as a base picture in the direct mode.
However, a base picture to be used in the direct mode may be an
already-processed picture other than the backward reference
picture for the target picture, e.g., a picture to be forward
referred to when coding the block in the target picture.
[Embodiment 5]
Figure 27 is a block diagram for explaining a moving picture
coding apparatus 50 according to a fifth embodiment of the
present invention.
The moving picture coding apparatus 50 according to the
fifth embodiment is different from the moving picture coding
apparatus 10 according to the first embodiment in candidate
pictures for forward reference pictures to be referred to when
coding a P picture and a B picture, and coding modes for a B
picture.
That is, the moving picture coding apparatus 50 is provided
with, instead of the control unit 110 and the mode selection unit
109 according to the first embodiment, a control unit 150 and a
mode selection unit 159 which operate in different manners from
those described for the first embodiment.
To be specific, the control unit 150 according to the fifth
embodiment controls a reference picture memory 117 in such a
manner that, when coding a P picture, four pictures (I or P
pictures) which are positioned forward the P picture are used as
candidate pictures for forward reference, and when coding a B
picture, four pictures (I or P pictures) which are positioned
forward the B picture, a forward B picture that is closest to the
B picture, and a backward I or P picture are used as candidate
pictures.
Further, when coding a block (target block) in a P picture,
the mode selection unit 159 according to the fifth embodiment
selects, as a coding mode for the target block, one from among
the intra-picture coding, the inter-picture predictive coding
using a motion vector, and the inter-picture predictive coding
using no motion vector (a motion is treated as zero). When
coding a block (target block) in a B picture, the mode selection
unit 159 selects, as a coding mode for the target block, one from
among the intra-picture coding, the inter-picture predictive
coding using a forward motion vector, the inter-picture
predictive coding using a backward motion vector, and the inter-
picture predictive coding using a forward motion vector and a
backward motion vector. That is, the mode selection unit 159 of
the moving picture coding apparatus 50 according to this fifth
embodiment is different from the mode selection unit 109 of the
moving picture coding apparatus 10 according to the first
embodiment only in that it does not use the direct mode, and
therefore, the moving picture coding apparatus 50 does not have
the motion vector storage unit 116 of the moving picture coding
apparatus 10.
Further, the moving picture coding apparatus 50 according to
the fifth embodiment is identical to the moving picture coding
apparatus 10 according to the first embodiment except the coding
control unit 150 and the mode selection unit 159.
Next, the operation will be described.
Input pictures are stored in the input picture memory 101,
in units of pictures in order of display times. As shown in
figure 29(a), input pictures P0, B1, B2, P3, B4, B5, P6, B7, B8,
P9, B10, B11, P12, B13, B14, P15, B16, B17, and P18 are stored in
the input picture memory 101 in order of display times.
The respective pictures stored in the input picture memory
101 are rearranged in coding order as shown in figure 29(b).
This rearrangement is carried out according to the relationships
between target pictures and reference pictures during inter-
picture predictive coding. That is, rearrangement of the input
pictures is carried out such that a second picture to be used as
a candidate for a reference picture when coding a first picture
should be coded prior to the first picture.
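The rearrangement rule can be illustrated with a short sketch. Figure 29(b) is not reproduced here, so the resulting order is inferred from the rule stated above and from the processing sequence P15, B13, B14, P18 described for this embodiment; the function name to_coding_order is hypothetical.

```python
def to_coding_order(display_order):
    """Move each I or P picture ahead of the B pictures that precede it.

    A B picture needs the following I or P picture as a backward reference
    candidate, so that picture must be coded first.
    """
    coding_order = []
    pending_b = []
    for name in display_order:
        if name.startswith("B"):
            pending_b.append(name)          # wait for the next I or P picture
        else:
            coding_order.append(name)       # I or P picture is coded first
            coding_order.extend(pending_b)  # then the B pictures it closes
            pending_b = []
    coding_order.extend(pending_b)          # trailing B pictures, if any
    return coding_order


display = ["P0", "B1", "B2", "P3", "B4", "B5", "P6", "B7", "B8", "P9",
           "B10", "B11", "P12", "B13", "B14", "P15", "B16", "B17", "P18"]
coding = to_coding_order(display)
print(coding)
```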
In this fifth embodiment, when coding a P picture (target
picture), four pictures (I or P pictures) which are positioned
timewise forward and close to the target picture are used as
candidates for a reference picture. Further, when coding a B
picture, four pictures (I or P pictures) which are positioned
timewise forward and close to the target picture, a B picture
which is positioned timewise forward and closest to the target
picture, and an I or P picture which is positioned timewise
backward and closest to the target picture, are used as
candidates for a reference picture.
The respective pictures rearranged in the input picture
memory 101 are read out for each unit of motion compensation. In
this fifth embodiment, the unit of motion compensation is a
rectangular area (macroblock) in which pixels are arranged in a
matrix, having a size of 16 pixels in the horizontal direction ×
16 pixels in the vertical direction. In the following
description, a macroblock is simply referred to as a block.
Hereinafter, coding processes for the pictures P15, B13, and
B14 will be described in this order.
(Coding Process for Picture P15)
Since the picture P15 is a P picture, this picture is
subjected to inter-picture predictive coding using forward
reference. Further, in coding a P picture, no B picture is used
as a reference picture.
Figure 28 shows the manner of picture management in the
reference picture memory 117.
For example, at start of coding the picture P15, in the
reference picture memory 117, the pictures P12, B11, P9, P6, and
P3 are stored in memory areas to which logical memory numbers are
assigned, in ascending order of the logical memory numbers.
These pictures have already been coded, and the image data stored
in the reference picture memory 117 are image data which have
been decoded in the moving picture coding apparatus 50.
Hereinafter, for simplification, a picture whose image data is
stored in the memory is referred to as a picture stored in the
memory.
The reference candidate pictures stored in the reference
picture memory 117 are assigned reference picture indices under
control of the coding control unit 150. The assignment of
reference picture indices is carried out not in order of picture
coding but in order of display times. To be specific, a smaller
reference picture index is assigned to a newer reference
candidate picture, i.e., a reference candidate picture which is
later in display order. However, in coding a P picture, no
reference picture indices are assigned to B pictures. Further,
in coding a B picture, a newest reference candidate picture is
assigned a code [b] indicating that this picture should be
treated as a backward reference picture.
According to the above-mentioned reference picture index
determining method, as shown in figure 28, reference picture
indices [0], [1], [2], and [3] are assigned to the pictures P12,
P9, P6, and P3, respectively, and no reference picture index is
assigned to the picture B11.
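The index determining method described above can be sketched as follows. This is an illustrative sketch only: the function name assign_reference_indices, the use of None for "no index assigned", and the limit parameter max_ip (four I or P pictures, as stated for this embodiment) are assumptions introduced for the example.

```python
def assign_reference_indices(memory, target_is_b, max_ip=4):
    """Assign reference picture indices to the pictures held in the memory.

    `memory` lists picture names in ascending order of logical memory
    number, i.e. the picture latest in display order comes first.
    """
    assignment = {}
    next_index = 0
    ip_used = 0
    for position, name in enumerate(memory):
        is_b = name.startswith("B")
        if target_is_b and position == 0:
            assignment[name] = "b"       # newest picture: backward reference
        elif is_b and not target_is_b:
            assignment[name] = None      # no index for B pictures when coding a P picture
        elif not is_b and ip_used >= max_ip:
            assignment[name] = None      # only four I or P pictures are candidates
        else:
            assignment[name] = next_index
            next_index += 1
            if not is_b:
                ip_used += 1
    return assignment


# Memory contents at the start of coding the picture P15.
for_p15 = assign_reference_indices(["P12", "B11", "P9", "P6", "P3"], target_is_b=False)
print(for_p15)
```

Calling the same function with target_is_b=True applies the rule for a B picture, in which the newest picture receives the code [b] instead of an index.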
By the way, in coding a P picture, the coding control unit
150 controls the respective switches so that the switches 113,
114, and 115 are turned ON. A block in the picture P15 that is
read from the input picture memory 101 is input to the motion
vector detection unit 108, the mode selection unit 159, and the
difference calculation unit 102.
The motion vector detection unit 108 detects a motion vector
of the block in the picture P15, using the pictures P12, P9, P6,
and P3 to which the reference picture indices are assigned, among
the pictures stored in the reference picture memory 117. In this
case, an optimum reference candidate picture is selected from
among the pictures P12, P9, P6, and P3, and detection of the
motion vector is carried out with reference to the selected
reference picture. Then, the detected motion vector is output to
the mode selection unit 159 and the bit stream generation unit
104. Further, information Rp indicating which one of the
pictures P12, P9, P6, and P3 is referred to in detecting the
motion vector, i.e., the reference picture index, is also output
to the mode selection unit 159.
The mode selection unit 159 determines a coding mode for the
block in the picture P15, using the motion vector detected by the
motion vector detection unit 108. The coding mode indicates a
method for coding the block. For example, for a block in a P
picture, a coding mode is selected from among the intra-picture
coding, the inter-picture predictive coding using a motion vector,
and the inter-picture predictive coding using no motion vector
(i.e., motion is regarded as 0). Generally, selection of a
coding mode is carried out so that coding error at a
predetermined amount of bits is minimized.
The coding mode Ms determined by the mode selection unit 159
is output to the bit stream generation unit 104. Further, when
the determined coding mode is the coding mode which performs
forward reference, the reference picture index is also output to
the bit stream generation unit 104.
Further, a prediction image Pd which is obtained on the
basis of the coding mode determined by the mode selection unit
159 is output to the difference calculation unit 102 and the
addition unit 106. However, when the intra-picture coding is
selected, no prediction image Pd is outputted. Further, when the
intra-picture coding is selected, the switch 111 is controlled so
that the input terminal Ta is connected to the output terminal
Tb2, and the switch 112 is controlled so that the output terminal
Td is connected to the input terminal Tc2.
Hereinafter, a description will be given of a case where the
inter-picture predictive coding is selected in the mode selection
unit 159. Since the operations of the difference calculation
unit 102, prediction error coding unit 103, bit stream generation
unit 104, and prediction error decoding unit 105 are identical to
those mentioned for the first embodiment, repeated description is
not necessary.
When coding of all blocks in the picture P15 is completed,
the coding control unit 150 updates the logical memory numbers
and the reference picture indices corresponding to the pictures
stored in the reference picture memory 117.
That is, since the coded picture P15 is later in the order
of display times than any pictures stored in the reference
picture memory 117, the picture P15 is stored in the memory area
in which the logical memory number (0) is set. Then, the logical
memory numbers of the memory areas where other reference pictures
have already been stored are incremented by 1. Further, since
the next target picture to be coded is the picture B13 that is a
B picture, a reference picture index is also assigned to the
picture B11. Thereby, the pictures P15, P12, B11, P9, P6, and P3
are stored in the memory areas in which the logical memory
numbers (0)~(5) are set, respectively, and the reference picture
indices [0], [1], [2], [3], and [4] are assigned to the pictures
P12, B11, P9, P6, and P3, respectively. Since the next target
picture is a B picture, the picture P15 stored in the memory area
of the logical memory number (0) is assigned a code [b] indicating
that this picture is treated as a backward reference picture,
instead of the reference picture index.
(Coding Process for Picture B13)
Since the picture B13 is a B picture, this picture is
subjected to inter-picture predictive coding using bidirectional
reference. In this case, four I or P pictures which are timewise
close to the target picture and a B picture which is timewise
closest to the target picture are used as candidate pictures for
forward reference, and an I or P picture which is timewise
closest to the target picture is used as a candidate picture for
backward reference. Accordingly, the candidate pictures for
forward reference for the picture B13 are the pictures P12, B11,
P9, P6, and P3, and the candidate picture for backward reference
for the picture B13 is the picture P15. These reference
candidate pictures are stored in the reference picture memory 117.
These reference candidate pictures are assigned logical memory
numbers and reference picture indices as shown in figure 28.
In coding a B picture, the coding control unit 150 controls
the respective switches so that the switches 113, 114, and 115
are turned ON. Accordingly, a block in the picture B13 that is
read from the input picture memory 101 is input to the motion
vector detection unit 108, the mode selection unit 159, and the
difference calculation unit 102.
The motion vector detection unit 108 detects a forward
motion vector and a backward motion vector of the block in the
picture B13, using the pictures P12, B11, P9, P6, and P3 stored
in the reference picture memory 117, as candidate pictures for
forward reference, and the picture P15 as a candidate picture for
backward reference. In this case, an optimum picture is selected
from among the pictures P12, B11, P9, P6, and P3, and detection
of the forward motion vector is carried out with reference to the
selected picture. Then, the detected motion vector is output to
the mode selection unit 159 and the bit stream generation unit
104. Further, information Rp indicating which one of the
pictures P12, B11, P9, P6, and P3 is referred to in detecting the
forward motion vector, i.e., the reference picture index, is also
output to the mode selection unit 159.
The operations of the mode selection unit 159, difference
calculation unit 102, bit stream generation unit 104, and
prediction error decoding unit 105 are identical to those for
coding the picture P15.
When coding of all blocks in the picture B13 is completed,
the coding control unit 150 updates the logical memory numbers
and the reference picture indices corresponding to the pictures
stored in the reference picture memory 117.
That is, since the picture B13 is positioned, in order of
display times, before the picture P15 stored in the reference
picture memory 117 and after the picture P12 stored in the
reference picture memory 117, the picture B13 is stored in the
memory area in which the logical memory number (1) is set.
Further, since the picture B11 is not used as a reference picture
in coding the subsequent pictures, the picture B11 is deleted.
At this time, information indicating that the picture B11 is
deleted from the reference picture memory is output to the bit
stream generation unit 104 as a control signal Cs1. The bit
stream generation unit 104 describes this information as header
information in the bit stream. Further, the logical memory
number of the memory area corresponding to the picture P12 is
incremented by 1.
The next target picture to be coded is the picture B14 as a
B picture. Accordingly, the picture stored in the memory area
with the logical memory number (0) is used as a backward
reference picture, and reference picture indices are assigned to
the other pictures. Thereby, the pictures P15, B13, P12, P9, P6,
and P3 are stored in the memory areas corresponding to the
logical memory numbers (0)~(5), respectively, and the reference
picture indices [0], [1], [2], [3], and [4] are assigned to the
pictures B13, P12, P9, P6, and P3, respectively.
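The memory update after coding the pictures P15 and B13 can be traced with a short sketch. This is a simplified model under stated assumptions: the reference picture memory is represented as a list whose position is the logical memory number, the display time is read from the number in each picture name, and the function names display_time and store_coded_picture are hypothetical. The picture to delete is passed in by the caller, since in the embodiment that decision is made by the coding control unit.

```python
def display_time(name):
    """Display-order number taken from the picture name, e.g. "B13" -> 13."""
    return int(name[1:])


def store_coded_picture(memory, coded, delete=None):
    """Return the memory contents after storing a newly coded picture.

    `memory` is ordered by logical memory number (latest display time first).
    The coded picture is inserted at the position matching its display time,
    which shifts the logical memory numbers of the older pictures up by 1.
    """
    updated = [p for p in memory if p != delete]
    position = 0
    while position < len(updated) and display_time(updated[position]) > display_time(coded):
        position += 1
    updated.insert(position, coded)
    return updated


memory = ["P12", "B11", "P9", "P6", "P3"]                  # before coding P15
memory = store_coded_picture(memory, "P15")                # P15 takes number (0)
after_p15 = list(memory)
memory = store_coded_picture(memory, "B13", delete="B11")  # B13 takes number (1)
after_b13 = list(memory)
print(after_p15)
print(after_b13)
```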
(Coding Process for Picture B14)
Since the picture B14 is a B picture, this picture is
subjected to inter-picture predictive coding using bidirectional
reference. As reference pictures for the picture B14, the
pictures B13, P12, P9, P6, and P3 are used as forward reference
pictures while the picture P15 is used as a backward reference
picture. In processing a B picture, the coding control unit 150
controls the respective switches so that the switches 113, 114,
and 115 are turned ON. Accordingly, a block in the picture B14
that is read from the input picture memory 101 is input to the
motion vector detection unit 108, the mode selection unit 159,
and the difference calculation unit 102.
The motion vector detection unit 108 detects a forward
motion vector and a backward motion vector of the block in the
picture B14, using the pictures B13, P12, P9, P6, and P3 stored
in the reference picture memory 117 as candidate pictures for
forward reference as well as the picture P15 as a candidate
picture for backward reference. In this case, an optimum picture
is selected from among the pictures B13, P12, P9, P6, and P3, and
detection of the forward motion vector is carried out with
reference to the selected picture. Then, the detected motion
vector is output to the mode selection unit 159 and the bit
stream generation unit 104. Further, information Rp indicating
which one of the pictures B13, P12, P9, P6, and P3 is referred to
in detecting the forward motion vector, i.e., the reference
picture index, is also output to the mode selection unit 159.
The operations of the mode selection unit 159, difference
calculation unit 102, bit stream generation unit 104, prediction
error decoding unit 105, and addition unit 106 are similar to
those for coding the picture P15.
When coding of all blocks in the picture B14 is completed,
the coding control unit 150 updates the logical memory numbers
and the reference picture indices corresponding to the pictures
stored in the reference picture memory 117.
That is, since the picture B14 is positioned, in order of
display times, before the picture P15 stored in the reference
picture memory 117, and later than the picture B13 stored in the
reference picture memory 117, the picture B14 is stored in the
memory area in which the logical memory number (1) is set.
Further, since the picture B13 is not used as a reference picture
in coding the subsequent pictures, the picture B13 is deleted.
At this time, information indicating that the picture B13 is
deleted from the reference picture memory is output to the bit
stream generation unit 104 as a control signal Cd1. The bit
stream generation unit 104 describes this information as header
information in the bit stream.
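The update of the reference picture memory described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the apparatus itself: the function name `update_reference_memory`, the representation of pictures as strings such as "B14", and the use of the numeric suffix as the display time are all hypothetical. The returned list of deleted pictures stands in for the information carried by the control signal Cd1.

```python
def display_time(picture):
    """Assumed convention: the number after the type letter is the display time."""
    return int(picture[1:])


def update_reference_memory(stored, coded_picture, unused=()):
    """Return (new_stored, deleted).

    stored        : pictures ordered by logical memory number (0 first),
                    i.e. latest display time first.
    coded_picture : the picture whose coding has just been completed.
    unused        : pictures no longer needed as reference pictures.
    """
    deleted = [p for p in stored if p in unused]
    kept = [p for p in stored if p not in unused]
    kept.append(coded_picture)
    # Logical memory numbers follow display order, latest picture first.
    kept.sort(key=display_time, reverse=True)
    return kept, deleted


stored, deleted = update_reference_memory(
    ["P15", "B13", "P12", "P9", "P6", "P3"], "B14", unused=["B13"])
print(stored)   # B14 takes logical memory number (1)
print(deleted)  # pictures to be signalled as deleted in the header
```

In this sketch the position of a picture in the returned list plays the role of its logical memory number.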
The next target picture to be coded is the picture P18 that
is a P picture. Accordingly, reference picture indices are
assigned to the pictures other than B pictures. Thereby, the
pictures P15, B14, P12, P9, and P6 are stored in the memory areas
corresponding to the logical memory numbers (0)~(4),
respectively, and the reference picture indices [0], [1], [2],
and [3] are assigned to the pictures P15, P12, P9, and P6,
respectively.
As described above, according to the fifth embodiment,
plural candidate pictures for forward reference for a target
picture to be coded are assigned reference picture indices
(i.e., information for identifying which one of the candidate
pictures is used in detecting the forward motion vector of the
target block) such that a smaller index is assigned to a
candidate picture whose display time is later. Therefore, a candidate
picture which is most likely to be selected as a reference
picture among the plural candidate pictures is assigned a smaller
reference picture index. Accordingly, the amount of codes of the
reference picture indices can be minimized, resulting in an
increase in coding efficiency.
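The index assignment rule of the fifth embodiment can be sketched as follows. The function name `assign_reference_indices` and the string representation of pictures are assumptions made for this illustration. The input list is assumed to be ordered by logical memory number, that is, latest display time first.

```python
def assign_reference_indices(stored, target_type):
    """Return (backward_reference, {picture: reference_picture_index}).

    stored      : pictures ordered by logical memory number (0 first).
    target_type : "P" or "B", the type of the next target picture.
    """
    if target_type == "B":
        # The picture at the smallest logical memory number is the backward
        # reference picture and receives no reference picture index.
        backward, candidates = stored[0], list(stored[1:])
    else:
        # For a P picture, B pictures receive no reference picture index.
        backward = None
        candidates = [p for p in stored if not p.startswith("B")]
    # A smaller index goes to a candidate whose display time is later.
    return backward, {p: i for i, p in enumerate(candidates)}


print(assign_reference_indices(["P15", "B13", "P12", "P9", "P6", "P3"], "B"))
print(assign_reference_indices(["P15", "B14", "P12", "P9", "P6"], "P"))
```

The first call corresponds to the state before coding the picture B14, and the second to the state before coding the picture P18.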
Hereinafter, the effects of this fifth embodiment will be
described taking a case where coding of a B picture is carried
out using another B picture as a reference candidate picture,
together with the problems of the prior art.
For example, it is assumed that pictures of a moving picture
are arranged in display order as shown in figure 29(a), and four
P pictures and one B picture are used as candidate pictures for
forward reference in coding a target picture.
Figure 30 shows an example of management of pictures stored
in the reference picture memory. The candidate pictures are
stored in coding order, in the memory.
When coding the picture P15, in the reference picture memory,
the pictures B11, P12, P9, P6, and P3 are stored in the memory
areas, in ascending order of the logical memory numbers. Further,
these candidate pictures are assigned the reference picture
indices [0], [1], [2], [3], and [4], respectively. Therefore, a
reference picture index is assigned to a B picture (picture B11
in this case) which is not used as a reference picture in coding
a P picture, and the reference picture index not to be used
causes degradation in coding efficiency.
Further, when coding the picture B13, in the reference
picture memory, the pictures P15, B11, P12, P9, P6, and P3 are
stored in the memory areas, in ascending order of the logical
memory numbers. The picture P15 is assigned a code [b]
indicating that this picture is used as a backward reference
picture, and the remaining pictures are assigned the reference
picture indices [0], [1], [2], [3], and [4], respectively.
Therefore, the reference picture index assigned to the picture
B11 that is timewise far from the picture B13 (target picture) is
smaller than the reference picture index assigned to the picture
P12 that is timewise close to the target picture B13. In
performing motion detection, generally, a candidate picture that
is timewise closer to a target picture is more likely to be used
as a reference picture. Accordingly, when the reference picture
index of the picture B11 that is far from the target picture is
smaller than the reference picture index of the picture P12 that
is close to the target picture, coding efficiency is degraded.
Furthermore, when coding the picture B14, in the reference
picture memory, the pictures B13, P15, B11, P12, P9, and P6 are
stored in the memory areas, in ascending order of the logical
memory numbers. The picture B13 is assigned a code [b]
indicating that this picture is used as a backward reference
picture, and the remaining pictures are assigned the reference
picture indices [0], [1], [2], [3], and [4], respectively.
Therefore, the picture P15 which should actually be used as a
candidate picture for backward reference for the picture B14, is
used as a candidate picture for forward reference. Moreover, the
picture B13 which should actually be used as a candidate picture
for forward reference for the picture B14, is used as a candidate
picture for backward reference. As a result, it becomes
difficult to perform correct coding. Further, in coding the
picture B14, the picture B11 which is not used as a reference
picture exists in the reference picture memory.
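The problems described above can be reproduced with a small sketch that compares management in coding order with management in display order. The function names and the string representation of pictures are assumptions made for this illustration, and the numeric suffix of each picture name is taken as its display time. Sorting by display time and treating the first picture as the backward reference is valid here only because exactly one stored picture is later than the target picture.

```python
def display_time(picture):
    return int(picture[1:])


def indices_in_stored_order(stored):
    """Treat the first stored picture as the backward reference (code [b])
    and number the remaining pictures [0], [1], ... in stored order."""
    backward, rest = stored[0], stored[1:]
    return backward, {p: i for i, p in enumerate(rest)}


def coding_order_assignment(stored_in_coding_order):
    # Management in coding order: the most recently coded picture comes first.
    return indices_in_stored_order(list(stored_in_coding_order))


def display_order_assignment(stored):
    # Management in display order: the picture with the latest display time first.
    ordered = sorted(stored, key=display_time, reverse=True)
    return indices_in_stored_order(ordered)


memory_for_b13 = ["P15", "B11", "P12", "P9", "P6", "P3"]
memory_for_b14 = ["B13", "P15", "B11", "P12", "P9", "P6"]

print(coding_order_assignment(memory_for_b13))   # B11 gets a smaller index than P12
print(coding_order_assignment(memory_for_b14))   # B13 is taken as backward reference
print(display_order_assignment(memory_for_b14))  # P15 is the backward reference
```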
On the other hand, according to the fifth embodiment of the
invention, as shown in figure 28, the reference candidate
pictures for the target picture are stored in display order in
the reference picture memory, and the candidate pictures for
forward reference are assigned the reference picture indices such
that a candidate picture whose display time is later is assigned
a smaller reference picture index, and therefore, a candidate
picture which is more likely to be selected as a reference
picture from among the candidate pictures is assigned a smaller
reference picture index. Thereby, the amount of codes of the
reference picture indices can be minimized, resulting in an
increase in coding efficiency.
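The effect on the amount of codes can be illustrated numerically. The description does not name the variable length code used for the reference picture indices, so the sketch below assumes an unsigned Exp-Golomb code, in which a smaller value has a shorter codeword. The selection probabilities are hypothetical figures chosen only to show the tendency.

```python
def exp_golomb_bits(index):
    """Length in bits of the unsigned Exp-Golomb codeword for a non-negative index."""
    return 2 * (index + 1).bit_length() - 1


def expected_bits(selection_probability):
    """Average codeword length when index i is selected with the given probability."""
    return sum(p * exp_golomb_bits(i) for i, p in enumerate(selection_probability))


# Hypothetical probabilities: the picture timewise closest to the target
# picture is selected most often.
closest_gets_index_0 = [0.6, 0.2, 0.1, 0.05, 0.05]
# Same candidates, but the most likely picture has been given index [1].
closest_gets_index_1 = [0.2, 0.6, 0.1, 0.05, 0.05]

print([exp_golomb_bits(i) for i in range(5)])
print(expected_bits(closest_gets_index_0))
print(expected_bits(closest_gets_index_1))
```

Under these assumptions, the assignment that gives index [0] to the most likely candidate yields the smaller average code length.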
Further, since, in coding a P picture, no reference picture
indices are assigned to B pictures, occurrence of reference
picture indices that will never be used is avoided, resulting in
a further increase in coding efficiency.
Moreover, when coding a B picture, no reference picture
index is assigned to the picture that is stored in the memory
area corresponding to the smallest logical memory number, and
this picture is used as a backward reference picture. Therefore,
in predictive coding of a B picture, a P picture to be used as a
backward reference picture is prevented from being used as a
forward reference picture.
Further, when a picture that is not used as a reference
picture is deleted from the reference picture memory, information
indicating this deletion is described in the bit stream.
Therefore, the decoding end can detect that the picture which is
not to be used as a reference picture in decoding a target
picture and the following pictures, is deleted from the reference
picture memory.
In this fifth embodiment, motion compensation is performed
in units of image spaces (macroblocks) each comprising 16 pixels
in the horizontal direction X 16 pixels in the vertical
direction, and coding of a prediction error image is performed in
units of image spaces (subblocks) each comprising 8 pixels in the
horizontal direction X 8 pixels in the vertical direction.
However, the number of pixels in each macroblock (subblock) in
motion compensation (coding of a prediction error image) may be
different from that described for the fifth embodiment.
Further, while in this fifth embodiment the number of
continuous B pictures is two, the number of continuous B pictures
may be three or more.
Further, while in this fifth embodiment four pictures are
used as candidate pictures for a forward reference picture in
coding a P picture, the number of forward reference candidate
pictures for a P picture may be other than four.
Furthermore, while in this fifth embodiment four P pictures
and one B picture are used as candidate pictures for a forward
reference picture in coding a B picture, forward reference
candidate pictures for a B picture are not restricted thereto.
Furthermore, in this fifth embodiment, each of plural
pictures constituting a moving picture, which is a target picture
to be coded, is used as a reference picture when coding another
picture that follows the target picture. However, the plural
pictures constituting a moving picture may include pictures not
to be used as reference pictures. In this case, the pictures not
to be used as reference pictures are not stored in the reference
picture memory, whereby the same effects as described for the
fifth embodiment can be achieved.
Furthermore, while in this fifth embodiment coding of a B
picture is carried out using another B picture as a reference
candidate picture, coding of a B picture may be carried out
without referring to another B picture. In this case, no B
pictures are stored in the reference picture memory. Also in
this case, the same effects as described for the fifth embodiment
can be achieved by assigning reference picture indices according
to the order of picture display times.
Furthermore, while in this fifth embodiment a single system
of reference picture indices is assigned, different systems of
reference picture indices may be assigned in the forward
direction and the backward direction, respectively.
Moreover, while in this fifth embodiment a smaller reference
picture index is assigned to a candidate picture for forward
reference whose display time is later, the reference picture
index assignment method is not restricted thereto so long as a
smaller reference picture index is assigned to a candidate
picture that is more likely to be selected as a reference picture.
Figure 31 is a conceptual diagram illustrating the structure
of a bit stream (format of a coded image signal) corresponding to
pictures to which reference picture indices are assigned.
A coded signal Pt equivalent to one picture includes header
information Hp placed at the beginning of the picture, and a data
section Dp that follows the header information Hp. The header
information Hp includes a control signal (RPSL). The data
section Dp includes coded data (bit stream) corresponding to each
block.
For example, a bit stream BLx is a bit stream of a block
that is coded in intra-picture coding mode, and a bit stream BLy
is a bit stream of a block that is coded in inter-picture
predictive coding mode other than intra-picture coding mode.
The block bit stream BLx includes header information Hbx,
information Prx relating to a coding mode, and coded image
information Dbx. The block bit stream BLy includes header
information Hby, information Pry relating to a coding mode, a first
reference picture index RId1, a second reference picture index
RId2, a first motion vector MV1, a second motion vector MV2, and
coded image information Dby. Which of the first and second
reference picture indices RId1 and RId2 should be used is
determined from the information Pry relating to the coding mode.
A reference picture index RId1 is assigned to a forward
reference candidate picture with priority over a backward
reference candidate picture. A reference picture index RId2 is
assigned to a backward reference candidate picture with priority
over a forward reference candidate picture.
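The layout of the coded signal described above can be sketched as a set of data structures. All class, field, and function names below are assumptions made for this illustration, as are the mode labels "forward", "backward", and "bidirectional". The mapping from the coding mode to the indices in use is likewise a plausible reading, not a definition taken from the description.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass
class IntraBlock:
    """Block coded in intra-picture coding mode."""
    header: bytes
    mode: str
    coded_image: bytes


@dataclass
class InterBlock:
    """Block coded in inter-picture predictive coding mode."""
    header: bytes
    mode: str  # assumed labels: "forward", "backward", "bidirectional"
    first_ref_index: Optional[int]
    second_ref_index: Optional[int]
    first_mv: Optional[Tuple[int, int]]
    second_mv: Optional[Tuple[int, int]]
    coded_image: bytes


@dataclass
class CodedPicture:
    """One picture: header information followed by the data section."""
    control_signal: bytes
    blocks: List[Union[IntraBlock, InterBlock]]


def indices_in_use(block: InterBlock) -> List[int]:
    """Select the reference picture indices that the coding mode calls for."""
    if block.mode == "forward":
        return [block.first_ref_index]
    if block.mode == "backward":
        return [block.second_ref_index]
    if block.mode == "bidirectional":
        return [block.first_ref_index, block.second_ref_index]
    raise ValueError(f"unknown mode: {block.mode}")


block = InterBlock(b"", "bidirectional", 0, 1, (2, -1), (0, 3), b"")
picture = CodedPicture(control_signal=b"", blocks=[block])
print(indices_in_use(picture.blocks[0]))
```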
[Embodiment 6]
Figure 32 is a block diagram for explaining a moving picture
decoding apparatus 60 according to a sixth embodiment of the
present invention.
The moving picture decoding apparatus 60 according to the
sixth embodiment decodes the bit stream Bs outputted from the
moving picture coding apparatus 50 according to the fifth
embodiment.
The moving picture decoding apparatus 60 is different from
the moving picture decoding apparatus 20 according to the second
embodiment in candidate pictures for forward reference pictures
to be referred to when coding a P picture and a B picture, and
coding modes for a B picture.
That is, the moving picture decoding apparatus 60 is
provided with, instead of the memory control unit 204 and the
mode decoding unit 223 according to the second embodiment, a
memory control unit 264 and a mode decoding unit 263 which
operate in different manners from those described for the second
embodiment.
To be specific, the memory control unit 264 according to the
sixth embodiment controls a reference picture memory 207 such
that, when decoding a P picture, four pictures (I or P pictures)
which are positioned forward the P picture are used as candidate
pictures for forward reference, and when decoding a B picture,
four pictures (I or P pictures) which are positioned forward the
B picture, a forward B picture that is closest to the B picture,
and a backward I or P picture are used as candidate pictures.
Further, when decoding a block (target block) in a P picture,
the mode decoding unit 263 according to the sixth embodiment
selects, as a coding mode for the target block, one from among
plural modes as follows: intra-picture decoding, inter-picture
predictive decoding using a motion vector, and inter-picture
predictive decoding using no motion vector (a motion is treated
as zero). When decoding a block (target block) in a B picture,
the mode decoding unit 263 selects, as a decoding mode for the
target block, one from among plural modes as follows: intra-
picture decoding, inter-picture predictive decoding using a
forward motion vector, inter-picture predictive decoding using
a backward motion vector, and inter-picture predictive decoding
using a forward motion vector and a backward motion vector.
That is, the mode decoding unit 263 of the moving picture
decoding apparatus 60 according to this sixth embodiment is
different from the mode decoding unit 223 of the moving picture
decoding apparatus 20 according to the second embodiment only in
that it does not use a decoding process corresponding to the
direct mode, and therefore, the moving picture decoding apparatus
60 does not have the motion vector storage unit 226 of the moving
picture decoding apparatus 20.
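The decoding modes available for each picture type, as listed above, can be written down as a small table. The enumeration and function names are assumptions made for this illustration. No member corresponding to the direct mode is defined, in line with the description.

```python
from enum import Enum, auto


class DecodingMode(Enum):
    INTRA = auto()                  # intra-picture decoding
    INTER_WITH_MV = auto()          # P picture: prediction using a motion vector
    INTER_ZERO_MV = auto()          # P picture: motion treated as zero
    FORWARD = auto()                # B picture: forward motion vector
    BACKWARD = auto()               # B picture: backward motion vector
    BIDIRECTIONAL = auto()          # B picture: forward and backward motion vectors


ALLOWED_MODES = {
    "P": {DecodingMode.INTRA, DecodingMode.INTER_WITH_MV,
          DecodingMode.INTER_ZERO_MV},
    "B": {DecodingMode.INTRA, DecodingMode.FORWARD,
          DecodingMode.BACKWARD, DecodingMode.BIDIRECTIONAL},
}


def is_allowed(picture_type, mode):
    """Check whether a decoding mode may appear in a picture of the given type."""
    return mode in ALLOWED_MODES[picture_type]


print(is_allowed("B", DecodingMode.BIDIRECTIONAL))
print(is_allowed("P", DecodingMode.BIDIRECTIONAL))
```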
Further, the moving picture decoding apparatus 60 according
to the sixth embodiment is identical to the moving picture
decoding apparatus 20 according to the second embodiment except
for the memory control unit 264 and the mode decoding unit 263.
Next, the operation of the moving picture decoding apparatus
60 will be described.
The bit stream Bs outputted from the moving picture coding
apparatus 50 according to the fifth embodiment is input to the
moving picture decoding apparatus 60 shown in figure 32. In the
bit stream Bs, each P picture has been subjected to inter-picture
predictive coding, using four I or P pictures which are
positioned timewise forward and close to the P picture, as
reference candidate pictures. Further, each B picture has been
coded using four P pictures which are positioned timewise forward
and closest to the B picture, a B picture which is positioned
timewise forward the B picture, and an I or P picture which is
positioned timewise backward and closest to the B picture.
In this case, the order of the pictures in the bit stream is
as shown in figure 29(b).
Hereinafter, decoding processes for the pictures P15, B13,
and B14 will be described in this order.
The bit stream of the picture P15 is input to the bit stream
analysis unit 201. The bit stream analysis unit 201 extracts
various kinds of data from the inputted bit stream Bs. The
various kinds of data are information such as a coding mode, a
motion vector, and the like. The extracted information for mode
selection (coding mode) Ms is output to the mode decoding unit
263. Further, the extracted motion vector MV is output to the
motion compensation decoding unit 205. Furthermore, the
prediction error coded data Ed is output to the prediction error
decoding unit 202.
The mode decoding unit 263 controls the switches 209 and 210
with reference to the coding mode Ms extracted from the bit
stream. When the coding mode is intra-picture coding, the switch
209 is controlled such that the input terminal Te is connected to
the output terminal Tf1, and the switch 210 is controlled such
that the output terminal Th is connected to the input terminal
Tg1. When the coding mode is inter-picture predictive coding,
the switch 209 is controlled such that the input terminal Te is
connected to the output terminal Tf1, and the switch 210 is
controlled such that the output terminal Th is connected to the
input terminal Tg2.
Further, the mode decoding unit 263 outputs the coding mode
Ms also to the motion compensation decoding unit 205.
Hereinafter, a description will be given of the case where
the coding mode is inter-picture predictive coding.
The prediction error decoding unit 202 decodes the inputted
coded data Ed to generate prediction error data PDd. The
generated prediction error data PDd is output to the switch 209.
Since the input terminal Te of the switch 209 is connected to the
output terminal Tf1, the prediction error data PDd is output to
the addition unit 208.
The motion compensation decoding unit 205 generates a motion
compensation image from the inputted information such as the
motion vector. The information inputted to the motion
compensation decoding unit 205 is the motion vector MV and the
reference picture index Rp. The motion compensation decoding
unit 205 obtains a motion compensation image (prediction image)
from the reference picture memory 207, on the basis of the
inputted information. The picture P15 has been coded using the
pictures P12, P9, P6, and P3 as candidates for a reference
picture, and these candidate pictures have already been decoded
and are stored in the reference picture memory 207.
Figure 28 shows the pictures stored in the reference picture
memory 207. As shown in figure 28, when decoding the picture P15,
the pictures P12, B11, P9, P6, and P3 are stored in the reference
picture memory 207.
The memory control unit 264 assigns reference picture
indices to the reference candidate pictures stored in the
reference picture memory 207. This assignment of reference
picture indices is carried out according to the order of picture
display times such that a smaller reference picture index is
assigned to a newer reference candidate picture. In decoding a P
picture, no reference picture indices are assigned to B pictures.
Accordingly, reference picture indices [0], [1], [2], and [3] are
assigned to the pictures P12, P9, P6, and P3, respectively, and
no reference picture index is assigned to the picture B11.
The motion compensation decoding unit 205 determines which
one of the pictures P12, P9, P6, and P3 is used as a reference
picture when coding the target block, from the reference picture
indices. Then, the motion compensation decoding unit 205 obtains
a prediction image (prediction data Pd) from the reference
picture memory 207 on the basis of the determined reference
picture and the motion vector to generate a motion compensation
image. The motion compensation image so generated is input to
the addition unit 208.
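The way a reference picture index taken from the bit stream is resolved to a stored picture can be sketched as follows. The function names and the string representation of pictures are assumptions made for this illustration. The input list is assumed to be ordered by logical memory number, latest display time first.

```python
def forward_candidates(stored, target_type):
    """List the forward reference candidates in order of reference picture index."""
    if target_type == "B":
        # The picture at logical memory number (0) is the backward reference.
        return list(stored[1:])
    # For a P picture, B pictures carry no reference picture index.
    return [p for p in stored if not p.startswith("B")]


def resolve_forward_reference(stored, target_type, ref_index):
    """Return the stored picture that the given reference picture index points to."""
    return forward_candidates(stored, target_type)[ref_index]


# Decoding the picture P15: B11 is skipped when counting indices.
print(resolve_forward_reference(["P12", "B11", "P9", "P6", "P3"], "P", 2))
# Decoding the picture B13: P15 is the backward reference, indices start at P12.
print(resolve_forward_reference(["P15", "P12", "B11", "P9", "P6", "P3"], "B", 1))
```

Because the decoding end applies the same rule as the coding end, no table of index assignments needs to be transmitted in this sketch.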
The addition unit 208 adds the prediction error data PDd and
the motion compensation image to generate a decoded image (data
Ad). The decoded image so generated is output through the switch
210 to the reference picture memory 207.
When all of the macroblocks in the picture P15 have been
decoded, the memory control unit 264 updates the logical memory
numbers and the reference picture indices corresponding to the
pictures stored in the reference picture memory 207.
At this time, since, in order of time, the picture P15 is
later than any pictures stored in the reference picture memory
207, the picture P15 is stored in the memory area in which the
logical memory number (0) is set. Then, the logical memory
numbers of the memory areas where other reference pictures have
already been stored are incremented by 1.
Further, since the next target picture to be decoded is the
picture B13, a reference picture index is assigned to the picture
B11. Thereby, the pictures P15, P12, B11, P9, P6, and P3 are
stored in the memory areas in which the logical memory numbers
(0)~(5) are set, respectively, and the reference picture indices
[0], [1], [2], [3], and [4] are assigned to the pictures P12, B11,
P9, P6, and P3, respectively.
Since the operations of the bit stream analysis unit 201,
the mode decoding unit 263, and the prediction error decoding
unit 202 are identical to those described for decoding of the
picture P15, repeated description is not necessary.
The motion compensation decoding unit 205 generates a motion
compensation image from the inputted information such as the
motion vector. The information inputted to the motion
compensation decoding unit 205 is the motion vector and the
reference picture index. The picture B13 has been coded using
the pictures P12, B11, P9, P6, and P3 as candidate pictures for
forward reference, and the picture P15 as a candidate picture for
backward reference. At decoding of the picture B13, these
candidate pictures have already been decoded and are stored in
the reference picture memory 207.
When the coding mode is forward predictive coding or
bidirectional predictive coding, the motion compensation decoding
unit 205 determines which one of the candidate pictures P12, B11,
P9, P6, and P3 is used as a forward reference picture when coding
the picture B13, on the basis of the reference picture indices.
Then, the motion compensation decoding unit 205 obtains a forward
motion compensation image from the reference picture memory 207
on the basis of the determined reference picture and the motion
vector. When the coding mode is bidirectional predictive coding
or backward predictive coding, the motion compensation decoding
unit 205 obtains a backward motion compensation image from the
reference picture memory 207 on the basis of the determined
reference picture and the backward motion vector. Then, the
motion compensation decoding unit 205 generates a motion
compensation image (prediction picture) using the forward motion
compensation image and the backward motion compensation image.
The motion compensation image so generated is output to the
addition unit 208. The addition unit 208 adds the inputted
prediction error image and motion compensation image to generate
a decoded image. The decoded image so generated is output
through the switch 210 to the reference picture memory 207.
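The generation of the decoded image can be sketched as follows. The description states only that the motion compensation image is generated using the forward and backward motion compensation images, so the rounded average used below is an assumption, as are the function names, the clipping to an 8-bit range, and the representation of images as nested lists of integers.

```python
def bidirectional_prediction(forward, backward):
    """Combine two motion compensation images by a rounded average (assumed rule)."""
    return [[(f + b + 1) // 2 for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(forward, backward)]


def reconstruct(prediction, prediction_error):
    """Add the prediction error to the prediction, clipping to 0..255 (assumed)."""
    return [[max(0, min(255, p + e)) for p, e in zip(p_row, e_row)]
            for p_row, e_row in zip(prediction, prediction_error)]


forward_image = [[100, 102]]
backward_image = [[104, 103]]
error_image = [[-2, 200]]

prediction = bidirectional_prediction(forward_image, backward_image)
print(prediction)
print(reconstruct(prediction, error_image))
```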
When all of the blocks in the picture B13 have been decoded,
the memory control unit 264 updates the logical memory numbers
and the reference picture indices corresponding to the pictures
stored in the reference picture memory 207. Since the picture
B13 is forward the picture P15 stored in the reference picture
memory 207 in the order of display times and it is later than the
picture P12 stored in the reference picture memory 207, the
picture B13 is stored in the memory area in which the logical
memory number (1) is set.
Further, since information indicating that the picture B11 is to
be deleted from the reference picture memory is described in the
bit stream, the memory control unit 264 controls the reference
picture memory 207 so as to delete the picture B11 from the
memory.
Further, the logical memory number of the memory area where
the other reference candidate picture P12 is stored is
incremented by 1. Thereby, the pictures P15, B13, P12, P9, P6,
and P3 are stored in the memory areas in which the logical memory
numbers (0)~(5) are set, respectively, and the reference picture
indices [0], [1], [2], [3], and [4] are assigned to the pictures
B13, P12, P9, P6, and P3, respectively.
Since the operations of the bit stream analysis unit 201,
the mode decoding unit 263, and the prediction error decoding
unit 202 are identical to those described for decoding of the
picture P15, repeated description is not necessary.
The motion compensation decoding unit 205 generates a motion
compensation image from the inputted information such as the
motion vector. The information inputted to the motion
compensation decoding unit 205 is the motion vector and the
reference picture index. The picture B14 has been coded using
the pictures B13, P12, P9, P6, and P3 as candidate pictures for
forward reference, and the picture P15 as a candidate picture for
backward reference. At decoding of the picture B14, these
candidate pictures have already been decoded and are stored in
the reference picture memory 207.
When the coding mode is forward predictive coding or
bidirectional predictive coding, the motion compensation decoding
unit 205 determines which one of the candidate pictures B13, P12,
P9, P6, and P3 is used as a forward reference picture when coding
the picture B14, on the basis of the reference picture indices.
Then, the motion compensation decoding unit 205 obtains a forward
motion compensation image from the reference picture memory 207
on the basis of the determined reference picture and the forward
motion vector. When the coding mode is bidirectional predictive
coding or backward predictive coding, the motion compensation
decoding unit 205 obtains a backward motion compensation image
from the reference picture memory 207 on the basis of the
determined reference picture and the backward motion vector.
Then, the motion compensation decoding unit 205 generates a
motion compensation image, using the forward motion compensation
image and the backward motion compensation image.
The motion compensation image so generated is output to the
addition unit 208. The addition unit 208 adds the inputted
prediction error image and motion compensation image to generate
a decoded image. The decoded image so generated is output
through the switch 210 to the reference picture memory 207.
When all of the blocks in the picture B14 have been decoded,
the memory control unit 264 updates the logical memory numbers
and the reference picture indices corresponding to the pictures
stored in the reference picture memory 207. Since the picture
B14 is forward the picture P15 stored in the reference picture
memory 207 in the order of display times and it is later than the
picture B13 stored in the reference picture memory 207, the picture
B14 is stored in the memory area in which the logical memory
number (1) is set. Further, since information indicating that
the picture B13 is to be deleted from the reference picture
memory is described in the bit stream, the memory control unit
264 controls the reference picture memory 207 so as to delete the
picture B13 from the memory.
Since the next target picture to be decoded is the picture
P18 that is a P picture, reference picture indices are assigned
to pictures other than B pictures. Thereby, the pictures P15,
B14, P12, P9, and P6 are stored in the memory areas in which the
logical memory numbers (0)~(4) are set, respectively, and the
reference picture indices [0], [1], [2], and [3] are
assigned to the pictures P15, P12, P9, and P6, respectively.
Furthermore, the decoded pictures are outputted from the
reference picture memory 207, as output images arranged in order
of display times.
Thereafter, the subsequent pictures are similarly decoded
according to the picture type.
As described above, according to the sixth embodiment,
reference picture indices (i.e., information for identifying
which candidate picture is referred to in detecting a forward
motion vector of a target block) are assigned to plural candidate
pictures for forward reference for a target picture to be decoded
such that a smaller reference picture index is assigned to a
candidate picture whose display time is later, and a
reference picture is determined from among the plural candidate
pictures on the basis of the reference picture indices included
in the bit stream of the target picture. Therefore, a smaller
reference picture index is assigned to a candidate picture that
is more likely to be used as a reference picture. Accordingly,
it is possible to correctly decode a bit stream which is obtained
by a highly-efficient coding method that can minimize the amount
of codes corresponding to the reference picture indices.
Further, since, in decoding a P picture, no reference
picture indices are assigned to B pictures, it is possible to
correctly decode a bit stream which is obtained by a highly-
efficient coding method that can avoid occurrence of reference
picture indices which will never be used.
Furthermore, since, in decoding a B picture, a picture
stored in a memory area on which a smallest logical memory number
is set is used as a backward reference picture and no reference
picture index is assigned to this picture, it is possible to
correctly decode a bit stream which is obtained by a highly-
efficient coding method that can prevent a P picture from being
used as a forward reference picture in predictive coding of a B
picture.
Moreover, when information indicating that a picture which
will never be used as a reference picture is deleted from the
reference picture memory, is described in the bit stream, the
reference picture is deleted from the reference picture memory
according to the information, whereby the reference picture
memory can be effectively used.
Further, in this sixth embodiment, an arrangement of pictures
in which two B pictures are placed between adjacent P pictures
is described as an arrangement of plural pictures constituting a
moving picture. However, the number of B pictures placed between
adjacent P pictures may be other than two, for example, it may be
three or four.
Further, while in this sixth embodiment four pictures are
used as candidate pictures for forward reference for a P picture,
the number of forward reference candidate pictures for a P
picture may be other than four.
While in this sixth embodiment four P pictures and one B
picture are used as candidate pictures for forward reference for
a B picture, forward reference candidate pictures for a B picture
are not restricted thereto.
While in this sixth embodiment each of plural pictures
constituting a moving picture is used as a reference picture when
decoding another picture that follows this picture, plural
pictures constituting a moving picture, which are to be decoded,
may include pictures which will never be used as reference
pictures. In this case, the pictures useless as reference
pictures are not stored in the reference picture memory, whereby
the same effects as described for the sixth embodiment can be
achieved.
While in this sixth embodiment decoding of a B picture is
carried out using another B picture as a reference candidate
picture, decoding of a B picture may be carried out without
referring to another B picture. In this case, no B pictures are
stored in the reference picture memory. Also in this case, the
same effects as described for the sixth embodiment can be
achieved by assigning reference picture indices according to the
order of picture display times.
While in this sixth embodiment, for simplification, a memory
for managing reference candidate pictures, and a memory for
rearranging decoded pictures in display order to output them are
not separated but described as a single reference picture memory,
the moving picture decoding apparatus 60 may be provided with a
management memory for managing reference candidate pictures, and
a rearrangement memory for rearranging decoded pictures in
display order, respectively.
In this case, the management memory is controlled by the
memory control unit 264, and outputs reference candidate pictures
to the motion compensation decoding unit 205. Further, the
rearrangement memory rearranges the decoded pictures arranged in
decoding order, in display order, and sequentially outputs the
pictures.
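The operation of such a rearrangement memory can be sketched as follows. The function name and the string representation of pictures are assumptions made for this illustration. The rule used here holds only when B pictures are displayed between the two I or P pictures decoded before them, as in the picture arrangement of this embodiment: a B picture is output as soon as it is decoded, whereas an I or P picture is held until the next I or P picture arrives.

```python
def reorder_for_display(decoding_order):
    """Rearrange pictures given in decoding order into display order."""
    held = None      # the most recently decoded I or P picture, not yet output
    output = []
    for picture in decoding_order:
        if picture.startswith("B"):
            # A B picture precedes the held I or P picture in display order.
            output.append(picture)
        else:
            # A new I or P picture releases the one held so far.
            if held is not None:
                output.append(held)
            held = picture
    if held is not None:
        output.append(held)  # flush at the end of the sequence
    return output


print(reorder_for_display(["P12", "B10", "B11", "P15", "B13", "B14"]))
```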
Further, in this sixth embodiment, assignment of reference
picture indices to candidate pictures is carried out according to
a single rule, i.e., one system of reference picture indices is
used. However, two systems of reference picture indices may be
used, as described for the fifth embodiment.
[Embodiment 7]
Figure 33 is a block diagram for explaining a moving picture
coding apparatus 70 according to a seventh embodiment of the
present invention.
This moving picture coding apparatus 70 is different from
the moving picture coding apparatus 10 according to the first
embodiment in candidate pictures for forward reference pictures
to be referred to when coding a P picture and a B picture, and
coding modes for a B picture.
That is, the moving picture coding apparatus 70 is provided
with, instead of the control unit 110 and the mode selection unit
109 according to the first embodiment, a coding control unit 170
and a mode selection unit 179 which operate in different manners
from those described for the first embodiment.
To be specific, the coding control unit 170 according to the
seventh embodiment controls a reference picture memory 117 such
that, when coding a P picture, three pictures (I or P pictures)
which are positioned forward the P picture are used as candidate
pictures for forward reference, and when coding a B picture, two
pictures (I or P pictures) which are positioned forward the B
picture, a forward B picture that is closest to the B picture,
and a backward I or P picture are used as candidate pictures.
However, a B picture, which is positioned forward an I or P
picture that is positioned forward and closest to the target
picture, is not referred to.
The coding control unit 170 controls the bit stream
generation unit 104 with a control signal Cd so that a flag
indicating whether or not a target picture is to be referred to
when coding subsequent pictures is inserted in the bit stream.
To be specific, the bit stream generation unit 104 is controlled with
the control signal Cd so that information indicating that data of
the target picture should be stored in the reference picture
memory 117 at decoding as well as information indicating a period
of time for the storage are added to the bit stream.
Furthermore, when coding a block (target block) in a P
picture, the mode selection unit 179 according to the seventh
embodiment selects, as a coding mode for the target block, one
from among plural modes as follows: intra-picture coding, inter-
picture predictive coding using a motion vector, and inter-
picture predictive coding using no motion vector (a motion is
treated as zero). When coding a block (target block) in a B
picture, the mode selection unit 179 selects, as a coding mode
for the target block, one from among plural modes as follows:
intra-picture coding, inter-picture predictive coding using a
forward motion vector, inter-picture predictive coding using a
backward motion vector, and inter-picture predictive coding using
a forward motion vector and a backward motion vector. That is,
the mode selection unit 179 of the moving picture coding
apparatus 70 according to this seventh embodiment is different
from the mode selection unit 109 of the moving picture coding
apparatus 10 according to the first embodiment only in that it
does not use the direct mode, and therefore, the moving picture
coding apparatus 70 does not have the motion vector storage unit
116 of the moving picture coding apparatus 10. Other
constituents of the moving picture coding apparatus 70 according
to the seventh embodiment are identical to those of the moving
picture coding apparatus 10 according to the first embodiment.
The moving picture coding apparatus 70 according to the
seventh embodiment is different from the moving picture coding
apparatus 50 according to the fifth embodiment in that the coding
control unit 170 controls the bit stream generation unit 104 so
that a flag indicating whether or not a target picture is to be
referred to when coding subsequent pictures is inserted in the
bit stream. To be specific, the bit stream generation unit 104 is
controlled with the control signal Cd so that a flag indicating
whether or not a target picture is to be referred to when coding
subsequent pictures is inserted in the bit stream corresponding
to the target picture. Further, the moving picture coding
apparatus 70 is different from the moving picture coding
apparatus 50 in candidate pictures to be referred to in coding a
P picture and a B picture. The moving picture coding apparatus
70 is identical to the moving picture coding apparatus 50 in
aspects other than those mentioned above.
Next, the operation of the moving picture coding apparatus
70 will be described.
Input image data Id are stored into the input picture memory
101, in units of pictures, in order of time.
Figure 34(a) shows the order of pictures inputted to the
input picture memory 101.
As shown in figure 34(a), the respective pictures are
successively inputted to the input picture memory 101, starting
from a picture P1. In figure 34(a), pictures P1, P4, P7, P10,
P13, P16, P19, and P22 are P pictures while pictures B2, B3, B5,
B6, B8, B9, B11, B12, B14, B15, B17, B18, B20, and B21 are B
pictures.
When coding a P picture, three pictures (I or P pictures)
which are timewise forward and close to the P picture are used as
candidates for a reference picture. Further, when coding a B
picture, two pictures (I or P pictures) which are timewise
forward and close to the B picture, one B picture that is forward
and closest to the B picture, and an I or P picture that is
backward the B picture, are used as candidates for a reference
picture. However, in coding a B picture, a B picture which is
positioned forward an I or P picture that is timewise forward and
closest to the B picture is not referred to. When coding an I
picture, other pictures are not referred to.
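The candidate selection rule described above can be sketched as follows. This is only an illustrative sketch under stated assumptions and is not part of the apparatus: the function name reference_candidates and the representation of each picture as a string such as "P13" (picture type followed by display position) are hypothetical and introduced here for explanation.

```python
def reference_candidates(display_order, target):
    """Return (forward, backward) candidate lists for `target`.

    display_order: picture names such as "P13" or "B11", in display order.
    The names and this string representation are assumptions of the sketch.
    """
    pos = display_order.index(target)
    before = display_order[:pos]
    after = display_order[pos + 1:]
    anchors_before = [p for p in before if p[0] in "IP"]

    if target[0] == "I":
        return [], []                      # an I picture refers to no picture
    if target[0] == "P":
        return anchors_before[-3:], []     # three closest forward I or P pictures

    # B picture: two closest forward I or P pictures ...
    forward = anchors_before[-2:]
    # ... plus the closest forward B picture, but only if it is not
    # positioned forward the closest forward I or P picture.
    start = display_order.index(anchors_before[-1]) + 1 if anchors_before else 0
    recent_b = [p for p in display_order[start:pos] if p[0] == "B"]
    if recent_b:
        forward.append(recent_b[-1])
    # One backward I or P picture closest to the target.
    backward = [p for p in after if p[0] in "IP"][:1]
    return forward, backward


# Display order of figure 34(a): P1, B2, B3, P4, ..., B21, P22.
DISPLAY_ORDER = [("P" if n % 3 == 1 else "B") + str(n) for n in range(1, 23)]

print(reference_candidates(DISPLAY_ORDER, "P13"))
print(reference_candidates(DISPLAY_ORDER, "B11"))
print(reference_candidates(DISPLAY_ORDER, "B12"))
```

With this sketch, the picture B11 obtains no B picture as a candidate, because the B picture B9 is positioned forward the picture P10, while the picture B12 obtains the picture B11.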
The data Id of the respective pictures inputted to the input
picture memory 101 are rearranged in coding order. Hereinafter,
the data of each picture is referred to simply as a picture.
That is, the process of changing the order of the pictures
from input order to coding order is carried out on the basis of
the relationships between target pictures and reference pictures
in inter-picture predictive coding. In the rearrangement, the
respective pictures are rearranged so that a second picture to be
used as a candidate for a reference picture in coding a first
picture is coded prior to the first picture.
To be specific, the correspondences between the pictures P10
to P13 and the reference candidate pictures are shown by arrows in
figure 34(a). That is, when coding the P picture P10, the
pictures P1, P4, and P7 are referred to, and when coding the P
picture P13, the pictures P4, P7, and P10 are referred to.
Further, when coding the B picture B11, the pictures P7, P10, and
P13 are referred to, and when coding the B picture B12, the
pictures P7, P10, B11, and P13 are referred to.
Figure 34(b) shows the order of the pictures after
rearranging the pictures B2 to P22 shown in figure 34(a). After
the rearrangement, the respective pictures are arranged in order
of P4, B2, B3, P7, B5, B6, P10, B8, B9, P13, B11, B12, P16, B14,
B15, P19, B17, B18, and P22.
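The rearrangement from input order to coding order can be sketched as follows. This is a minimal sketch, assuming that every B picture is simply held back until the next I or P picture in display order has been placed; the function name coding_order and the handling of trailing B pictures are assumptions made for illustration.

```python
def coding_order(display_order):
    """Rearrange pictures so that each backward reference candidate
    (the next I or P picture) is coded before the B pictures that use it."""
    ordered, pending_b = [], []
    for pic in display_order:
        if pic[0] == "B":
            pending_b.append(pic)      # hold B pictures back
        else:
            ordered.append(pic)        # the I or P picture goes first
            ordered.extend(pending_b)  # then the held B pictures, in display order
            pending_b = []
    # Assumption of this sketch: B pictures with no following I or P
    # picture are appended at the end.
    ordered.extend(pending_b)
    return ordered


DISPLAY_ORDER = [("P" if n % 3 == 1 else "B") + str(n) for n in range(1, 23)]
print(coding_order(DISPLAY_ORDER))
```

Because the two B pictures between adjacent P pictures keep their display order, the picture B11 is coded before the picture B12 and is therefore available as a forward reference candidate for the picture B12.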
The respective pictures rearranged in the input picture
memory 101 are successively read out, for each predetermined data
processing unit, in order of coding times. In this seventh
embodiment, the data processing unit is a unit of data on which
motion compensation is carried out and, more specifically, it is
a rectangular image space (macroblock) in which 16 pixels are
arranged in both the horizontal direction and the vertical
direction. In the following description, a macroblock is simply
referred to as a block.
Hereinafter, coding processes for the pictures P13, B11, and
B12 will be described in this order.
Since the picture P13 is a P picture, inter-picture
predictive coding using forward reference is carried out as a
coding process for the picture P13. In this case, three I or P
pictures which are positioned forward the target picture (picture
P13) are used as reference candidate pictures, and specifically,
the pictures P4, P7, and P10 are used. These reference candidate
pictures have already been coded, and the corresponding
decoded image data Dd are stored in the reference picture memory
117.
In coding a P picture, the coding control unit 170 controls
the respective switches so that the switches 113, 114, and 115
are turned ON.
Data Md corresponding to a block in the picture P13, which
is read from the input picture memory 101, is input to the motion
vector detection unit 108, the mode selection unit 179, and the
difference calculation unit 102.
The motion vector detection unit 108 detects the motion
vector MV of the block in the picture P13, using the decoded
image data Rd of the pictures P4, P7, and P10 stored in the
reference picture memory 117. In this case, an optimum picture
is selected from among the pictures P4, P7, and P10, and detection
of the motion vector is carried out with reference to the
selected picture. Then, the detected motion vector MV is output
to the mode selection unit 179 and the bit stream generation unit
104. Further, information indicating which one of the pictures
P4, P7, and P10 is referred to in detecting the motion vector MV
(reference picture information) is also output to the mode
selection unit 179.
The mode selection unit 179 determines a coding mode for the
block in the picture P13, using the motion vector detected by the
motion vector detection unit 108.
To be specific, in the case of coding a P picture, a coding
mode is selected from among the following coding modes: intra-
picture coding, inter-picture predictive coding using a motion
vector, and inter-picture predictive coding using no motion
vector (i.e., motion is regarded as 0). In determining a coding
mode, generally, a coding mode which minimizes coding errors when
a predetermined amount of bits is given to the block as an amount
of codes, is selected.
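The mode decision described above can be sketched as follows. This is an illustrative sketch only: the mode names, the function name select_mode, and the error values are hypothetical, and the manner in which the coding error of each mode is measured for the predetermined amount of bits is not specified here and is left outside the sketch.

```python
# Coding modes available in the seventh embodiment (no direct mode).
# The string labels are assumptions introduced for this sketch.
MODES = {
    "I": ("intra",),
    "P": ("intra", "inter_mv", "inter_zero_mv"),
    "B": ("intra", "forward", "backward", "bidirectional"),
}


def select_mode(picture_type, error_at_budget):
    """Pick the allowed mode with the smallest coding error.

    error_at_budget maps a mode label to the coding error obtained when
    the predetermined amount of bits is given to the block.
    """
    allowed = {m: e for m, e in error_at_budget.items()
               if m in MODES[picture_type]}
    return min(allowed, key=allowed.get)


# Hypothetical error values for one block of a P picture.
print(select_mode("P", {"intra": 9.0, "inter_mv": 2.5, "inter_zero_mv": 4.0}))
```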
The coding mode Ms determined by the mode selection unit 179
is output to the bit stream generation unit 104. Further, when
the determined coding mode Ms is the coding mode which performs
forward reference, information indicating which one of the
pictures P4, P7, and P10 is referred to in detecting the forward
motion vector (reference picture information) is also output to the bit
stream generation unit 104.
Then, prediction image data Pd, which is obtained from the
reference picture according to the coding mode Ms that is
determined by the mode selection unit 179, is output to the
difference calculation unit 102 and the addition unit 106.
However, when the intra-picture coding mode is selected, no
prediction image data Pd is outputted. Further, when the intra-
picture coding is selected, the switches 111 and 112 are
controlled in the same manner as described for the fifth
embodiment.
Hereinafter, a description will be given of a case where the
inter-picture predictive coding mode is selected as the coding
mode Ms.
The difference calculation unit 102, the prediction error
coding unit 103, the bit stream generation unit 104, the
prediction error decoding unit 105, and the coding control unit
170 are identical to those described for the fifth embodiment.
However, in this seventh embodiment, information indicating
that the picture P13 is coded using three forward I or P pictures
as reference candidate pictures, is added as header information
of the picture P13. Further, since the picture P13 will be
referred to when coding another picture, information (flag)
indicating that decoded data Dd corresponding to the picture P13
should be stored in the reference picture memory 117 at decoding,
is also added as header information of the picture P13. Further,
information indicating that the picture P13 should be stored in
the reference picture memory until decoding of the picture P22 is
completed, is also added as header information of the picture P13.
The storage period for the picture P13 may be indicated by
time information of the picture P22 (e.g., time-basis positional
information such as a picture number, decoding time information,
or display time information), or period information from the
picture P13 to the picture P22 (e.g., the number of pictures).
The header information described above may be described as header
information in picture units, i.e., as header information for
every target picture to be coded. Alternatively, it may be
described as header information of the entire sequence, or as
header information in units of frames (e.g., in units of GOPs in
MPEG).
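The header information for the picture P13 can be pictured with the following sketch. The class name PictureHeader, its field names, and the function storage_period_in_pictures are hypothetical and do not define any bit stream syntax; in particular, counting the storage period in coding order is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PictureHeader:
    picture: str                       # target picture, e.g. "P13"
    forward_candidates: int            # number of forward reference candidates
    store_as_reference: bool           # flag: store decoded data at decoding
    store_until: Optional[str] = None  # picture whose decoding ends the storage


def storage_period_in_pictures(order: List[str],
                               header: PictureHeader) -> Optional[int]:
    """Express the storage period as a number of pictures (assumed to be
    counted in coding order) instead of as the name of the last picture."""
    if not header.store_as_reference or header.store_until is None:
        return None
    return order.index(header.store_until) - order.index(header.picture)


# Coding order of figure 34(b).
CODING_ORDER = ["P4", "B2", "B3", "P7", "B5", "B6", "P10", "B8", "B9", "P13",
                "B11", "B12", "P16", "B14", "B15", "P19", "B17", "B18", "P22"]

p13 = PictureHeader("P13", forward_candidates=3,
                    store_as_reference=True, store_until="P22")
print(storage_period_in_pictures(CODING_ORDER, p13))
```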
When the coding mode for each block in the picture P13 is
one performing forward reference, information indicating which
one of the pictures P4, P7, and P10 is referred to in detecting
the forward motion vector (reference picture information) is also
added to the bit stream. For example, when the motion vector is
obtained with reference to the picture P10, information
indicating that the P picture just previous to the target picture
is used as a reference picture (reference picture index) is added
to the bit stream. When the motion vector is obtained with
reference to the picture P7, information indicating that the P
picture two-pictures previous to the target picture is used as a
reference picture (reference picture index) is added to the bit
stream. When the motion vector is obtained with reference to the
picture P4, information indicating that the P picture three-
pictures previous to the target picture is used as a reference
picture (reference picture index) is added to the bit stream.
For example, a reference picture index [0] may be used to
indicate that the P picture just previous to the target picture
is used as a reference picture, a reference picture index [1] may
be used to indicate that the P picture two-pictures previous to
the target picture is used as a reference picture, and a
reference picture index [2] may be used to indicate that the P
picture three-pictures previous to the target picture is used as
a reference picture.
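The assignment of reference picture indices in this example can be sketched as follows. This is a minimal sketch; the function name assign_reference_indices is an assumption, and the sketch covers only the case described here, in which the candidate closest to the target picture receives the index [0].

```python
def assign_reference_indices(forward_candidates):
    """Map reference picture indices to candidates, closest picture first.

    forward_candidates: candidate pictures in display order, oldest first.
    """
    return {index: pic
            for index, pic in enumerate(reversed(forward_candidates))}


# Candidates for the picture P13 are P4, P7, and P10.
print(assign_reference_indices(["P4", "P7", "P10"]))
```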
Further, information indicating that the P picture is
subjected to inter-picture predictive coding using three
reference candidate pictures is described as header information.
The remaining macroblocks in the picture P13 are coded in
like manner as described above. When all of the macroblocks in
the picture P13 have been coded, coding of the picture B11 takes
place.
Since the picture B11 is a B picture, inter-picture
predictive coding using bidirectional reference is carried out as
a coding process for the picture B11. In this case, two pictures
(I or P pictures) which are timewise close to the target picture
(picture B11) and a B picture which is timewise closest to the
target picture are used as candidate pictures for forward
reference, and an I or P picture which is timewise closest to the
target picture is used as a candidate picture for backward
reference. However, a B picture which is positioned beyond an I
or P picture closest to the target picture is never referred
to.
Accordingly, the pictures P7 and P10 are used as forward
reference pictures for the picture B11, and the picture P13 is
used as a backward reference picture for the picture B11. In
processing the first of two consecutive B pictures,
since this first B picture is used as a reference picture in
coding the other B picture, the coding control unit 170 controls
the respective switches so that the switches 113, 114, and 115
are turned ON. Accordingly, the image data Md corresponding to
the block in the picture B11, which is read from the input
picture memory 101, is input to the motion vector detection unit
108, the mode selection unit 179, and the difference calculation
unit 102.
The motion vector detection unit 108 detects a forward
motion vector and a backward motion vector corresponding to the
target block in the picture B11, with reference to the pictures
P7 and P10 stored in the reference picture memory 117, as
candidate pictures for forward reference, and the picture P13
stored in the reference picture memory 117, as a backward
reference picture. In this case, either the picture P7 or the
picture P10 is selected as a most suitable reference picture, and
detection of a forward motion vector is carried out according to
the selected picture. The detected motion vectors are output to
the mode selection unit 179 and the bit stream generation unit
104. Further, information indicating which one of the pictures
P7 and P10 is referred to in detecting the forward motion vector
(reference picture information) is also output to the mode
selection unit 179.
The mode selection unit 179 determines a coding mode for the
target block in the picture B11, using the motion vectors
detected by the motion vector detection unit 108. As a coding
mode for the B picture, one of the following coding modes is
selected: intra-picture coding mode, inter-picture predictive
coding mode using a forward motion vector, inter-picture
predictive coding mode using a backward motion vector, and
inter-picture predictive coding mode using bidirectional motion
vectors. Also in this case, generally, a coding mode which
minimizes coding errors when a predetermined amount of bits is
given as the amount of codes is selected.
The coding mode determined by the mode selection unit 179 is
output to the bit stream generation unit 104. Further,
prediction image data Pd, which is obtained from the reference
picture according to the coding mode Ms that is determined by the
mode selection unit 179, is output to the difference calculation
unit 102 and the addition unit 106. However, when the intra-
picture coding mode is selected by the mode selection unit 179,
no prediction image data Pd is outputted. Further, when the
intra-picture coding is selected, the switches 111 and 112 are
controlled in the same manner as described for the coding process
of the picture P13.
Hereinafter, a description will be given of a case where the
inter-picture predictive coding is selected by the mode selection
unit 179.
In this case, the operations of the difference calculation
unit 102, the prediction error coding unit 103, the bit stream
generation unit 104, the prediction error decoding unit 105, and
the coding control unit 170 are identical to those described for
the fifth embodiment.
When the coding mode is one performing forward reference,
information indicating which one of the pictures P7 and P10 is
referred to in detecting the forward motion vector (reference
picture information) is also added to the bit stream. For
example, when picture P10 is referred to, reference picture
information indicating that a candidate picture just previous to
the target picture is used as a reference picture is added to the
bit stream. When the picture P7 is referred to, reference
picture information indicating that a candidate picture two-
pictures previous to the target picture is used as a reference
picture is added to the bit stream. For example, a reference
picture index [0] may be used to indicate that a candidate
picture just previous to the target picture is used as a
reference picture, and a reference picture index [1] may be used
to indicate that a candidate picture two-pictures previous to the
target picture is used as a reference picture.
Further, in this case, information indicating that the
target B picture is subjected to inter-picture predictive coding
using a forward B picture as a reference picture is not added as
header information. Furthermore, information indicating that the
forward reference candidate pictures for the target B picture are
two I or P pictures and one B picture is added as header
information. Moreover, information indicating that a B picture,
which is positioned forward an I or P picture that is positioned
forward and closest to the target B picture, is not referred to
is added as header information.
Thereby, it is possible to know the capacity of a reference
picture memory that is needed in decoding the bit stream Bs
generated in the moving picture coding apparatus 70 according to
the seventh embodiment. The header information described above
may be described as header information in units of pictures, i.e.,
as header information for every target picture to be coded.
Alternatively, it may be described as header information of the
entire sequence, or as header information in units of several
pictures (e.g., in units of GOPs in MPEG).
Further, since the picture B11 is used as a reference
picture when coding a picture positioned backward the picture B11,
information indicating that decoded image data Dd corresponding
to the picture B11 should be stored in the reference picture
memory 117 at decoding, is also added as header information.
Further, information indicating that the data Dd should be stored
in the reference picture memory 117 until decoding of the picture
B12 is completed, is also added as header information.
When all of the remaining blocks in the picture B11 have
been coded, coding of the picture B12 takes place.
(Coding Process for Picture B12)
Since the picture B12 is a B picture, inter-picture
predictive coding using bidirectional reference is carried out as
a coding process for the picture B12. In this case, two I or P
pictures which are timewise close to the target picture B12, and
a B picture which is timewise closest to the target picture B12
are used as candidate pictures for forward reference. Further,
an I or P picture which is timewise closest to the target picture
B12 is used as a candidate picture for backward reference. To be
specific, the pictures P7, P10, and B11 are used as candidate
pictures for forward reference for the picture B12, and the
picture P13 is used as a backward reference picture for the
picture B12.
Since the picture B12 is not used as a reference picture
when coding another picture, the coding control unit 170 controls
the respective switches with the control signal Cs1 so that the
switch 113 is turned ON and the switches 114 and 115 are turned
OFF. Accordingly, the image data Md corresponding to the block
in the picture B12, which is read from the input picture memory
101, is input to the motion vector detection unit 108, the mode
selection unit 179, and the difference calculation unit 102.
The motion vector detection unit 108 detects a forward
motion vector and a backward motion vector corresponding to the
macroblock in the picture B12, with reference to the pictures P7,
P10, and B11 stored in the reference picture memory 117, as
forward reference pictures, and the picture P13 stored in the
reference picture memory 117, as a backward reference picture.
In this case, a most suitable reference picture is selected
from among the pictures P7, P10, and B11, and detection of a
forward motion vector is carried out according to the selected
picture. The detected motion vectors are output to the mode
selection unit 179 and the bit stream generation unit 104.
Further, information indicating which one of the pictures P7, P10,
and B11 is referred to in detecting the forward motion vector
(reference picture information) is also output to the mode
selection unit 179.
The mode selection unit 179 determines a coding mode for the
block in the picture B12, using the motion vectors detected by
the motion vector detection unit 108. As a coding mode for the B
picture, one of the following coding modes is selected: intra-
picture coding mode, inter-picture predictive coding mode using a
forward motion vector, inter-picture predictive coding mode using
a backward motion vector, and inter-picture predictive coding
mode using bidirectional motion vectors.
The coding mode Ms determined by the mode selection unit 179
is output to the bit stream generation unit 104. Further,
prediction image data Pd, which is obtained from the reference
picture according to the coding mode that is determined by the
mode selection unit 179, is output to the difference calculation
unit 102 and the addition unit 106. However, when the intra-
picture coding mode is selected, no prediction image data Pd is
output.
Further, when the intra-picture coding mode is selected by
the mode selection unit 179, the switches 111 and 112 are
controlled in the same manner as described for the coding process
of the picture P13.
Hereinafter, a description will be given of a case where the
inter-picture predictive coding mode is selected by the mode
selection unit 179.
In this case, the operations of the difference calculation
unit 102, the prediction error coding unit 103, the bit stream
generation unit 104, the prediction error decoding unit 105, and
the coding control unit 170 are identical to those described for
the fifth embodiment.
When the coding mode is one performing forward reference,
information indicating which one of the pictures P7, P10, and B11
is referred to in detecting the forward motion vector (reference
picture information) is also added to the bit stream.
Further, information indicating that the target B picture
B12 is subjected to inter-picture predictive coding using a
forward B picture B11 as a candidate for a reference picture is
described as header information. Furthermore, information
indicating that the candidate pictures for forward reference are
two I or P pictures and one B picture is described as header
information.
Moreover, information indicating that the picture B12 is not
to be used as a reference picture when coding the following
pictures is added as header information.
Thereby, it is easily determined that there is no necessity
to store the decoded image data Dd corresponding to the picture
B12 in the reference picture memory at decoding, whereby
management of the reference picture memory is facilitated.
The above-mentioned header information may be described as
header information in units of pictures, i.e., as header
information for every target picture to be coded. Alternatively,
it may be described as header information of the entire sequence,
or as header information in units of several pictures (e.g., in
units of GOPs in MPEG).
The remaining blocks in the picture B12 are coded in the
same manner as described above.
Thereafter, the image data corresponding to the respective
pictures following the picture B12 are coded in like manner as
described above according to the picture type. For example, P
pictures are processed like the picture P13, and the first B
picture of two consecutive B pictures (picture B14, B17, or the
like) is processed like the picture B11. Further, the second B
picture of two consecutive B pictures (picture B15, B18, or the
like) is processed like the picture B12.
As described above, in the moving picture coding apparatus
70 according to the seventh embodiment, when coding a B picture
as a target picture, since a B picture is also used as a
candidate picture for forward reference as well as P pictures, a
picture that is positioned forward and closest to the
target picture can be used as a forward reference picture.
Thereby, prediction accuracy in motion compensation for a B
picture can be increased, resulting in enhanced coding efficiency.
Moreover, when coding a B picture as a target picture,
information indicating whether or not the target picture is to be
used as a reference picture when coding (decoding) another
picture is added as header information. Further, when the target
picture is used as a reference picture when coding (decoding)
another picture, information indicating a period during which the
target picture should be stored in the reference picture memory
is added. Therefore, when decoding the bit stream Bs outputted
from the moving picture coding apparatus 70, the decoding end can
easily know which picture should be stored in the picture memory
and how long the storage period is, whereby management of the
reference picture memory at decoding is facilitated.
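How the decoding end might use the flag and the storage period can be sketched as follows. This is an illustrative sketch and not the memory control of any apparatus described herein: the function name simulate_reference_memory is hypothetical, and the storage periods listed for pictures other than P13, B11, and B12 are assumed by analogy with those pictures.

```python
def simulate_reference_memory(headers):
    """Track which pictures are held in the reference picture memory.

    headers: (picture, store_flag, store_until) tuples in decoding order.
    Returns, for each picture, the pictures available while it is decoded.
    """
    memory = []        # (picture, store_until) pairs currently stored
    available = {}
    for picture, store_flag, store_until in headers:
        available[picture] = [name for name, _ in memory]
        if store_flag:
            memory.append((picture, store_until))
        # Release every picture whose storage period ends with this picture.
        memory = [(n, until) for n, until in memory if until != picture]
    return available


# Storage periods for P13, B11, and B12 follow the description above;
# the remaining entries are assumptions made by analogy.
HEADERS = [
    ("P1", True, "P10"), ("P4", True, "P13"), ("B2", True, "B3"),
    ("B3", False, None), ("P7", True, "P16"), ("B5", True, "B6"),
    ("B6", False, None), ("P10", True, "P19"), ("B8", True, "B9"),
    ("B9", False, None), ("P13", True, "P22"), ("B11", True, "B12"),
    ("B12", False, None),
]

available = simulate_reference_memory(HEADERS)
print(available["B12"])
print(max(len(pics) for pics in available.values()))
```

In this sketch, at most four pictures are held at any time, which corresponds to the three forward reference candidates and one backward reference candidate of the second B picture.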
In this seventh embodiment, when a target B picture is coded
using another B picture as a reference picture, this is described
as header information of the target B picture. However, the
header information is not necessarily described in picture units.
It may be described as header information of the entire sequence,
or as header information in units of several pictures (e.g., GOP
in MPEG).
In this seventh embodiment, motion compensation is performed
in units of macroblocks each comprising 16 pixels (horizontal
direction) × 16 pixels (vertical direction), and coding of
prediction error image data is performed in units of blocks each
comprising 4 pixels (horizontal direction) × 4 pixels (vertical
direction), or in units of blocks each comprising 8 pixels
(horizontal direction) × 8 pixels (vertical direction). However,
motion compensation and coding of prediction error image data may
be carried out in units of image spaces, each comprising
a different number of pixels from those mentioned above.
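The relationship between the macroblock and the blocks used for coding of prediction error image data can be checked with the following short sketch. The function name blocks_per_macroblock is an assumption introduced for illustration.

```python
MACROBLOCK_SIZE = 16  # pixels per side of a macroblock in this embodiment


def blocks_per_macroblock(block_size):
    """Number of square blocks of the given size that tile one macroblock."""
    if block_size <= 0 or MACROBLOCK_SIZE % block_size != 0:
        raise ValueError("block size must evenly divide the macroblock size")
    per_side = MACROBLOCK_SIZE // block_size
    return per_side * per_side


print(blocks_per_macroblock(4))  # blocks of 4 pixels by 4 pixels
print(blocks_per_macroblock(8))  # blocks of 8 pixels by 8 pixels
```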
Further, in this seventh embodiment, a coding mode for a P
picture is selected from among intra-picture coding mode, inter-
picture predictive coding mode using a motion vector, and inter-
picture predictive coding mode using no motion vector, while a
coding mode for a B picture is selected from among intra-picture
coding mode, inter-picture predictive coding mode using a forward
motion vector, inter-picture predictive coding mode using a
backward motion vector, and inter-picture predictive coding mode
using bidirectional motion vectors. However, selection of a
coding mode for a P picture or a B picture is not restricted to
that mentioned for the seventh embodiment.
Further, while this seventh embodiment employs an image
sequence in which two B pictures are inserted between an I
picture and a P picture or between adjacent P pictures, the
number of B pictures inserted between an I picture and a P
picture or between adjacent P pictures in an image sequence may
be other than two, for example, it may be three or four.
Furthermore, while in this seventh embodiment three pictures
are used as candidate pictures for forward reference when coding
a P picture, the number of forward reference candidate pictures
for a P picture is not restricted thereto.
Furthermore, while in this seventh embodiment two P pictures
and one B picture are used as candidate pictures for forward
reference when coding a B picture, forward reference candidate
pictures to be used in coding a B picture are not restricted
thereto. For example, forward reference candidate pictures for a
B picture may be one P picture and two B pictures, or two P
pictures and two B pictures, or three pictures which are timewise
closest to the target picture regardless of the picture type.
When, in coding a B picture, only one picture that is
closest to the target B picture is used as a reference picture,
it is not necessary to describe information indicating which
picture is referred to in coding a target block in the B picture
(reference picture information), in the bit stream.
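The reason why no reference picture information is needed in this case can be illustrated by the following sketch. It assumes, purely for illustration, that the reference picture index is written as a fixed-length binary code; the actual code used in the bit stream is not specified here and may be a variable-length code. The function name reference_index_bits is hypothetical.

```python
def reference_index_bits(num_candidates):
    """Bits needed to distinguish num_candidates pictures with a
    fixed-length binary code (an assumption of this sketch)."""
    if num_candidates < 1:
        raise ValueError("at least one candidate picture is required")
    return (num_candidates - 1).bit_length()


for n in (1, 2, 3):
    print(n, reference_index_bits(n))
```

With a single candidate picture the sketch yields zero bits, that is, the reference picture is already determined without any information in the bit stream.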
Further, in this seventh embodiment, when coding a B picture,
a B picture which is positioned forward an I or P picture that is
positioned forward and closest to the target B picture, is not
referred to. However, when coding a B picture, a B picture which
is positioned forward an I or P picture that is positioned
forward and closest to the target B picture, may be used as a
reference picture.
[Embodiment 8]
Figure 35 is a block diagram for explaining a moving picture
decoding apparatus 80 according to an eighth embodiment of the
present invention.
The moving picture decoding apparatus 80 according to the
eighth embodiment decodes the bit stream Bs outputted from the
moving picture coding apparatus 70 according to the seventh
embodiment.
The moving picture decoding apparatus 80 is different from
the moving picture decoding apparatus 20 according to the second
embodiment in candidate pictures for forward reference pictures
to be referred to when decoding a P picture and a B picture, and
decoding modes for a B picture.
That is, the moving picture decoding apparatus 80 is
provided with, instead of the memory control unit 204 and the
mode decoding unit 223 according to the second embodiment, a
memory control unit 284 and a mode decoding unit 283 which
operate in different manners from those described for the second
embodiment.
To be specific, the memory control unit 284 according to the
eighth embodiment controls a reference picture memory 287 such
that, when decoding a P picture, three pictures (I or P pictures)
which are positioned forward the P picture are used as candidate
pictures for forward reference, and when decoding a B picture,
two pictures (I or P pictures) which are positioned forward the B
picture, a forward B picture that is closest to the B picture,
and a backward I or P picture are used as candidate pictures.
However, a B picture which is positioned forward an I or P
picture that is positioned forward and closest to the target
picture, is not referred to.
The memory control unit 284 controls the reference picture
memory 287, with a control signal Cm, on the basis of a flag
indicating whether or not the target picture is to be referred to
in coding a picture that follows the target picture, which flag
is inserted in the code string corresponding to the target
picture.
To be specific, information (flag) indicating that the data
of the target picture should be stored in the reference picture
memory 287 at decoding, and information indicating a period
during which the data of the target picture should be stored, are
included in the bit stream corresponding to the target picture.
Further, when decoding a block (target block) in a P picture,
the mode decoding unit 283 according to the eighth embodiment
selects, as a decoding mode for the target block, one from among
the following modes: intra-picture decoding, inter-picture
predictive decoding using a motion vector, and inter-picture
predictive decoding using no motion vector (the motion is treated
as zero). When decoding a block (target block) in a B picture,
the mode decoding unit 283 selects, as a decoding mode for the
target block, one from among the following modes: intra-picture
decoding, inter-picture predictive decoding using a forward
motion vector, inter-picture predictive decoding using a backward
motion vector, and inter-picture predictive decoding using a
forward motion vector and a backward motion vector. That is, the
mode decoding unit 283 of the moving picture decoding apparatus
80 according to this eighth embodiment is different from the mode
decoding unit 223 of the moving picture decoding apparatus 20
according to the second embodiment only in that it does not use
the direct mode, and therefore, the moving picture decoding
apparatus 80 does not have the motion vector storage unit 226 of
the moving picture decoding apparatus 20. Other constituents of
the moving picture decoding apparatus 80 according to the eighth
embodiment are identical to those of the moving picture decoding
apparatus 20 according to the second embodiment.
Further, the moving picture decoding apparatus 80 according
to the eighth embodiment is different from the moving picture
decoding apparatus 60 according to the sixth embodiment in that
the memory control unit 284 controls the bit stream generation
unit 104 so that a flag indicating whether or not the target
picture is to be referred to in coding a picture after the target
block is inserted in the bit stream corresponding to the target
picture. Further, in the moving picture decoding apparatus 80,
candidate pictures to be referred to in decoding a P picture and
a B picture are also different from those employed in the moving
picture decoding apparatus according to the sixth embodiment.
Other constituents of the moving picture decoding apparatus 80
according to the eighth embodiment are identical to those of the
moving picture decoding apparatus 60 according to the sixth
embodiment.
Next, the operation of the moving picture decoding apparatus
80 will be described.
The bit stream Bs outputted from the moving picture coding
apparatus 70 according to the seventh embodiment is input to the
moving picture decoding apparatus 80.
In this eighth embodiment, when decoding a P picture, three
pictures (I or P pictures) which are timewise forward and close
to the P picture are used as candidates for a reference picture.
On the other hand, when decoding a B picture, two pictures (I or
P pictures) which are positioned timewise forward and close to
the B picture, a B picture which is positioned forward and
closest to the B picture, and an I or P picture which is
positioned backward the target picture, are used as candidate
pictures for a reference picture. However, in decoding a B
picture, a B picture which is positioned forward an I or P
picture that is positioned forward and closest to the target
picture, is not referred to. Further, in decoding an I picture,
other pictures are not referred to.
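The selection rule described above may be sketched as follows. This
is a minimal illustration, not the implementation of the memory
control unit 284: the representation of a picture as a pair of picture
type and display order number, and the function name
reference_candidates, are assumptions made for the example.

```python
from typing import List, Tuple

# (picture type, display order number), e.g. ("P", 10) for picture P10.
Picture = Tuple[str, int]


def reference_candidates(decoded: List[Picture], target: Picture) -> List[Picture]:
    """Return candidate reference pictures for the target picture,
    chosen from the already-decoded pictures."""
    target_type, target_num = target
    if target_type == "I":
        return []  # an I picture refers to no other picture

    # Forward I or P pictures, nearest to the target first.
    forward_ip = sorted(
        (p for p in decoded if p[0] in ("I", "P") and p[1] < target_num),
        key=lambda p: p[1],
        reverse=True,
    )
    if target_type == "P":
        return forward_ip[:3]

    # B picture: two forward I or P pictures ...
    candidates = forward_ip[:2]
    # ... plus the closest forward B picture, excluding any B picture
    # located forward of the closest forward I or P picture ...
    nearest_ip = forward_ip[0][1] if forward_ip else -1
    forward_b = [p for p in decoded
                 if p[0] == "B" and nearest_ip < p[1] < target_num]
    if forward_b:
        candidates.append(max(forward_b, key=lambda p: p[1]))
    # ... plus the closest backward I or P picture.
    backward_ip = [p for p in decoded
                   if p[0] in ("I", "P") and p[1] > target_num]
    if backward_ip:
        candidates.append(min(backward_ip, key=lambda p: p[1]))
    return candidates
```

For example, with the pictures P7, P10, P13, and B11 already decoded,
the candidates obtained for the picture B12 are P10, P7, B11, and P13,
while the pictures B8 and B9 would be excluded for the picture B11
because they are located forward of the picture P10.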
Further, information indicating which of the candidate
pictures is used as a reference picture in decoding a P picture
or a B picture is described as header information Ih of the bit
stream Bs, and the header information Ih is extracted by the bit
stream analysis unit 201.
The header information Ih is output to the memory control
unit 284. The header information may be described as header
information of the entire sequence, header information in units
of several pictures (e.g., GOP in MPEG), or header information in
picture units.
The pictures in the bit stream Bs inputted to the moving
picture decoding apparatus 80 are arranged in order of picture
decoding as shown in figure 36(a). Hereinafter, decoding
processes for the pictures P13, B11, and B12 will be specifically
described in this order.
When the bit stream corresponding to the picture P13 is
input to the bit stream analysis unit 201, the bit stream
analysis unit 201 extracts various kinds of data from the
inputted bit stream. The various kinds of data are information
(coding mode) Ms relating to mode selection, information of the
motion vector MV, the above-described header information, and the
like. The extracted coding mode Ms is output to the mode
decoding unit 283. Further, the extracted motion vector MV is
output to the motion compensation decoding unit 205. Furthermore,
the coded data Ed extracted by the bit stream analysis unit 201
is output to the prediction error decoding unit 202.
The mode decoding unit 283 controls the switches 209 and 210
with reference to the mode selection information (coding mode) Ms
extracted from the bit stream. When the coding mode Ms is intra-
picture coding mode and when the coding mode Ms is inter-picture
predictive coding mode, the switches 209 and 210 are controlled
in like manner as described for the sixth embodiment.
Further, the mode decoding unit 283 outputs the coding mode
Ms to the motion compensation decoding unit 205.
Hereinafter, a description will be given of the case where
the coding mode is inter-picture predictive coding mode.
Since the operations of the prediction error decoding unit
202, the motion compensation decoding unit 205, and the addition
unit 208 are identical to those described for the sixth
embodiment, repeated description is not necessary.
Figure 37 shows how the pictures, whose data are stored in
the reference picture memory 207, change with time.
When decoding of the picture P13 is started, the pictures B8,
P7, and P10 are stored in areas R1, R2, and R3 of the reference
picture memory 207. The picture P13 is decoded using the
pictures P7 and P10 as candidates for a reference picture, and
the picture P13 is stored in the memory area R1 where the picture
B8 had been stored. Such rewriting of image data of each picture
in the reference picture memory is carried out based on the
header information of each picture which is added to the bit
stream. This header information indicates that the picture P7
should be stored in the reference picture memory 207 until
decoding of the picture P13 is completed, the picture P10 should
be stored in the memory until decoding of the picture P16 is
completed, and the picture B8 should be stored in the memory
until decoding of the picture B9 is completed.
In other words, since it can be decided that the picture B8
is not necessary for decoding of the picture P13 and the
following pictures, the picture P13 is written over the reference
picture memory area R1 where the picture B8 is stored.
Further, since information indicating that the picture P13
should be stored in the reference picture memory until decoding
of the picture P19 is completed is described as header
information of the picture P13, the picture P13 is stored in the
reference picture memory at least until that time.
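The rewriting of the reference picture memory described above may be
sketched as follows. This is a simplified model under stated
assumptions: the class name ReferenceMemory, its methods, and the
representation of the storage period as the name of the picture whose
decoding ends that period are all introduced for illustration and do
not appear in the embodiments.

```python
class ReferenceMemory:
    """Toy model of a reference picture memory with a fixed number of
    areas. Each area holds (picture name, name of the picture until
    whose decoding the stored picture must be kept)."""

    def __init__(self, num_areas: int):
        self.areas = [None] * num_areas
        self.finished = set()  # names of pictures whose decoding is completed

    def mark_decoded(self, name: str) -> None:
        self.finished.add(name)

    def store(self, name: str, keep_until: str) -> int:
        """Write a picture into the first area that is empty or whose
        storage period has ended; return the index of that area."""
        for index, entry in enumerate(self.areas):
            if entry is None or entry[1] in self.finished:
                self.areas[index] = (name, keep_until)
                return index
        raise RuntimeError("no reusable memory area")


# State when decoding of the picture P13 starts (areas R1, R2, R3).
memory = ReferenceMemory(3)
memory.areas = [("B8", "B9"), ("P7", "P13"), ("P10", "P16")]
memory.mark_decoded("B9")                # storage period of B8 has ended
area_p13 = memory.store("P13", "P19")    # overwrites B8 in area R1 (index 0)
memory.mark_decoded("P13")               # storage period of P7 has ended
area_b11 = memory.store("B11", "B12")    # overwrites P7 in area R2 (index 1)
```

In this sketch the storage periods carried as header information alone
determine which memory area is reused, so no separate signalling of a
memory area is assumed.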
As described above, the blocks in the picture P13 are
successively decoded. When all of the coded data corresponding
to the blocks in the picture P13 have been decoded, decoding of
the picture B11 takes place.
Since the operations of the bit stream analysis unit 201,
the mode decoding unit 283, and the prediction error decoding
unit 202 are identical to those described for decoding of the
picture P13, repeated description is not necessary.
The motion compensation decoding unit 205 generates motion
compensation image data Pd from the inputted information such as
the motion vector. That is, the information inputted to the
motion compensation decoding unit 205 is the motion vector MV and
reference picture index corresponding to the picture B11. The
picture B11 has been coded using the picture P10 as a forward
reference picture, and the picture P13 as a backward reference
picture. Accordingly, in decoding of the picture B11, these
candidate pictures P10 and P13 have already been decoded, and the
corresponding decoded image data DId are stored in the reference
picture memory 207.
When the coding mode is bidirectional predictive coding mode,
the motion compensation decoding unit 205 obtains a forward
reference image from the reference picture memory 207 on the
basis of the information indicating the forward motion vector,
and obtains a backward reference image from the memory 207 on the
basis of the information indicating the backward motion vector.
Then, the motion compensation decoding unit 205 performs addition
and averaging of the forward reference image and the backward
reference image to generate a motion compensation image. Data
Pd of the motion compensation image so generated is output to the
addition unit 208.
The addition unit 208 adds the inputted prediction error
image data PDd and motion compensation image data Pd to output
addition image data Ad. The addition image data Ad so generated
is outputted as decoded image data DId, through the switch 210 to
the reference picture memory 207.
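The addition and averaging of the two reference images and the
subsequent addition of the prediction error may be sketched as
follows. The rounding rule, the clipping to an 8-bit pixel range, and
the function names are assumptions made for this illustration; the
embodiment only states that the two reference images are added and
averaged and that the prediction error image data PDd is added to the
motion compensation image data Pd.

```python
def average_references(forward, backward):
    """Average two equally sized blocks of pixel values.
    Rounding half up is an assumed convention."""
    return [[(f + b + 1) // 2 for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(forward, backward)]


def add_prediction_error(error, prediction):
    """Add the prediction error (PDd) to the motion compensation data
    (Pd) and clip to 0..255 (an assumed 8-bit pixel range)."""
    return [[min(255, max(0, e + p)) for e, p in zip(e_row, p_row)]
            for e_row, p_row in zip(error, prediction)]


# A 1x2 block as a tiny worked example.
pd = average_references([[100, 50]], [[102, 51]])   # [[101, 51]]
ad = add_prediction_error([[-3, 210]], pd)          # [[98, 255]]
```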
The memory control unit 284 controls the reference picture
memory 207 on the basis of information indicating which candidate
pictures are referred to in coding the P picture and the B
picture, which information is header information of the bit
stream.
Figure 37 shows how the pictures stored in the reference
picture memory 207 change with time.
When decoding of the picture B11 is started, the pictures
P13, P7, and P10 are stored in the reference picture memory 207.
The picture B11 is decoded using the pictures P10 and P13 as
reference pictures, and the picture B11 is stored in the memory
area R2 where the picture P7 had been stored. Such rewriting of
each picture in the reference picture memory 207 is carried out
based on the header information of each picture which is added to
the bit stream. This header information indicates that the
picture P7 should be stored in the reference picture memory 207
until decoding of the picture P13 is completed, the picture P10
should be stored in the memory until decoding of the picture P16
is completed, and the picture P13 should be stored in the memory
until decoding of the picture P19 is completed.
In other words, since it is decided that the picture P7 is
not necessary for decoding of the picture B11 and the following
pictures, the picture B11 is stored in the reference picture
memory area R2 where the picture P7 is stored.
Further, since information indicating that the picture B11
should be stored in the reference picture memory 207 until
decoding of the picture B12 is completed is described as header
information of the picture B11, the picture B11 is stored in the
reference picture memory 207 at least until that time.
As described above, the coded data corresponding to the
blocks in the picture B11 are successively decoded. When all of
the coded data corresponding to the blocks in the picture B11
have been decoded, decoding of the picture B12 takes place.
(Decoding Process for Picture B12)
Since the operations of the bit stream analysis unit 201,
the mode decoding unit 283, and the prediction error decoding
unit 202 are identical to those described for decoding of the
picture P13, repeated description is not necessary.
The motion compensation decoding unit 205 generates motion
compensation image data Pd from the inputted information such as
the motion vector. That is, the information inputted to the
motion compensation decoding unit 205 is the motion vector MV and
reference picture index corresponding to the picture B12. The
picture B12 has been coded using the pictures P10 and B11 as
candidates for a forward reference picture, and the picture P13
as a backward reference picture. These reference candidate
pictures P10, B11, and P13 have already been decoded, and the
corresponding decoded image data are stored in the reference
picture memory 207.
When the coding mode is bidirectional predictive coding mode,
the motion compensation decoding unit 205 determines which one of
the pictures P10 and B11 is used as a forward reference picture
in coding the picture B12, according to the reference picture
indices, and obtains a forward reference image from the reference
picture memory 207 according to the information indicating the
forward motion vector. Further, the motion compensation decoding
unit 205 obtains a backward reference image from the memory 207
according to the information indicating the backward motion
vector. Then, the motion compensation decoding unit 205 performs
addition and averaging of the forward reference image and the
backward reference image to generate a motion compensation image.
Data Pd of the motion compensation image so generated is output
to the addition unit 208.
The addition unit 208 adds the inputted prediction error
image data PDd and motion compensation image data Pd to output
addition image data Ad. The addition image data Ad so generated
is outputted as decoded image data DId, through the switch 210 to
the reference picture memory 207.
The memory control unit 284 controls the reference picture
memory 207 on the basis of information indicating which reference
pictures are used in coding the P picture and the B picture,
which information is extracted from the header information of the
bit stream.
Figure 37 shows how the pictures stored in the reference
picture memory 207 change with time. When decoding of the
picture B12 is started, the pictures P13, B11, and P10 are stored
in the reference picture memory 207. The picture B12 is decoded
using the pictures P13, B11, and P10 as reference candidate
pictures. Since information indicating that the picture B12 is
not to be used as a reference picture when decoding another
picture is described as header information, the decoded data of
the picture B12 is not stored in the reference picture memory 207
but outputted as output image data Od.
As described above, the coded data corresponding to the
blocks in the picture B12 are successively decoded. The decoded
image data of the respective pictures which are stored in the
reference picture memory 207, and the decoded image data which
are not stored in the reference picture memory 207 are rearranged
in order of their display times as shown in figure 36(b), and
outputted as output image data Od.
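The rearrangement from decoding order into display order may be
sketched as follows. The sketch assumes, for illustration only, that
the number in a picture name such as P13 or B11 is its position in
display order; the function names are likewise hypothetical.

```python
def display_number(name: str) -> int:
    """Extract the display order number from a name such as 'B11'
    (assumes one type letter followed by the number)."""
    return int(name[1:])


def to_display_order(decoding_order):
    """Rearrange picture names from decoding order into display order."""
    return sorted(decoding_order, key=display_number)


# Decoding order P13, B11, B12 becomes display order B11, B12, P13.
print(to_display_order(["P13", "B11", "B12"]))
```

An actual apparatus would output each picture as soon as its display
time arrives rather than sorting a complete list; the sort is used
here only to make the relation between the two orders explicit.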
Thereafter, the coded data corresponding to the respective
pictures are decoded in like manner as described above according
to the picture type.
To be specific, the coded data of the P pictures are decoded
like the picture P13, and the first B picture (picture B14, B17,
or the like) of the continuous B pictures is decoded like the
picture B11. Further, the second B picture (picture B15, B18, or
the like) of the continuous B pictures is decoded like the
picture B12.
As described above, in the moving picture decoding apparatus
80 according to the eighth embodiment, since a B picture is used
as a reference candidate picture when decoding a B picture, a bit
stream, which is obtained in a coding process that uses a B
picture as well as P pictures as forward reference candidate
pictures when coding a B picture, can be accurately decoded.
Further, since the reference picture memory is controlled using
information obtained from the bit stream, indicating which
reference pictures are used in coding a P picture and a B picture,
the reference picture memory can be effectively utilized. That
is, image data of pictures to be used as reference pictures in
the following decoding process are maintained in the reference
picture memory, while image data of pictures not to be used as
reference pictures in the following decoding process are
successively erased from the memory, whereby the reference
picture memory can be effectively utilized.
While this eighth embodiment employs a bit stream
corresponding to an image sequence in which two B pictures are
inserted between adjacent P pictures, the number of B pictures
positioned between adjacent P pictures may be other than two; for
example, it may be three or four.
Furthermore, while in this eighth embodiment two pictures
are used as candidate pictures for forward reference when
decoding a P picture, the number of forward reference candidate
pictures to be referred to in decoding a P picture is not
restricted thereto.
Furthermore, in this eighth embodiment, when decoding a B
picture, one P picture and one B picture are used as candidate
pictures for forward reference, and a B picture which is
positioned forward an I or P picture that is timewise closest to
the target B picture, is not used as a reference picture.
However, pictures to be used as reference candidate pictures in
decoding a B picture may be other than those described for the
eighth embodiment. Further, when decoding a B picture, a B
picture which is positioned forward an I or P picture that is
timewise closest to the target B picture, may be used as a
reference picture.
Furthermore, while in the eighth embodiment decoded image
data of pictures which are not to be used as reference pictures
when decoding other pictures are not stored in the reference
picture memory, the decoded image data of these pictures may be
stored in the memory.
For example, when output of decoded image data of each
picture is carried out with a little delay from decoding of each
picture, the decoded image data of each picture must be stored in
the reference picture memory. In this case, a memory area, other
than the memory area where the decoded image data of the
reference candidate pictures are stored, is provided in the
reference picture memory, and the decoded image data of the
pictures not to be used as reference pictures are stored in this
memory area. Although, in this case, the storage capacity of the
reference picture memory is increased, the method for managing
the reference picture memory is identical to that described for
the eighth embodiment and, therefore, the reference picture
memory can be easily managed.
While all pictures are used as reference candidate pictures
in the second, fourth, sixth, and eighth embodiments, all
pictures are not necessarily used as reference candidate pictures.
To be brief, in a moving picture decoding apparatus, usually,
already-decoded pictures are once stored in a decoding buffer
(decoded frame memory) regardless of whether they will be used as
reference candidate pictures or not, and thereafter, the already-
decoded pictures are successively read from the decoding buffer
to be displayed.
In the second, fourth, sixth, and eighth embodiments of the
present invention, all pictures are used as reference candidate
pictures and, therefore, all of already-decoded pictures are
stored in a reference picture memory for holding pictures to be
used as reference candidate pictures, and thereafter, the
already-decoded pictures are successively read from the reference
picture memory to be displayed.
However, as described above, all of the already-decoded
pictures are not necessarily used as reference candidate pictures.
Accordingly, the already-decoded pictures may be once stored in a
decoding buffer (decoded frame memory) for holding not only
pictures not to be used as reference candidate pictures but also
pictures to be used as reference candidate pictures, and
thereafter, the already-decoded pictures are successively read
from the decoding buffer to be displayed.
The moving picture coding apparatus or the moving picture
decoding apparatus according to any of the aforementioned
embodiments is implemented by hardware, but these apparatuses may
also be implemented by software. In this case, when a program for
executing the coding or decoding process according to any of the
aforementioned embodiments is recorded in a data storage medium
such as a flexible disk, the moving picture coding apparatus or
the moving picture decoding apparatus according to any of the
aforementioned embodiments can be easily implemented in an
independent computer system.
Figures 38(a)-38(c) are diagrams for explaining a computer
system for executing the moving picture coding process according
to any of the first, third, fifth, and seventh embodiments and
the moving picture decoding process according to any of the
second, fourth, sixth, and eighth embodiments.
Figure 38(a) shows a front view of a flexible disk FD which
is a medium that contains a program employed in the computer
system, a cross-sectional view thereof, and a flexible disk body
D. Figure 38(b) shows an example of a physical format of the
flexible disk body D.
The flexible disk FD is composed of the flexible disk body D
and a case FC that contains the flexible disk body D. On the
surface of the disk body D, a plurality of tracks Tr are formed
concentrically from the outer circumference of the disk toward
the inner circumference. Each track is divided into 16 sectors
Se in the angular direction. Therefore, in the flexible disk FD
containing the above-mentioned program, data of the program for
executing the moving picture coding process or the moving picture
decoding process are recorded in the assigned storage areas
(sectors) on the flexible disk body D.
Figure 38(c) shows the structure for recording or
reproducing the program in/from the flexible disk FD. When the
program is recorded in the flexible disk FD, data of the program
are written in the flexible disk FD from the computer system Csys
through the flexible disk drive FDD. When the above-mentioned
moving picture coding or decoding apparatus is constructed in the
computer system Csys by the program recorded in the flexible disk
FD, the program is read from the flexible disk FD by the flexible
disk drive FDD and then loaded to the computer system Csys.
Although in the above description a flexible disk is
employed as a storage medium, an optical disk may be employed.
Also in this case, the moving picture coding or decoding process
can be performed by software in like manner as the case of using
the flexible disk. The storage medium is not restricted to these
disks, and any medium may be employed as long as it can contain
the program, for example, a CD-ROM, a memory card, or a ROM
cassette. Also when such data storage medium is employed, the
moving picture coding or decoding process can be performed by the
computer system in the same manner as the case of using the
flexible disk.
Applications of the moving picture coding method and the
moving picture decoding method according to any of the
aforementioned embodiments and systems using the same will be
described hereinafter.
Figure 39 is a block diagram illustrating an entire
construction of a contents provision system 1100 that performs
contents distribution services.
A communication service provision area is divided into
regions (cells) of desired size, and base stations 1107 to 1110
which are each fixed radio stations are established in the
respective cells.
In this contents provision system 1100, various devices
such as a computer 1111, a PDA (personal digital assistant) 1112,
a camera 1113, a portable telephone 1114, and a portable
telephone with a camera 1200 are connected to the Internet 1101
through an Internet service provider 1102, a telephone network
1104, and the base stations 1107 to 1110.
However, the contents provision system 1100 is not
restricted to a system including all of the plural devices shown
in figure 39, but may be one including some of the plural devices
shown in figure 39. Further, the respective devices may be
connected directly to the telephone network 1104, not through the
base stations 1107 to 1110 as the fixed radio stations.
The camera 1113 is a device that can take moving pictures
of an object, like a digital video camera. The portable
telephone may be a portable telephone set according to any of PDC
(Personal Digital Communications) system, CDMA (Code Division
Multiple Access) system, W-CDMA (Wideband-Code Division Multiple
Access) system, and GSM (Global System for Mobile Communications)
system, or PHS (Personal Handyphone System).
A streaming server 1103 is connected to the camera 1113
through the base station 1109 and the telephone network 1104. In
this system, live distribution based on coded data which are
transmitted by a user using the camera 1113 can be performed.
The coding process for the data of taken images may be carried
out by either the camera 1113 or the server that transmits the
data. Moving picture data which are obtained by taking moving
pictures of an object by means of the camera 1116 may be
transmitted to the streaming server 1103 through the computer
1111. The camera 1116 is a device that can take still images or
moving pictures of an object, such as a digital camera. In this
case, coding of the moving picture data can be performed by
either the camera 1116 or the computer 1111. Further, the coding
process is carried out by an LSI 1117 included in the computer
1111 or the camera 1116.
Image coding or decoding software may be stored in a
storage medium (a CD-ROM, a flexible disk, a hard disk, or the
like) which is a recording medium that contains data readable by
the computer 1111 or the like. The moving picture data may be
transmitted through the portable telephone with a camera 1200.
The moving picture data are data which have been coded by an LSI
included in the portable telephone 1200.
In this contents provision system 1100, contents
corresponding to images taken by the user by means of the camera
1113 or the camera 1116 (for example, live video of a music
concert) are coded in the camera in the same manner as any of the
aforementioned embodiments, and transmitted from the camera to
the streaming server 1103. The contents data are subjected to
streaming distribution from the streaming server 1103 to a
requesting client.
The client may be any of the computer 1111, the PDA 1112,
the camera 1113, the portable telephone 1114 and the like, which
can decode the coded data.
In this contents provision system 1100, the coded data can
be received and reproduced on the client side. When the data are
received, decoded, and reproduced in real time on the client side,
private broadcasting can be realized.
The coding or decoding in the respective devices that
constitute this system can be performed using the moving picture
coding apparatus or the moving picture decoding apparatus
according to any of the aforementioned embodiments.
A portable telephone will be now described as an example of
the moving picture coding or decoding apparatus.
Figure 40 is a diagram illustrating a portable telephone
1200 that employs the moving picture coding method and the moving
picture decoding method according to any of the aforementioned
embodiments.
This portable telephone 1200 includes an antenna 1201 for
transmitting/receiving radio waves to/from the base station 1110,
a camera unit 1203 that can take video or still images of an
object, such as a CCD camera, and a display unit 1202 such as a
liquid crystal display for displaying data of the video taken by
the camera unit 1203 or video received through the antenna 1201.
The portable telephone 1200 further includes a main body
1204 including plural control keys, a voice output unit 1208 for
outputting voices such as a speaker, a voice input unit 1205 for
inputting voices such as a microphone, a recording medium 1207
for retaining coded data or decoded data such as data of taken
moving pictures or still images, or data, moving picture data or
still image data of received e-mails, and a slot unit 1206 which
enables the recording medium 1207 to be attached to the portable
telephone 1200.
The recording medium 1207 has a flash memory element as a
type of EEPROM (Electrically Erasable and Programmable Read Only
Memory) that is an electrically programmable and erasable
non-volatile memory contained in a plastic case, like an SD card.
The portable telephone 1200 will be described more
specifically with reference to Figure 41.
The portable telephone 1200 has a main control unit 1241
that performs general control for the respective units of the
main body including the display unit 1202 and the control key
1204.
The portable telephone 1200 further includes a power supply
circuit 1240, an operation input control unit 1234, an image
coding unit 1242, a camera interface unit 1233, an LCD (Liquid
Crystal Display) control unit 1232, an image decoding unit 1239,
a multiplexing/demultiplexing unit 1238, a recording/reproduction
unit 1237, a modulation/demodulation unit 1236, and an audio
processing unit 1235. The respective units of the portable
telephone 1200 are connected to each other via a synchronization
bus 1250.
The power supply circuit 1240 supplies power from a battery
pack to the respective units when a call end/power supply key is
turned ON under the control of a user, thereby activating the
digital portable telephone with a camera 1200 to be turned into
an operable state.
In the portable telephone 1200, the respective units
operate under control of the main control unit 1241 that is
constituted by a CPU, a ROM, a RAM and the like. To be more
specific, in the portable telephone 1200, an audio signal that is
obtained by voice inputting into the voice input unit 1205 in a
voice communication mode is converted into digital audio data by
the audio processing unit 1235. The digital audio data is
subjected to a spectrum spread process by the
modulation/demodulation circuit 1236, further subjected to a DA
conversion process and a frequency transformation process by the
transmission/receiving circuit 1231, and transmitted through the
antenna 1201.
In this portable telephone set 1200, a signal received
through the antenna 1201 in the voice communication mode is
amplified, and then subjected to a frequency transformation
process and an AD conversion process. The received signal is
further subjected to a spectrum inverse spread process in the
modulation/demodulation circuit 1236, converted into an analog
audio signal by the audio processing unit 1235, and this analog
audio signal is outputted through the voice output unit 1208.
When the portable telephone 1200 transmits an electronic
mail in a data communication mode, text data of the e-mail that
is inputted by manipulation of the control key 1204 on the main
body is transmitted to the main control unit 1241 via the
operation input control unit 1234. The main control unit 1241
controls the respective units so that the text data is subjected
to the spectrum spread process in the modulation/demodulation
circuit 1236, then subjected to the DA conversion process and the
frequency transformation process in the transmission/receiving
circuit 1231, and then transmitted to the base station 1110
through the antenna 1201.
When this portable telephone 1200 transmits image data in
the data communication mode, data of an image taken by the camera
unit 1203 is supplied to the image coding unit 1242 via the
camera interface unit 1233. When the portable telephone 1200
does not transmit the image data, the data of the image taken by
the camera unit 1203 can be displayed directly on the display
unit 1202 via the camera interface unit 1233 and the LCD control
unit 1232.
The image coding unit 1242 includes the moving picture
coding apparatus according to any of the aforementioned
embodiments. This image coding unit 1242 compressively encodes
the image data supplied from the camera unit 1203 by the moving
picture coding method according to any of the above embodiments
to convert the same into coded image data, and outputs the
obtained coded image data to the multiplexing/demultiplexing unit
1238. At the same time, the portable telephone 1200 transmits
voices which are inputted to the voice input unit 1205 while the
image is being taken by the camera unit 1203, as digital audio
data, to the multiplexing/demultiplexing unit 1238 through the
audio processing unit 1235.
The multiplexing/demultiplexing unit 1238 multiplexes the
coded image data supplied from the image coding unit 1242 and the
audio data supplied from the audio processing unit 1235 by a
predetermined method. The resultant multiplexed data is subjected to
a spectrum spread process in the modulation/demodulation circuit
1236, then further subjected to the DA conversion process and the
frequency transformation process in the transmission/receiving
circuit 1231, and the obtained data is transmitted through the
antenna 1201.
When the portable telephone 1200 receives data of a moving
picture file that is linked to a home page or the like in the
data communication mode, a signal received from the base station
1110 through the antenna 1201 is subjected to a spectrum inverse
spread process by the modulation/demodulation circuit 1236, and
the resultant multiplexed data is transmitted to the
multiplexing/demultiplexing unit 1238.
When the multiplexed data that is received via the antenna
1201 is decoded, the multiplexing/demultiplexing unit 1238
demultiplexes the multiplexed data to divide the data into a
coded bit stream corresponding to the image data and a coded bit
stream corresponding to the audio data, and the coded image data
is supplied to the image decoding unit 1239 and the audio data is
supplied to the audio processing unit 1235, via the
synchronization bus 1250.
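The multiplexing and demultiplexing described above can be sketched as follows. This specification says only that the data are multiplexed "by a predetermined method", so the packet layout used here (a 1-byte stream identifier followed by a 2-byte payload length) is an assumption made purely for illustration, as are the names VIDEO, AUDIO, multiplex and demultiplex.

```python
import struct

# Hypothetical stream identifiers; the specification does not define them.
VIDEO, AUDIO = 0, 1


def multiplex(packets):
    """Pack (stream_id, payload) pairs into one byte string.

    Each packet is preceded by an assumed 3-byte header:
    a 1-byte stream id and a 2-byte big-endian payload length.
    """
    out = bytearray()
    for stream_id, payload in packets:
        out += struct.pack(">BH", stream_id, len(payload))
        out += payload
    return bytes(out)


def demultiplex(data):
    """Split multiplexed data back into the coded image and audio streams."""
    streams = {VIDEO: bytearray(), AUDIO: bytearray()}
    pos = 0
    while pos < len(data):
        stream_id, length = struct.unpack_from(">BH", data, pos)
        pos += 3  # skip the header
        streams[stream_id] += data[pos:pos + length]
        pos += length
    return bytes(streams[VIDEO]), bytes(streams[AUDIO])
```

In terms of the units described above, the first returned stream would go to the image decoding unit 1239 and the second to the audio processing unit 1235.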
The image decoding unit 1239 includes the moving picture
decoding apparatus according to any of the aforementioned
embodiments. The image decoding unit 1239 decodes the coded bit
stream of the image data by the decoding method corresponding to
the coding method according to any of the above-mentioned
embodiments, to reproduce moving picture data, and supplies the
reproduced data to the display unit 1202 through the LCD control
unit 1232. Thereby, for example, the moving picture data
included in the moving picture file that is linked to the home
page is displayed. At the same time, the audio processing unit
1235 converts the audio data into an analog audio signal, and
then supplies the analog audio signal to the voice output unit
1208. Thereby, for example, the audio data included in the
moving picture file that is linked to the home page is reproduced.
Here, a system to which the moving picture coding method
and the moving picture decoding method according to any of the
aforementioned embodiments is applicable is not restricted to the
above-mentioned contents provision system.
Recently, digital broadcasting using satellites or
terrestrial waves has been widely discussed, and the image coding
apparatus and the image decoding apparatus according to the above
embodiments are also applicable to a digital broadcasting system
as shown in Figure 42.
More specifically, a coded bit stream corresponding to video
information is transmitted from a broadcast station 1409 to a
satellite 1410 such as a communication satellite or a broadcast
satellite, via radio communication. When the broadcast satellite
1410 receives the coded bit stream corresponding to the video
information, the satellite 1410 outputs broadcasting waves, and
these waves are received by an antenna 1406 at a home equipped with
a satellite broadcast receiving facility. For example, an
apparatus such as a television (receiver) 1401 or a set top box
(STB) 1407 decodes the coded bit stream, and reproduces the video
information.
Further, the image decoding apparatus according to any of
the aforementioned embodiments can be mounted also on a
reproduction apparatus 1403 that can read and decode the coded
bit stream recorded on a storage medium 1402 such as a CD or a
DVD (recording medium).
In this case, a reproduced video signal is displayed on a
monitor 1404. The moving picture decoding apparatus may be
mounted on the set top box 1407 that is connected to a cable for
cable television 1405 or an antenna for satellite/terrestrial
broadcast 1406, to reproduce an output of the moving picture
decoding apparatus to be displayed on a monitor 1408 of the
television. In this case, the moving picture decoding apparatus
may be incorporated not in the set top box but in the television.
A vehicle 1412 having an antenna 1411 can receive a signal from
the satellite 1410 or the base station 1107, and reproduce a
moving picture to display the same on a display device of a car
navigation system 1413 or the like which is mounted on the
vehicle 1412.
Further, an image signal can also be
coded by the moving picture coding apparatus according to any of
the aforementioned embodiments and recorded on a recording medium.
A specific example of a recording device is a recorder 1420
such as a DVD recorder that records image signals on a DVD disk
1421, or a disk recorder that records image signals on a hard
disk. The image signals may also be recorded on an SD card 1422.
Further, when the recorder 1420 includes the moving picture
decoding apparatus according to any of the aforementioned
embodiments, the image signals which are recorded on the DVD disk
1421 or the SD card 1422 can be reproduced by the recorder 1420
and displayed on the monitor 1408.
Here, the structure of the car navigation system 1413 may
include, for example, the components of the portable telephone
shown in Figure 41 other than the camera unit 1203, the camera
interface unit 1233 and the image coding unit 1242, and the same
applies to the computer 1111 and the television (receiver) 1401.
Further, a terminal such as the portable telephone
1114 can be implemented as any one of three types of terminals: a
transmission-receiving type terminal having both an encoder and a
decoder, a transmission terminal having only an encoder, and a
receiving terminal having only a decoder.
As described above, the moving picture coding method or the
moving picture decoding method according to any of the
aforementioned embodiments is applicable to any of the above-
mentioned devices or systems, whereby the effects as described in
the above embodiments can be obtained.
Moreover, it is needless to say that the embodiments of the
present invention and the applications thereof are not restricted
to those described in this specification.
APPLICABILITY IN INDUSTRY
As described above, in the moving picture coding method and
the moving picture decoding method according to the present
invention, when a target picture to be coded or decoded is a B
picture, a forward picture that is positioned closest to the
target picture can be used as a reference picture for the target
picture, whereby prediction accuracy in motion compensation for
the B picture is increased, resulting in enhanced coding
efficiency. Particularly, these methods are useful in data
processing for transferring or recording moving picture data.
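The choice of the forward picture positioned closest to the target picture can be illustrated by the following Python sketch. It assumes that each picture is identified by its display time and that the candidates are supplied as a plain list; the function name closest_forward_picture is hypothetical and does not appear in the embodiments.

```python
def closest_forward_picture(target_time, candidate_times):
    """Return the display time of the nearest candidate displayed earlier
    than the target picture, or None when no forward candidate exists.

    Assumption for this sketch: pictures are identified by display time.
    """
    forward = [t for t in candidate_times if t < target_time]
    return max(forward) if forward else None
```

For a B picture displayed at time 5 with candidates displayed at times 0, 3, 4 and 6, the function selects the picture displayed at time 4.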
We claim:
1. A moving picture decoding method for decoding a bit stream
corresponding to plural pictures constituting a moving picture, comprising:
a decoding step of, when decoding a target picture to be decoded,
decoding each block in the target picture in either I mode decoding in
which a block in the target picture is decoded without referring to another
picture, or P mode decoding in which a block in the target picture is
predictively decoded with reference to one already-decoded picture,
wherein
the target picture which has been decoded is stored in a reference
picture memory as a candidate for a reference picture, on the basis of
candidate picture information which is included in the bit stream and
indicates whether or not the target picture is the candidate for the
reference picture when decoding another picture that follows the target
picture, and
in the P mode decoding in which a block in the target picture is
predictively decoded, one picture is determined on the basis of reference
picture information included in the bit stream, among already-decoded
pictures which are stored in the reference picture memory as candidates
for the reference picture, and the block is predictively decoded with
reference to one determined picture.
2. A moving picture decoding method for decoding a bit stream
corresponding to plural pictures constituting a moving picture, comprising:
a decoding step of, when decoding a target picture to be decoded,
decoding each block in the target picture in one decoding mode among I
mode decoding in which a block in the target picture is decoded without
referring to another picture, P mode decoding in which a block in the target
picture is predictively decoded with reference to one already-decoded picture,
and B mode decoding in which a block in the target picture is predictively
decoded with reference to two already-decoded pictures, wherein
the target picture which has been decoded is stored in a reference picture
memory as a candidate for a reference picture, on the basis of candidate
picture information which is included in the bit stream and indicates whether
or not the target picture is the candidate for the reference picture when
decoding another picture that follows the target picture, and
in the P mode decoding in which a block in the target picture is
predictively decoded, one picture is determined on the basis of reference
picture information included in the bit stream, among already-decoded
pictures which are stored in the reference picture memory as candidates for
the reference picture, and the block is predictively decoded with reference to
one determined picture, and
in the B mode decoding in which a block in the target picture is
predictively decoded, two pictures are determined on the basis of the
reference picture information included in the bit stream, among
already-decoded pictures which are stored in the reference picture memory as the
candidates for the reference pictures, and the block is predictively decoded
with reference to two determined pictures.
ABSTRACT
According to the present invention, a moving picture coding apparatus (70) for
performing inter-picture predictive coding for pictures constituting a moving
picture is provided with a coding unit (103) for performing predictive error
coding for image data; a decoding unit (105) for performing predictive error
decoding for an output from the coding unit (103); a reference picture memory
(117) for holding output data from the decoding unit (105); and a motion vector
detection unit (108) for detecting motion vectors on the basis of the decoded
image data stored in the memory. When coding a B picture as a target picture,
information indicating whether or not the target picture should be used as a
reference picture when coding another picture is added as header information.
Therefore, in a decoding apparatus for decoding a bit stream Bs outputted from
the moving picture coding apparatus (70), management of a memory for holding
the reference picture can be facilitated on the basis of the header information.
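The memory management on the decoding side can be illustrated by the following Python sketch. It is only a sketch under stated assumptions: the header information is modelled as a boolean flag, the reference picture information is modelled as a list of indices into the memory, and no removal of old pictures is shown. The names ReferencePictureMemory, store_if_candidate, select and references_for_block are hypothetical.

```python
class ReferencePictureMemory:
    """Holds decoded pictures that may serve as reference pictures."""

    def __init__(self):
        self.pictures = []  # candidates, in the order they were stored

    def store_if_candidate(self, picture, is_candidate):
        # is_candidate stands in for the header information that tells the
        # decoder whether this picture is referred to by a later picture.
        if is_candidate:
            self.pictures.append(picture)

    def select(self, reference_indices):
        # reference_indices stands in for the reference picture information
        # in the bit stream (assumed here to be indices into the memory).
        return [self.pictures[i] for i in reference_indices]


def references_for_block(memory, mode, reference_indices):
    """Return the reference pictures for one block: none for I mode,
    one for P mode, two for B mode."""
    expected = {"I": 0, "P": 1, "B": 2}[mode]
    if len(reference_indices) != expected:
        raise ValueError(f"{mode} mode needs {expected} reference picture(s)")
    return memory.select(reference_indices)
```

A picture whose flag is false is never written to the memory, so the decoder does not have to keep pictures that no later picture refers to.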