Encoder, Decoder And Data Stream For Gradual Decoder Refresh Coding


Encoder, Decoder And Data Stream For Gradual Decoder Refresh Coding And Scalable Coding

Abstract: The present invention is concerned with methods, encoders, decoders and data streams for coding pictures, and in particular a consecutive sequence of pictures. Some embodiments may exploit the so-called Gradual Decoder Refresh - GDR - coding scheme for coding the pictures. Some embodiments may suggest Scalable Coding and Gradual Decoder Refresh improvements.


Patent Information

Application #

Filing Date

24 March 2022

Publication Number

27/2022

Publication Type

INA

Invention Field

COMMUNICATION

Status

mail@lexorbis.com

Parent Application

Patent Number

Legal Status

Grant Date

2025-10-30

Renewal Date

Applicants

FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.

Hansastraße 27c 80686 München

Inventors

1. SÁNCHEZ DE LA FUENTE, Yago

c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI Einsteinufer 37 10587 Berlin

2. SÜHRING, Karsten

c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI, Einsteinufer 37 10587 Berlin

3. HELLGE, Cornelius

c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI, Einsteinufer 37 10587 Berlin

4. SCHIERL, Thomas

c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI, Einsteinufer 37 10587 Berlin

5. SKUPIN, Robert

c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI Einsteinufer 37 10587 Berlin

6. WIEGAND, Thomas

c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI Einsteinufer 37 10587 Berlin

Specification

ENCODER, DECODER AND DATA STREAM FOR GRADUAL DECODER REFRESH

CODING AND SCALABLE CODING

The present invention is concerned with coding pictures, and in particular a consecutive sequence of pictures. Some embodiments may exploit scalable coding and/or the so-called Gradual Decoder Refresh - GDR - coding scheme for coding the pictures. Some embodiments may suggest Scalable Coding and Gradual Decoder Refresh improvements.

In present-day video coding, some scenarios require very low-latency transmission. Whenever low end-to-end delays are required, bitrate changes from picture to picture are undesirable. Typically, video is encoded in such a way that the sizes of the encoded pictures vary not only due to the complexity characteristics of the content but also due to the prediction structure used. More concretely, videos are typically encoded using a prediction structure in which some pictures are encoded as Intra slices (not dependent on other pictures) and others are encoded as B or P slices (using other pictures as references). Pictures encoded without prediction from other pictures are larger than pictures that exploit temporal correlation and are encoded as B or P slices.

Some techniques deviate from this typical encoding structure: each picture is encoded with a mixture of blocks using only intra prediction and blocks using inter prediction. In such a case, no picture is encoded using intra prediction only, and therefore (if the ratio of intra-predicted blocks to inter-predicted blocks is kept similar across all pictures) the sizes of all pictures are kept similar.

This is ideal for achieving a lower end-to-end delay, as the biggest picture is smaller than in the other coding structure approach and therefore the time to transmit it becomes smaller as well.

Typically, such structures are identified as Gradual Decoding Refresh (GDR) as they differ from typical coding structures in the fact that, in order to achieve a clean picture, several pictures need to be decoded and the video areas are gradually decoded and refreshed until the content can be properly shown; while in typical coding structures, only a particular form of an Intra frame (Random Access Point-RAP) is required to be present and the content can be instantaneously shown without having to decode more access units.

In accessing a frame or picture in a picture sequence, there may be a trade-off between the so-called tune-in time and the coding efficiency. Mechanisms that allow reducing the tune-in time (either on average or in the worst case) while not harming the coding efficiency are desirable. Furthermore, a picture may be subdivided into picture regions (e.g. tiles) which may be refreshed over time in a so-called Refresh Period RP. However, the identification of which regions are clean (refreshed) and which regions are not refreshed is not clear, and this comes with some penalties; for example, intra prediction cannot easily be restricted between not-yet clean (dirty) regions and clean regions.

Thus, it is an object of the present invention to improve existing GDR encoders, decoders and data streams.

A first aspect concerns a video data stream comprising a sequence of pictures comprising at least one Gradual Decoder Refresh - GDR - coded picture and one or more subsequent pictures in a refresh period (RP). The video data stream further comprises a parameter set (SPS) defining a plurality of picture configurations, which subdivide a picture area (e.g. entire frame) into a first sub-area and a second sub-area among which one corresponds to a refreshed sub-area (e.g. a set of picture regions) comprising one or more refreshed picture regions (e.g. tiles) and the other one corresponds to an un-refreshed sub-area comprising one or more yet un-refreshed picture regions. The video data stream further comprises, for each picture within the refresh period, a picture configuration identifier (reg_conf_idx) for identifying a corresponding one picture configuration out of the plurality of picture configurations.

Furthermore, it is suggested to provide a decoder for decoding from a data stream at least one picture out of a sequence of pictures comprising at least one Gradual Decoder Refresh - GDR - coded picture and one or more subsequent pictures in a refresh period (RP). The decoder is configured to read from the data stream a parameter set (SPS) defining a plurality of picture configurations, which subdivide a picture area (e.g. entire frame) into a first sub-area and a second sub-area among which one corresponds to a refreshed sub-area (e.g. a set of picture regions) comprising one or more refreshed picture regions (e.g. tiles) and the other one corresponds to an un-refreshed sub-area comprising one or more yet un-refreshed picture regions. The decoder is further configured to read from the data stream, for each picture within the refresh period, a picture configuration identifier (reg_conf_idx) for identifying a corresponding one picture configuration out of the plurality of picture configurations for decoding the at least one picture.

Furthermore, it is suggested to provide an encoder for encoding into a data stream at least one picture out of a sequence of pictures comprising at least one Gradual Decoder Refresh - GDR - coded picture and one or more subsequent pictures in a refresh period (RP). The encoder is configured to write into the data stream a parameter set (SPS) defining a plurality of picture configurations, which subdivide a picture area (e.g. entire frame) into a first sub-area and a second sub-area among which one corresponds to a refreshed sub-area (e.g. a set of picture regions) comprising one or more refreshed picture regions (e.g. tiles) and the other one corresponds to an un-refreshed sub-area comprising one or more yet un-refreshed picture regions. The encoder is further configured to set in the data stream, for each picture within the refresh period, a picture configuration identifier (reg_conf_idx) for identifying a corresponding one picture configuration out of the plurality of picture configurations.

A second aspect concerns a video data stream comprising a sequence of pictures comprising at least one Gradual Decoder Refresh - GDR - coded picture and one or more subsequent pictures in a refresh period, wherein each picture of the sequence of pictures is sequentially coded into the video data stream in units of blocks (e.g. CTUs) into which the respective picture is subdivided. The video data stream comprises an implicit signaling, wherein a refreshed sub-area of a respective picture is implicitly signaled in the video data stream based on a block coding order. Additionally or alternatively, the video data stream comprises, for each block, a syntax element (e.g. a flag) indicating whether

a) the block is a last block located in a first sub-area of a respective picture and lastly coded (e.g. flag: last_ctu_of_gdr_region), and/or

b) the block is a first block located in a first sub-area of a respective picture and firstly coded (e.g. flag: first_ctu_of_gdr_region), and/or

c) the block adjoins a border confining a first sub-area, and/or

d) the block is located inside a first sub-area (e.g. flag: gdr_region_flag).

Furthermore, it is suggested to provide a decoder for decoding from a data stream at least one picture out of a sequence of pictures comprising at least one Gradual Decoder Refresh - GDR - coded picture and one or more subsequent pictures in a refresh period (RP), wherein each picture of the sequence of pictures is sequentially decoded from the video data stream in units of blocks (e.g. CTUs) into which the respective picture is subdivided. The decoder is configured to implicitly derive from the data stream a refreshed sub-area of the at least one picture based on a block coding order. Additionally or alternatively, the decoder is configured to read from the data stream, for each block, a syntax element (e.g. a flag) indicating whether

a) the block is a last block located in a first sub-area of a respective picture and lastly coded (e.g. flag: last_ctu_of_gdr_region), and/or

b) the block is a first block located in a first sub-area of a respective picture and firstly coded (e.g. flag: first_ctu_of_gdr_region), and/or

c) the block adjoins a border confining a first sub-area, and/or

d) the block is located inside a first sub-area (e.g. flag: gdr_region_flag).
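The per-block flags listed above can be illustrated with a minimal Python sketch. It is not the claimed syntax: the function names are hypothetical, blocks are represented only by their position in coding order, and the `last_ctu_of_gdr_region` variant additionally assumes that the first sub-area starts at the first block in coding order.

```python
from typing import List, Sequence


def region_from_last_flag(last_ctu_flags: Sequence[bool]) -> List[int]:
    """Derive the first sub-area from a per-block 'last block' flag.

    Assumption (not stated in the text): the sub-area begins at block 0
    in coding order and ends at the first block whose flag is set.
    """
    region = []
    for idx, is_last in enumerate(last_ctu_flags):
        region.append(idx)
        if is_last:
            return region
    # No block was flagged, so no first sub-area is signalled (assumption).
    return []


def region_from_membership(gdr_region_flags: Sequence[bool]) -> List[int]:
    """Derive the first sub-area from a per-block membership flag (option d)."""
    return [idx for idx, inside in enumerate(gdr_region_flags) if inside]


print(region_from_last_flag([False, False, True, False]))
print(region_from_membership([True, False, True, False]))
```

The membership flag costs one flag per block but can describe arbitrary region shapes, whereas the last-block flag relies on the block coding order to imply which blocks belong to the region.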

Furthermore, it is suggested to provide an encoder for encoding into a data stream at least one picture out of a sequence of pictures comprising at least one Gradual Decoder Refresh - GDR - coded picture and one or more subsequent pictures in a refresh period (RP), wherein each picture of the sequence of pictures is sequentially encoded into the video data stream in units of blocks (e.g. CTUs) into which the respective picture is subdivided. The encoder is configured to write into the data stream, for each block, a syntax element (e.g. a flag) indicating whether

a) the block is a last block located in a first sub-area of a respective picture and lastly coded (e.g. flag: last_ctu_of_gdr_region), and/or

b) the block is a first block located in a first sub-area of a respective picture and firstly coded (e.g. flag: first_ctu_of_gdr_region), and/or

c) the block adjoins a border confining a first sub-area, and/or

d) the block is located inside a first sub-area (e.g. flag: gdr_region_flag).

A third aspect concerns a multi layered scalable video data stream comprising a first sequence of pictures in a first layer (e.g. base layer) and a second sequence of pictures in a second layer (e.g. an enhancement layer), wherein the second sequence of pictures in the second layer comprises at least one Gradual Decoder Refresh - GDR - picture as a start picture and one or more subsequent pictures in a refresh period. The multi layered scalable video data stream comprises a signalization carrying information about a possibility that a yet un-refreshed sub-area of the GDR picture of the second layer is to be inter-layer predicted from samples of the first layer. Additionally, the multi layered scalable video data stream comprises information:

that in yet un-refreshed sub-areas of the one or more subsequent pictures contained in the refresh period, motion vector prediction is disabled or motion vector prediction is realized non-temporally, or

that in a yet un-refreshed sub-area of the GDR picture motion vector prediction is disabled or motion vector prediction is realized non-temporally.

Furthermore, it is suggested to provide a decoder for decoding at least one picture from a multi layered scalable video data stream comprising a first sequence of pictures in a first layer (e.g. base layer) and a second sequence of pictures in a second layer (e.g. an enhancement layer), wherein the second sequence of pictures in the second layer comprises at least one Gradual Decoder Refresh - GDR - picture as a start picture and one or more subsequent pictures in a refresh period. The decoder is configured to read from the multi layered scalable video data stream a signalization carrying information about a possibility that a yet un-refreshed sub-area of the GDR picture of the second layer is to be inter-layer predicted from samples of the first layer. The decoder is further configured to, responsive to the signalization:

disable motion vector prediction or to realize motion vector prediction non-temporally in yet un-refreshed sub-areas of the one or more subsequent pictures contained in the refresh period, or

to disable motion vector prediction or to realize motion vector prediction non-temporally in a yet un-refreshed sub-area of the GDR picture.

Furthermore, it is suggested to provide an encoder for encoding at least one picture into a multi layered scalable video data stream comprising a first sequence of pictures in a first layer (e.g. base layer) and a second sequence of pictures in a second layer (e.g. an enhancement layer), wherein the second sequence of pictures in the second layer comprises at least one Gradual Decoder Refresh - GDR - picture as a start picture and one or more subsequent pictures in a refresh period. The encoder is configured to write into the multi layered scalable video data stream a signalization carrying information about a possibility that a yet un-refreshed sub-area of the GDR picture of the second layer is to be inter-layer predicted from samples of the first layer, and information:

that in a yet un-refreshed sub-area of the GDR picture motion vector prediction is disabled or motion vector prediction is realized non-temporally.

A fourth aspect concerns a multi layered scalable video data stream comprising a first sequence of pictures in a first layer (e.g. base layer) and a second sequence of pictures in a second layer (e.g. an enhancement layer), each of the first and second layers comprising a plurality of temporal sublayers. The scalable video data stream further comprises a signalization (e.g. vps_sub_layer_independent_flag[i][j]) indicating which temporal sublayers of the second layer (e.g. enhancement layer) are coded by inter-layer prediction.
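As a rough illustration of such a per-sublayer signalization, the following Python sketch reads a two-dimensional flag array indexed as `[layer][temporal sublayer]`. The polarity is an assumption made only for this example (a set flag means the sublayer is coded independently, i.e. without inter-layer prediction), and the helper name is hypothetical.

```python
from typing import List, Sequence


def sublayers_using_inter_layer_prediction(
    independent_flags: Sequence[Sequence[int]], layer: int
) -> List[int]:
    """Return the temporal sublayers of `layer` that use inter-layer prediction.

    Assumed polarity: independent_flags[i][j] == 1 means sublayer j of
    layer i is coded without inter-layer prediction.
    """
    return [j for j, flag in enumerate(independent_flags[layer]) if not flag]


# Layer 0 is the base layer; layer 1 is an enhancement layer whose two
# lowest temporal sublayers depend on the base layer in this example.
flags = [
    [1, 1, 1],
    [0, 0, 1],
]
print(sublayers_using_inter_layer_prediction(flags, layer=1))
```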

Furthermore, it is suggested to provide a decoder for decoding at least one picture from a multi layered scalable video data stream comprising a first sequence of pictures in a first layer (e.g. base layer) and a second sequence of pictures in a second layer (e.g. an enhancement layer), each of the first and second layers comprising a plurality of temporal sublayers. The decoder is configured to decode one or more of the temporal sublayers by using inter-layer prediction based on a signalization derived from the scalable video data stream, said signalization (e.g. vps_sub_layer_independent_flag[i][j]) indicating which temporal sublayers of the second layer (e.g. enhancement layer) are to be coded by inter-layer prediction.

Furthermore, it is suggested to provide an encoder for encoding at least one picture into a multi layered scalable video data stream comprising a first sequence of pictures in a first layer (e.g. base layer) and a second sequence of pictures in a second layer (e.g. an enhancement layer), each of the first and second layers comprising a plurality of temporal sublayers. The encoder is configured to encode one or more of the temporal sublayers by using inter-layer prediction and to write a signalization into the scalable video data stream, said signalization (e.g. vps_sub_layer_independent_flag[i][j]) indicating which temporal sublayers of the second layer (e.g. enhancement layer) are coded by inter-layer prediction.

A fifth aspect concerns a video data stream comprising at least one picture being subdivided into tiles, and a tile-reordering flag, wherein

a) if the tile-reordering flag (e.g. sps_enforce_raster_scan_flag) in the data stream has a first state, it is signaled that tiles of the picture are to be coded using a first coding order which traverses the picture tile by tile, and/or

b) if the tile-reordering flag in the data stream has a second state, it is signaled that tiles of the picture are to be coded using a second coding order which traverses the picture along a raster scan order.
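The difference between the two coding orders can be sketched in Python as follows. The sketch assumes a uniform block grid described by tile column widths and tile row heights (in blocks) and numbers blocks by their raster-scan address. The function name and the boolean parameter are hypothetical, and which flag value corresponds to which state is left open, as in the text.

```python
from typing import List, Sequence


def block_coding_order(
    tile_col_widths: Sequence[int],
    tile_row_heights: Sequence[int],
    raster_scan: bool,
) -> List[int]:
    """List raster-scan block addresses in the order they would be coded."""
    pic_w = sum(tile_col_widths)
    pic_h = sum(tile_row_heights)
    if raster_scan:
        # Second coding order: traverse the whole picture row by row.
        return [y * pic_w + x for y in range(pic_h) for x in range(pic_w)]
    # First coding order: finish all blocks of one tile before the next tile.
    order = []
    y0 = 0
    for tile_h in tile_row_heights:
        x0 = 0
        for tile_w in tile_col_widths:
            for y in range(y0, y0 + tile_h):
                for x in range(x0, x0 + tile_w):
                    order.append(y * pic_w + x)
            x0 += tile_w
        y0 += tile_h
    return order


# A picture of 4x2 blocks split into two tiles of 2x2 blocks each.
print(block_coding_order([2, 2], [2], raster_scan=False))
print(block_coding_order([2, 2], [2], raster_scan=True))
```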

Furthermore, it is suggested to provide a decoder configured to decode a picture from a data stream, wherein:

a) if a tile-reordering flag (e.g. sps_enforce_raster_scan_flag) in the data stream has a first state, the decoder is configured to decode tiles of the picture from the data stream using a first decoding order which traverses the picture tile by tile, and/or

b) if the tile-reordering flag in the data stream has a second state, the decoder is configured to decode the tiles of the picture from the data stream using a second decoding order which traverses the picture along a raster scan order.

Furthermore, it is suggested to provide an encoder configured to encode a picture into a data stream, wherein:

a) the encoder is configured to set a tile-reordering flag (e.g. sps_enforce_raster_scan_flag) in the data stream into a first state, indicating that tiles of the picture are to be coded using a first coding order which traverses the picture tile by tile, and/or

b) the encoder is configured to set the tile-reordering flag in the data stream into a second state, indicating that tiles of the picture are to be coded using a second coding order which traverses the picture along a raster scan order.

In the following, embodiments of the present disclosure are described in more detail with reference to the figures, in which

Fig. 1 shows a schematic diagram of a Gradual Decoding Refresh concept according to an embodiment,

Fig. 2 shows a schematic diagram of a Gradual Decoding Refresh concept according to an embodiment,

Fig. 3 shows a schematic diagram of a Gradual Decoding Refresh concept using columns of picture regions according to an embodiment,

Fig. 4 shows a schematic diagram of a Gradual Decoding Refresh concept using rows of picture regions according to an embodiment,

Fig. 5 shows a schematic diagram of a Gradual Decoding Refresh concept using packets of picture regions according to an embodiment,

Fig. 6 shows a schematic diagram of a Gradual Decoding Refresh concept using a scalable multilayer video bitstream according to an embodiment, and

Fig. 7 shows a schematic diagram of a Gradual Decoding Refresh concept using a scalable multilayer video bitstream with multiple temporal sublayers according to an embodiment.

Equal or equivalent elements or elements with equal or equivalent functionality are denoted in the following description by equal or equivalent reference numerals.

Method steps which are depicted by means of a block diagram and which are described with reference to said block diagram may also be executed in an order different from the depicted and/or described order. Furthermore, method steps concerning a particular feature of a device may be replaceable with said feature of said device, and the other way around.

Furthermore, in this disclosure the terms frame and picture may be used interchangeably.

Figure 1 shows an example of a Gradual Decoding Refresh (GDR) coding structure which differs from conventional coding structures in the fact that, in order to achieve a clean picture, several pictures need to be decoded and the video areas are gradually decoded and refreshed until the content can be properly shown. In contrast, in conventional coding structures, only a particular form of an Intra frame (Random Access Point-RAP) is required to be present and the content can be instantaneously shown without having to decode more access units.

Figure 1 shows a sequence 100 of pictures 101₁, 101₂, ..., 101ₙ in a consecutive order. Each picture 101₁, 101₂, ..., 101ₙ may be divided into picture regions 102, for example into tiles. The tiles 102 may be intra-coded 102a or inter-coded 102b. Intra-coded picture regions 102a may provide for an access at which the decoder may start accessing the bitstream and start refreshing the entire picture according to the Gradual Decoding Refresh (GDR) principle. Frames or pictures comprising such an Intra-Coded picture region 102a may therefore also be referred to as an Access Unit (AU).

A picture comprising an Intra-Coded picture region 102a may also be referred to as a GDR-picture 103. In this example, every second picture may be a GDR-picture 103. Accordingly, the GDR-delta is two in this example. A picture region that has been Intra-Coded (i.e. an Intra-Coded picture region 102a) may also be referred to as a refreshed picture region or a clean picture region, respectively. Picture regions that have not yet been coded after an access may be referred to as non-refreshed picture regions or dirty picture regions, respectively.

Variant A: Full MCTS based

The refresh period (RP) is the time interval that needs to be waited until all picture regions 102 are refreshed and a clean picture can be shown. There are different forms of configuring such a bitstream. In Figure 1, the refresh period (RP), i.e. the number of frames until all picture regions 102 are refreshed, is (nine tiles x one newly refreshed tile at every second frame) - 1 = 9 x 2 - 1 = 17 frames.
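The refresh period arithmetic for Figure 1 can be written out as a short Python sketch. The function and parameter names are hypothetical, and the formula simply restates the calculation given in the text.

```python
def refresh_period(num_regions: int, frames_per_refresh: int) -> int:
    """Refresh period in frames, following the Figure 1 arithmetic.

    One region is newly refreshed every `frames_per_refresh` frames, so
    all regions are refreshed after num_regions * frames_per_refresh - 1.
    """
    return num_regions * frames_per_refresh - 1


# Nine tiles, one newly refreshed tile at every second frame.
print(refresh_period(num_regions=9, frames_per_refresh=2))
```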

Furthermore, in this example, a GDR picture 103 may be present every second picture, i.e. a picture at which a decoder can start accessing the bitstream and decode a full picture after a full RP. This can be achieved by encoding all picture regions 102 independently of each other over time (e.g. as motion-constrained tile sets, called MCTS in HEVC) and by spreading the intra coded blocks 102a in time with a distance of two frames among picture regions 102 in the example above.

Variant B: Constrained inter tiles (cf. Figure 2)

Figure 2 shows a further configuration, wherein a GDR structure can be achieved by having a GDR frame 103 every 18 frames and allowing dependencies among regions (no MCTS), but only dependencies on previously refreshed regions. In that case only one region is refreshed with intra blocks, and further regions can reference the regions already refreshed earlier in time.

This allows a better efficiency than the first shown configuration, since regions are not encoded with full MCTS.

In the non-limiting example shown in Figure 2 a GDR picture 103 is present every 18th picture while the refresh period (RP) comprises 17 frames.

Variant C: Column Based

Figure 3 shows a further non-limiting example, wherein a GDR picture 103 is present every 18th picture while 17 frames is the refresh period (RP). As can be seen, the Intra-Coded picture region 102a in this example may comprise a picture column instead of an above discussed picture tile.

Variant D: Row-based

Figure 4 shows a further non-limiting example, wherein a GDR picture 103 is present every 18th picture while 17 frames is the refresh period (RP). As can be seen, the Intra-Coded picture region 102a in this example may comprise a picture row instead of an above discussed picture tile.

In summary, for all of the above discussed non-limiting example variants of GDR, the RP may be 17 and the GDR periodicity may be D = 2 or D = 18 as shown in the table below.

M = number of tiles

RP = Refresh Period

D = delta GDR, distance between GDR pictures, i.e. possible decoding start.

A further important aspect to evaluate a GDR technology is the tune-in time required to show a picture, which consists of the RP plus the time that needs to be waited until a GDR picture 103 is found. The table above shows the tune-in-time in average and worst-case.
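The relation between RP, D and the tune-in time can be sketched in Python as follows. This is only an illustration under stated assumptions: a receiver is taken to tune in at a uniformly random picture position, so the wait for the next GDR picture ranges from 0 to D - 1 pictures, and the tune-in time is the RP plus that wait. The function name is hypothetical, and the counting convention may differ from the one used in the table of the original document.

```python
from typing import Tuple


def tune_in_times(rp: int, d: int) -> Tuple[float, int]:
    """Return (average, worst-case) tune-in time in pictures.

    Assumption: the wait for the next GDR picture is uniformly
    distributed over 0 .. d - 1 pictures.
    """
    average_wait = sum(range(d)) / d
    worst_wait = d - 1
    return rp + average_wait, rp + worst_wait


# Variant A style (D = 2) versus variants B to D style (D = 18), both RP = 17.
print(tune_in_times(rp=17, d=2))
print(tune_in_times(rp=17, d=18))
```

Under these assumptions a small D keeps both the average and the worst-case tune-in time close to the RP, which is the advantage of configuration A that the following problem statement refers to.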

Problems of different GDR scenarios:

1) As discussed above, configuration A is the one with the worst coding efficiency for the same RP, since regions are encoded as MCTS. However, the tune-in time for such a configuration is much smaller than for any of the configurations B-D. Mechanisms that allow reducing the tune-in time (either on average or in the worst case) while not harming the coding efficiency are desirable.

2) As can be seen in B-C, regions may not be defined in a static way, therefore reducing the signaling overhead and efficiency penalty of using to some extent independent regions, such as tiles for example. However, the identification of which regions are clean (refreshed) and which regions are not yet refreshed may not be unambiguous, and thus comes with some penalties; for example, intra prediction cannot easily be restricted between not-yet clean (dirty) regions and clean regions.

1 Dynamic GDR region signaling

In order to solve the issue of regions changing from picture to picture (i.e. a change from not-yet refreshed picture regions to a refreshed picture region), two inventive approaches can be envisioned to avoid the burden of having to send an updated PPS with every picture:

In the first inventive approach, several configurations of the picture regions, e.g. in the form of tiles, may be signaled within the SPS (Sequence Parameter Set) or PPS (Picture Parameter Set), and slice headers may carry an index indicating which tile configuration is in use for a given AU (Access Unit). Thus, a dynamic signaling of refreshed and yet un-refreshed picture regions may be provided with this inventive approach.
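A minimal Python sketch of this indexing idea is given below. It is not the bitstream syntax: the classes `RegionConfig`, `ParameterSet` and `SliceHeader` and the helper functions are hypothetical, regions are identified by simple integer indices, and the example configurations assume a column-wise refresh in which one more column becomes refreshed with each configuration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class RegionConfig:
    refreshed: FrozenSet[int]     # indices of refreshed picture regions
    unrefreshed: FrozenSet[int]   # indices of yet un-refreshed regions


@dataclass
class ParameterSet:
    configs: List[RegionConfig]   # sent once, e.g. in an SPS or PPS


@dataclass
class SliceHeader:
    region_configuration_idx: int  # sent per picture


def build_column_configs(num_columns: int) -> List[RegionConfig]:
    """Hypothetical example: config k has columns 0..k refreshed."""
    return [
        RegionConfig(
            refreshed=frozenset(range(k + 1)),
            unrefreshed=frozenset(range(k + 1, num_columns)),
        )
        for k in range(num_columns)
    ]


def resolve_config(ps: ParameterSet, header: SliceHeader) -> RegionConfig:
    """Look up the configuration a picture points to."""
    idx = header.region_configuration_idx
    if not 0 <= idx < len(ps.configs):
        raise ValueError(f"configuration index {idx} out of range")
    return ps.configs[idx]


ps = ParameterSet(configs=build_column_configs(4))
cfg = resolve_config(ps, SliceHeader(region_configuration_idx=1))
print(sorted(cfg.refreshed), sorted(cfg.unrefreshed))
```

Because every picture only carries an index, the set of configurations is transmitted once rather than as an updated parameter set with every picture.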

In this disclosure, a picture area may comprise an entire frame 101 or picture 101. A picture area may be divided into picture sub-areas. As exemplarily shown in the fourth frame in Figure 2, a first picture sub-area 101r may comprise only refreshed picture regions 102r. This first picture sub-area 101r may therefore also be referred to as a refreshed picture sub-area 101r. On the other hand, a second picture sub-area 101u may only comprise yet un-refreshed picture regions 102u. This second picture sub-area 101u may therefore also be referred to as a yet un-refreshed picture sub-area 101u.

A picture region 102 in general may comprise at least one of a picture tile (Figures 1 and 2), a picture tile column (Figure 3), a picture tile line (Figure 4), a coding block (e.g. CTUs), a coding block line, a coding block row, a coding block diagonal, a sample, a row of samples, or a column of samples.

A refreshed picture region 102r may comprise at least one of a refreshed picture tile (Figures 1 and 2), a refreshed picture tile column (Figure 3), a refreshed picture tile line (Figure 4), a refreshed coding block (e.g. CTUs), a refreshed coding block line, a refreshed coding block row, a refreshed coding block diagonal, a refreshed sample, a refreshed row of samples, or a refreshed column of samples.

A yet un-refreshed picture region 102u may comprise at least one of a yet un-refreshed picture tile (Figures 1 and 2), a yet un-refreshed picture tile column (Figure 3), a yet un-refreshed picture tile line (Figure 4), a yet un-refreshed coding block (e.g. CTUs), a yet un-refreshed coding block line, a yet un-refreshed coding block row, a yet un-refreshed coding block diagonal, a yet un-refreshed sample, a yet un-refreshed row of samples, or a yet un-refreshed column of samples.

According to an embodiment of the first aspect of the present invention, a video data stream may be provided, the video data stream comprising a sequence 100 of pictures 101₁, 101₂, ..., 101ₙ comprising at least one Gradual Decoder Refresh - GDR - coded picture 103 and one or more subsequent pictures in a refresh period RP. The video data stream further comprises a parameter set (e.g. SPS or PPS) defining a plurality of picture configurations, which subdivide a picture area 101 (e.g. an entire frame) into a first sub-area 101r (e.g. first one or more picture regions 102 comprising tiles, rows, columns, etc.) and a second sub-area 101u (e.g. second one or more picture regions 102 comprising tiles, rows, columns, etc.) among which one sub-area corresponds to a refreshed sub-area 101r comprising one or more (i.e. a set of) refreshed picture regions (e.g. tiles) and the other sub-area 101u corresponds to an un-refreshed sub-area comprising one or more (i.e. a set of) yet un-refreshed picture regions. According to the inventive principle, the video data stream comprises for each picture 101₁, 101₂, ..., 101ₙ within the refresh period RP a picture configuration identifier (e.g. region_configuration_idx) for identifying a corresponding one picture configuration out of the plurality of picture configurations.

According to a further embodiment, it is suggested to provide a corresponding decoder for decoding from a data stream at least one picture out of a sequence 100 of pictures 101₁, 101₂, ..., 101ₙ comprising at least one Gradual Decoder Refresh - GDR - coded picture 103 and one or more subsequent pictures 101₂, ..., 101ₙ in a refresh period - RP, wherein the decoder is configured to read from the data stream a parameter set (e.g. PPS or SPS) defining a plurality of picture configurations, which subdivide a picture area (e.g. an entire frame) 101 into a first sub-area 101r and a second sub-area 101u among which one corresponds to a refreshed sub-area (e.g. a set of refreshed picture regions) 101r comprising one or more refreshed picture regions (e.g. refreshed tiles) 102r and the other one corresponds to an un-refreshed sub-area 101u comprising one or more yet un-refreshed picture regions 102u. The decoder may further be configured to read from the data stream, for each picture 101₁, 101₂, ..., 101ₙ within the refresh period - RP, a picture configuration identifier (region_configuration_idx) for identifying a corresponding one picture configuration out of the plurality of picture configurations for decoding the at least one picture 101₁, 101₂, ..., 101ₙ.

According to a further embodiment, it is suggested to provide a corresponding encoder for encoding into a data stream at least one picture out of a sequence 100 of pictures 101₁, 101₂, ..., 101ₙ comprising at least one Gradual Decoder Refresh - GDR - coded picture 103 and one or more subsequent pictures 101₂, ..., 101ₙ in a refresh period - RP, wherein the encoder is configured to write into the data stream a parameter set defining a plurality of picture configurations, which subdivide a picture area 101 into a first sub-area 101r and a second sub-area 101u among which one corresponds to a refreshed sub-area 101r comprising one or more refreshed picture regions 102r and the other one corresponds to an un-refreshed sub-area 101u comprising one or more yet un-refreshed picture regions 102u. The encoder is further configured to set in the data stream, for each picture 101₁, 101₂, ..., 101ₙ within the refresh period - RP, a picture configuration identifier for identifying a corresponding one picture configuration out of the plurality of picture configurations for decoding the at least one picture 101₁, 101₂, ..., 101ₙ.

According to a further embodiment, it is suggested to provide a corresponding method for decoding from a data stream at least one picture out of a sequence 100 of pictures 101₁, 101₂, ..., 101ₙ comprising at least one Gradual Decoder Refresh - GDR - coded picture 103 and one or more subsequent pictures 101₂, ..., 101ₙ in a refresh period - RP, the method comprising steps of reading from the data stream a parameter set defining a plurality of picture configurations, which subdivide a picture area 101 into a first sub-area 101r and a second sub-area 101u among which one corresponds to a refreshed sub-area 101r comprising one or more refreshed picture regions 102r and the other one corresponds to an un-refreshed sub-area 101u comprising one or more yet un-refreshed picture regions 102u. The method further comprises a step of reading from the data stream, for each picture 101₁, 101₂, ..., 101ₙ within the refresh period - RP, a picture configuration identifier for identifying a corresponding one picture configuration out of the plurality of picture configurations for decoding the at least one picture 101₁, 101₂, ..., 101ₙ.

According to a further embodiment, it is suggested to provide a corresponding method for encoding into a data stream at least one picture out of a sequence 100 of pictures 1011, 1012, ..., 101 n comprising at least one Gradual Decoder Refresh - GDR - coded picture 103 and one or more subsequent pictures 1012, ..., 101 n in a refresh period - RP, the method comprising steps of writing into the data stream a parameter set defining a plurality of picture configurations, which subdivide a picture area 101 into a first sub-area 101 r and a second sub-area 101u among which one corresponds to a refreshed sub-area 101 r comprising one or more refreshed picture regions 102r and the other one corresponds to an un-refreshed sub-area 101u comprising one or more yet un-refreshed picture regions 102u. The method further comprises a step of setting in the data stream, for each picture 1011, 1012, ..., 101 n within the refresh period - RP, a picture configuration identifier for identifying a corresponding one picture configuration out of the plurality of picture configurations.

As mentioned above, the sequence 100 of pictures 1011, 1012, ..., 101 n may be coded in different ways. For instance, the sequence 100 of pictures 1011, 1012, ..., 101 n may be coded in a manner so that intra prediction does not cross a boundary between the first and second sub-areas 101 r, 101u. Additionally or alternatively, the sequence 100 of pictures 1011, 1012, ..., 101 n may be coded in a manner so that temporal prediction of the refreshed sub-area 101 r does not reference the yet un-refreshed sub-area 101u. Additionally or alternatively, the sequence 100 of pictures 1011, 1012, ..., 101 n may be coded in a manner so that context model derivation does not cross a boundary between the first and second sub-areas 101 r, 101u.

According to an advantageous embodiment, the corresponding one picture configuration indicates the refreshed picture regions 102r and the yet un-refreshed picture regions 102u contained in a currently coded picture 1011, 1012, .... 101 n of the sequence 100 of pictures.

According to a further advantageous embodiment, each picture configuration out of the plurality of picture configurations may comprise a set of region indices for signaling which picture regions 102 are refreshed picture regions 102r and which picture regions 102 are un-refreshed picture regions 102u. This provides for an explicit signaling of refreshed and yet un-refreshed picture regions 102r, 102u.

For example, as shown and previously discussed with reference to Figures 1 and 2, the pictures 1011, 1012, ..., 101 n contained in the sequence 100 of pictures may be subdivided into one or more tiles 102. In this case, each picture configuration out of the plurality of picture configurations may comprise a set of tile indices for signaling which picture tiles 102 are refreshed picture tiles 102r and which picture tiles 102 are un-refreshed picture tiles 102u. Thus, in one embodiment the region configuration may contain a set of tile indices.

According to a further embodiment, as exemplarily discussed with reference to Figure 3, the pictures 1011, 1012, ..., 101 n contained in the sequence 100 of pictures may be subdivided into picture tile columns, wherein each picture configuration out of the plurality of picture configurations comprises at least one column index for signaling which picture columns are refreshed picture columns 102r and/or which picture columns are un-refreshed picture columns 102u. Accordingly, the region configuration may contain a tile column index.

According to a further embodiment, as exemplarily discussed with reference to Figure 4, the pictures 1011, 1012, .... 101 n contained in the sequence 100 of pictures may be subdivided into picture tile rows, wherein each picture configuration out of the plurality of picture configurations comprises at least one row index for signaling which picture rows are refreshed picture rows 102r and/or which picture rows are un-refreshed picture rows 102u. Accordingly, the region configuration may contain a tile row index.

A picture region may also be represented by a coding block, e.g. by a CTU (Coding Tree Unit).

According to a further embodiment, the pictures 1011, 1012, .... 101 n contained in the sequence 100 of pictures may be subdivided into rows of coding blocks (e.g. CTUs), wherein each picture configuration out of the plurality of picture configurations may comprise at least one row coding block index for signaling which rows of coding blocks are refreshed rows 102r of coding blocks and/or which rows of coding blocks are un-refreshed rows 102u of coding blocks. Accordingly, the region configuration may contain a CTU row index.

According to a further embodiment, the pictures 1011, 1012, ..., 101 n contained in the sequence 100 of pictures may be subdivided into columns of coding blocks (e.g. CTUs), wherein each picture configuration out of the plurality of picture configurations may comprise at least one column coding block index for signaling which columns of coding blocks are refreshed columns 102r of coding blocks and/or which columns of coding blocks are un-refreshed columns 102u of coding blocks. Accordingly, the region configuration may contain a CTU column index.

According to a further embodiment, the pictures 1011, 1012, ..., 101 n contained in the sequence 100 of pictures may be subdivided into diagonals of coding blocks (e.g. CTUs), wherein each picture configuration out of the plurality of picture configurations may comprise at least one diagonal coding block index for signaling which diagonals of coding blocks are refreshed diagonals 102r of coding blocks and/or which diagonals of coding blocks are un-refreshed diagonals 102u of coding blocks. Accordingly, the region configuration may contain one or more indices of a CTU diagonal.

A picture region may also be represented by samples.

According to a further embodiment, the pictures 1011, 1012, ..., 101 n contained in the sequence 100 of pictures may be subdivided into rows of samples, wherein each picture configuration out of the plurality of picture configurations may comprise at least one sample row index for signaling which rows of samples are refreshed rows 102r of samples and/or which rows of samples are un-refreshed rows 102u of samples. Accordingly, the region configuration may contain one or more sample row indexes.

According to a further embodiment, the pictures 1011, 1012, .... 101 n contained in the sequence 100 of pictures may be subdivided into columns of samples, wherein each picture configuration out of the plurality of picture configurations may comprise at least one sample column index for signaling which columns of samples are refreshed columns 102r of samples and/or which columns of samples are un-refreshed columns 102u of samples. Accordingly, the region configuration may contain one or more sample column indexes.

According to a yet further embodiment, the corresponding one picture configuration may be signaled in a slice header and/or in an Access Unit Delimiter of the video data stream.

Then the slice header would indicate which configuration is used:

Additionally or alternatively, the information about the used region configuration is included into the Access Unit Delimiter (AUD).
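The parameter-set-based signaling described above can be illustrated with a minimal sketch. It is not syntax defined in this specification: the names `PictureConfiguration`, `ParameterSet` and `split_regions`, and the choice of four column-like regions, are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class PictureConfiguration:
    # Hypothetical container: indices of the regions that are refreshed.
    refreshed_region_indices: FrozenSet[int]


@dataclass(frozen=True)
class ParameterSet:
    num_regions: int
    # Maps a picture configuration identifier to one configuration.
    configurations: Dict[int, PictureConfiguration]


def split_regions(ps, config_id):
    """Return (refreshed, un_refreshed) region index sets for one picture."""
    cfg = ps.configurations[config_id]
    refreshed = set(cfg.refreshed_region_indices)
    return refreshed, set(range(ps.num_regions)) - refreshed


# Assumed example: four regions, refreshed one after another over four pictures.
ps = ParameterSet(
    num_regions=4,
    configurations={i: PictureConfiguration(frozenset(range(i + 1))) for i in range(4)},
)

# One identifier per picture, as it might be carried in a slice header or AUD.
per_picture_config_ids = [0, 1, 2, 3]
for picture_index, config_id in enumerate(per_picture_config_ids):
    refreshed, un_refreshed = split_regions(ps, config_id)
    print(picture_index, sorted(refreshed), sorted(un_refreshed))
```

In this sketch the parameter set is sent once, and each picture only carries a small identifier selecting one of the predefined configurations.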

According to a further embodiment, it is suggested to provide a video data stream comprising a sequence 100 of pictures 1011 , 1012, .... 101 n comprising at least one Gradual Decoder Refresh - GDR - coded picture 103 and one or more subsequent pictures 1012, .... 101n in a refresh period - RP. Each picture of the sequence 100 of pictures 1011 , 1012, ..., 101 n may be sequentially coded into the video data stream in units of blocks 102 (e.g. CTUs) into which the respective picture is subdivided. The video data stream may comprise an implicit signaling, wherein a refreshed sub-area 102r of a respective picture 1011, 1012, .... 101 n is implicitly signaled in the video data stream based on a block coding order.

Also, a respective decoder is suggested, i.e. a decoder for decoding from a data stream at least one picture out of a sequence 100 of pictures 1011, 1012, ..., 101 n comprising at least one Gradual Decoder Refresh - GDR - coded picture 103 and one or more subsequent pictures in a refresh period RP. Each picture of the sequence 100 of pictures 1011, 1012, ..., 101 n may be sequentially decoded from the video data stream in units of blocks 102 (e.g. CTUs) into which the respective picture is subdivided. The decoder may be configured to implicitly derive from the data stream a refreshed sub-area 102r of the at least one picture based on a block coding order.

Also, a respective encoder is suggested, i.e. an encoder for encoding into a data stream at least one picture out of a sequence 100 of pictures 1011, 1012, ..., 101 n comprising at least one Gradual Decoder Refresh - GDR - coded picture 103 and one or more subsequent pictures in a refresh period RP. Each picture of the sequence 100 of pictures 1011, 1012, ..., 101 n may be sequentially encoded into the video data stream in units of blocks 102 (e.g. CTUs) into which the respective picture is subdivided. The encoder may be configured to implicitly derive from the data stream a refreshed sub-area 101 r of the at least one picture based on a block coding order.

Thus, in this approach a CTU based signaling of the region boundary may be used. Each CTU (picture region) 102 may contain a flag (which can be CABAC-coded) indicating whether it is the last CTU 102 of the GDR region 103 or not. This signaling affects, for instance, the availability of samples for intra prediction and/or the CABAC reset. The benefit of such an approach is that it is more flexible, since it is not limited to a fixed grid defined in a parameter set.

Thus, according to an embodiment, the syntax element is for indicating a boundary between a refreshed sub-area 101 r and a yet un-refreshed sub-area 101u of a picture 101 out of the sequence 100 of pictures 1011, 1012, .... 101 n and/or for indicating which sub-area is a refreshed sub-area 101 r and which sub-area is a yet un-refreshed sub-area 101 u.

With each CTU 102 indicating whether it is the last CTU in the GDR region 103, it is beneficial to identify whether the last CTU 102 in a region means the last CTU 102 in terms of rows or columns. Such an indication may be done in a parameter set, e.g. in SPS.

For example, the flag region_horizontal_flag equal to 1 may indicate that the last CTU flag in a CTU indicates a horizontal split. Otherwise, it indicates a vertical split.
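As a rough sketch of how a decoder might turn the last-CTU flag and region_horizontal_flag into a refreshed region, consider the following. It rests on assumptions not stated in the text: the refreshed region starts at the top-left CTU, a horizontal split covers whole CTU rows, and a vertical split covers whole CTU columns. The function name `refreshed_ctus` is hypothetical.

```python
def refreshed_ctus(width_ctus, height_ctus, last_ctu_addr, region_horizontal_flag):
    """Return the raster-scan CTU addresses assumed to belong to the refreshed region.

    Assumption of this sketch: the region starts at the top-left CTU and ends at
    the CTU whose last-CTU flag is set (raster-scan address last_ctu_addr).
    """
    last_row, last_col = divmod(last_ctu_addr, width_ctus)
    refreshed = set()
    for addr in range(width_ctus * height_ctus):
        row, col = divmod(addr, width_ctus)
        if region_horizontal_flag:
            inside = row <= last_row  # horizontal split: whole CTU rows
        else:
            inside = col <= last_col  # vertical split: whole CTU columns
        if inside:
            refreshed.add(addr)
    return refreshed


# 4x3 CTU picture, last CTU of the region at address 5 (row 1, column 1).
print(sorted(refreshed_ctus(4, 3, 5, 1)))  # rows 0 and 1
print(sorted(refreshed_ctus(4, 3, 5, 0)))  # columns 0 and 1
```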

Thus, according to a further embodiment, a video data stream, an encoder and a decoder are suggested, wherein

in case that the syntax element indicates that

a) the block 102 is a last block located in a refreshed sub-area 101 r of a respective picture 101 and lastly coded,

the video data stream comprises a further syntax element (e.g. region_horizontal_flag) for indicating whether

a1) the block 102 is a lastly coded block of one or more rows of blocks 102 of the refreshed sub-area 101 r, or

a2) the block 102 is a lastly coded block of one or more columns of blocks 102 of the refreshed sub-area 101 r.

According to a further embodiment, the further syntax element may indicate whether the last block derives from a horizontal split (e.g. region_horizontal_flag = 1) or from a vertical split of a coding split tree according to which the respective picture 101 is subdivided into blocks 102.

In a further embodiment it may be indicated whether the region is an intra-prediction break, i.e. neighbors of another region are not available for prediction, or CABAC, etc. Thus, the video data stream may comprise an intra-prediction break indication for indicating that neighboring blocks 102 of a neighboring picture region 102u are not available for prediction, e.g. if said neighboring picture region 102u is contained in a yet un-refreshed sub-area 101u.

In both cases defined above the grid of the regions used may be aligned to the CTU sizes. In other words, the refreshed sub-area 101 r may comprise one or more refreshed picture regions 102r which are arranged in a grid that is aligned with the size of the blocks 102 into which the respective picture 101 is subdivided.

In the embodiments described so far, it is not necessarily known which of the regions is a refreshed (clean) region and which is a not yet refreshed (dirty) region. In this case, all regions are considered to be “independent” from each other in all or some of the following aspects:

• intra-prediction break, i.e. neighbors of another region are not available for prediction,

• spatial/temporal MV prediction

• CABAC

Alternatively, the signaling implicitly indicates that the left-most region 101 r is a clean region and the availability of intra blocks is constrained for this region - i.e. the blocks 102 in the left-most region cannot use blocks of another (non-left-most) region for all or some of the following aspects:

• intra-prediction break, i.e. neighbors of another region are not available for prediction,

• spatial/temporal MV prediction

• CABAC

Thus, according to an embodiment, the implicit signaling may signal that a first block 102 at a predetermined position in the block coding order (e.g. a first CTU in upper left corner) is part of the refreshed-sub area 101 r.
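The two availability rules described above can be sketched as a single check. This is an illustrative assumption rather than a normative derivation: the function name `neighbor_available` and the integer region identifiers are hypothetical, and the sketch assumes that only the clean region is constrained once it is known.

```python
def neighbor_available(cur_region, nb_region, clean_region=None):
    """Decide whether a neighboring block may be used (intra prediction,
    MV prediction, CABAC context derivation) by the current block.

    cur_region, nb_region: region identifiers of the current and neighboring block.
    clean_region: identifier of the region known to be refreshed, or None if the
    clean/dirty status of the regions is not known.
    """
    if cur_region == nb_region:
        return True
    if clean_region is None:
        # Status unknown: all regions are treated as independent of each other.
        return False
    # Status known: only blocks of the clean region must not look into other regions.
    return cur_region != clean_region


print(neighbor_available(0, 1))                  # regions independent
print(neighbor_available(0, 1, clean_region=0))  # clean block, dirty neighbor
print(neighbor_available(1, 0, clean_region=0))  # dirty block, clean neighbor
```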

As an alternative to the above described implicit derivation of the refreshed and yet un-refreshed regions 102r, 102u, some embodiments of the present invention may provide for an explicit signaling, wherein a video data stream, a respective encoder and a respective decoder are suggested, wherein the video data stream may comprise, for each block 102, a syntax element indicating whether

a) the block 102 is a last block located in a first sub-area 101 r of a respective picture and lastly coded (e.g. flag: last_ctu_of_gdr_region), and/or

b) the block 102 is a first block located in a first sub-area 101 r of a respective picture and firstly coded (e.g. flag: first_ctu_of_gdr_region), and/or

c) the block 102 adjoins a border confining a first sub-area 101 r, and/or

d) the block 102 is located inside a first sub-area 101 r (e.g. flag: gdr_region_flag).

In other words, as an alternative to the above described implicit derivation of the refreshed and yet un-refreshed regions 102r, 102u, it is suggested to explicitly indicate which region is a clean region 102r and which not, as discussed in the following embodiments.

In an embodiment, in addition to indicating the end of the GDR region 103, also the start of the GDR region 103 may be indicated at CTU level, e.g. by using a flag (CABAC-coded).

This would be helpful for an MCTS-like refresh-approach (c.f. Figure 1), where the refreshed region 101 r is always coded independently.
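A minimal sketch of how per-CTU start and end flags could be turned into a per-CTU membership decision is given below. It assumes that the flags are read in block coding order and that the flagged first and last CTUs themselves belong to the region. The function name `region_membership` is hypothetical.

```python
def region_membership(first_flags, last_flags):
    """Derive, in coding order, whether each CTU lies inside the GDR region.

    first_flags[i] / last_flags[i] model first_ctu_of_gdr_region and
    last_ctu_of_gdr_region of the i-th CTU in coding order (assumed semantics).
    """
    inside = False
    membership = []
    for first, last in zip(first_flags, last_flags):
        if first:
            inside = True          # region starts at this CTU
        membership.append(inside)
        if last:
            inside = False         # region ends after this CTU
    return membership


first = [0, 1, 0, 0, 0]
last = [0, 0, 0, 1, 0]
print(region_membership(first, last))
```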

In another embodiment the CTU based region start and/or end flags may be signaled only in the first CTU column of a tile 102, if horizontal region splits are enabled, and in the first CTU row of a tile 102, if vertical region splits are enabled.

Thus, according to an embodiment, a picture region of the respective picture 101 may be vertically subdivided into one or more slices 102, wherein the syntax element (e.g. flag: last_ctu_of_gdr_region //flag: first_ctu_of_gdr_region) is signaled for each slice 102.

According to a further embodiment, a picture region of the respective picture 101 may be horizontally subdivided into one or more rows of blocks 102, wherein the syntax element (e.g. flag: last_ctu_of_gdr_region // flag: first_ctu_of_gdr_region) is signaled

i. only in the first row, or

ii. in every row.

In another embodiment a CTU based (CABAC-coded) flag may be signaled, indicating whether the CTU 102 is part of the GDR refresh region 103 or not.

In another embodiment the CTU start and/or end indexes of the GDR refresh region 103 may be signaled in the slice header.

One of the benefits of the above described embodiments is that picture regions may be decoupled from the usage of tiles 102 and thereby the scan order may not be affected. In most of the applications which use GDR, low delay transmission is desired. In order to achieve low delay transmission, all packets sent should have the same size and not only all AUs. Typically, in those low delay scenarios each AU may be split into multiple packets and in order to achieve that all packets have the same size (or very similar), each packet should have the same amount of blocks 102r that are refreshed (belong to the clean area 101 r) and of blocks 102u that are not refreshed (belong to the dirty area 101u).

Figure 5 shows a non-limiting example of a sequence 100 of pictures 1011, 1012, ..., 101 n, wherein each picture 1011, 1012, ..., 101 n may be split into multiple packets 501a, 501b, ..., 501 n. In order to achieve that all packets have the same size (or very similar), each packet 501a, 501b, ..., 501 n may have the same amount of blocks 102r that are refreshed (belong to the clean area 101 r) and of blocks 102u that are not refreshed (belong to the dirty area 101u).

If tiles were used for that purpose, the tile scan order would be in use and therefore the packets 501a, 501b, ..., 501 n could not have the same amount of blocks 102r that are refreshed (belong to the clean area 101 r) and of blocks 102u that are not refreshed (belong to the dirty area 101u).

In another embodiment, tiles are used but a syntax element is added to the parameter set that enforces raster scan order instead of tile scan order, e.g. sps_enforce_raster_scan_flag. In that case, raster scan would be used and no byte alignment would happen within the bitstream for CTUs starting a new tile.

Thus, according to an embodiment, a video data stream is suggested comprising at least one picture 101 being subdivided into tiles 102, and a tile-reordering flag (e.g. sps_enforce_raster_scan_flag), wherein

a) if the tile-reordering flag in the data stream has a first state, it is signaled that tiles 102 of the picture 101 are to be coded using a first coding order which traverses the picture 101 tile by tile, and/or

b) if the tile-reordering flag in the data stream has a second state, it is signaled that tiles 102 of the picture 101 are to be coded using a second coding order which traverses the picture 101 along a raster scan order.

A further embodiment suggests a corresponding decoder that may be configured to decode a picture 101 from a data stream, wherein:

a) if a tile-reordering flag (e.g. sps_enforce_raster_scan_flag) in the data stream has a first state, the decoder is configured to decode tiles 102 of the picture 101 from the data stream using a first decoding order which traverses the picture 101 tile by tile, and/or

b) if the tile-reordering flag in the data stream has a second state, the decoder is configured to decode the tiles 102 of the picture 101 from the data stream using a second decoding order which traverses the picture 101 along a raster scan order.

As mentioned above with reference to Figure 5, the decoder may be configured to decode the picture 101 using a Gradual Decoding Refresh - GDR - approach, wherein the picture 101 may be part of a sequence 100 of pictures 1011, 1012, ..., 101 n which comprises at least one GDR coded picture 103 and one or more subsequent pictures, wherein the picture 101 is block-wise decoded and partitioned into multiple packets 501a, 501b, ..., 501 n, wherein two or more packets (and preferably each packet) comprise the same amount of blocks 102r that are refreshed and/or the same amount of blocks that are yet un-refreshed 102u.
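The effect of the coding order on the packet balance can be illustrated with a small sketch. It makes several simplifying assumptions that are not part of the specification: a single tile row, tile columns given by their starting CTU column, packets holding a fixed number of CTUs, and a boolean `enforce_raster_scan` standing in for the second state of the tile-reordering flag. The function names are hypothetical.

```python
def ctu_coding_order(width, height, tile_col_starts, enforce_raster_scan):
    """Return CTU positions (row, col) in coding order.

    tile_col_starts: first CTU column of each tile column, e.g. [0, 1].
    """
    if enforce_raster_scan:
        # Raster scan across the whole picture, ignoring tile boundaries.
        return [(r, c) for r in range(height) for c in range(width)]
    # Tile scan: raster scan inside each tile, tile after tile.
    bounds = list(tile_col_starts) + [width]
    order = []
    for t in range(len(tile_col_starts)):
        for r in range(height):
            for c in range(bounds[t], bounds[t + 1]):
                order.append((r, c))
    return order


def refreshed_per_packet(order, refreshed_cols, ctus_per_packet):
    """Count refreshed CTUs in each packet of ctus_per_packet consecutive CTUs."""
    counts = []
    for i in range(0, len(order), ctus_per_packet):
        packet = order[i:i + ctus_per_packet]
        counts.append(sum(1 for (_, c) in packet if c in refreshed_cols))
    return counts


# 4x4 CTUs, CTU column 0 is the refreshed tile, columns 1..3 form the other tile.
raster = ctu_coding_order(4, 4, [0, 1], enforce_raster_scan=True)
tiles = ctu_coding_order(4, 4, [0, 1], enforce_raster_scan=False)
print(refreshed_per_packet(raster, {0}, 4))  # [1, 1, 1, 1]
print(refreshed_per_packet(tiles, {0}, 4))   # [4, 0, 0, 0]
```

With raster scan, every packet in this example carries one refreshed CTU, whereas tile scan concentrates all refreshed CTUs in the first packet.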

A further embodiment suggests a corresponding encoder configured to encode a picture 101 into a data stream, wherein:

a) the encoder may be configured to set a tile-reordering flag (e.g. sps_enforce_raster_scan_flag) in the data stream into a first state, indicating that tiles 102 of the picture 101 are to be coded using a first coding order which traverses the picture 101 tile by tile, and/or

b) the encoder is configured to set the tile-reordering flag in the data stream into a second state, indicating that tiles 102 of the picture 101 are to be coded using a second coding order which traverses the picture 101 along a raster scan order.

2. Scalable GDR restrictions

In case that GDR is done for a scalable bitstream, it could be possible to have a RP as discussed herein where the highest quality is achieved, while at the same time a low-quality RP (LQRP) that is smaller could be achieved, where a not-yet refreshed region 101u at the highest quality may be substituted with samples of the low quality content of a lower layer.

Figure 6 shows a non-limiting example of a GDR approach using a scalable bitstream 600 having a first layer (e.g. a base layer - BL) 601 and a second layer (e.g. an enhancement layer - EL) 602, wherein missing refreshed regions (e.g. not yet refreshed or un-refreshed picture regions) 102u in a yet un-refreshed picture sub-area 101 u in the second layer (e.g. Enhancement Layer - EL) 602 can be substituted with upsampled samples of a refreshed picture region 202r of a refreshed picture sub-area 201 r of the first layer (e.g. base layer - BL) 601.

In one embodiment, the decoding process of the EL 602 would manage the status of the defined GDR regions (refreshed since GDR or not) and indicate for each region whether it is initialized per layer or not. If a region is not initialized, the resampling process of a reference layer 601 for that region would be carried out and sample values would be substituted.

Thereby, when decoding starts at the access unit containing the EL GDR picture 103, higher layer pictures can instantly be presented to the user, gradually being updated to the EL quality over the course of a RP.
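The substitution step can be sketched as follows. This is only an illustration under stated assumptions: nearest-neighbor upsampling stands in for the actual resampling process of the reference layer, the initialization status is tracked per sample rather than per region, and the function names are hypothetical.

```python
def upsample_nearest(bl, factor):
    """Nearest-neighbor upsampling of a 2D list (stand-in for the real resampling)."""
    return [[bl[r // factor][c // factor] for c in range(len(bl[0]) * factor)]
            for r in range(len(bl) * factor)]


def substitute_uninitialized(el, initialized, bl, factor):
    """Replace EL samples at not-yet-initialized positions by upsampled BL samples.

    el: 2D list of EL samples; initialized: 2D list of bools of the same size.
    """
    up = upsample_nearest(bl, factor)
    return [[el[r][c] if initialized[r][c] else up[r][c]
             for c in range(len(el[0]))]
            for r in range(len(el))]


bl = [[10, 20]]                        # 1x2 base layer picture
el = [[1, 2, 0, 0], [3, 4, 0, 0]]      # 2x4 EL picture, right half not yet refreshed
initialized = [[True, True, False, False], [True, True, False, False]]
print(substitute_uninitialized(el, initialized, bl, 2))
# [[1, 2, 20, 20], [3, 4, 20, 20]]
```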

However, constraints in the bitstreams are necessary for the above procedure to function.

Thus, according to an embodiment, a multi layered scalable video data stream 600 is suggested comprising a first sequence 200 of pictures 2011, 2012, .... 201 n in a first layer (e.g. base layer) 601 and a second sequence 100 of pictures 1011, 1012, .... 101 n in a second layer (e.g. an enhancement layer) 602. The second sequence 100 of pictures 1011, 1012, ..., 101 n in the second layer 602 may comprise at least one Gradual Decoder Refresh -GDR - picture 103 as a start picture and one or more subsequent pictures in a refresh period - RP, wherein the multi layered scalable video data stream 600 may comprise a signalization carrying information about a possibility that a yet un-refreshed sub-area 101 u of the GDR picture 103 of the second layer 602 is to be inter-layer predicted from samples 202r of the first layer 601. The signalization may further carry information:

• that in yet un-refreshed sub-areas 101u of the one or more subsequent pictures contained in the refresh period - RP, motion vector prediction is disabled or motion vector prediction is realized non-temporally, or

• that in a yet un-refreshed sub-area 101 u of the GDR picture 103 motion vector prediction is disabled or motion vector prediction is realized non- temporally.

In one embodiment, TMVP (Temporal Motion Vector Prediction) or sub-block TMVP (i.e. syntax based prediction of motion vectors) may be disabled for the non-refreshed regions 101 u in GDR pictures 103 so that when samples 102u are substituted by upsampled BL samples 202r, the following EL pictures 1011, ..., 101 n can use the substituted samples 202r for prediction, which significantly reduces encoder/decoder drift compared to using wrong motion vectors as would occur without the constraint.

Accordingly, a basic principle of this aspect suggests to provide the multi layered scalable video data stream 600, wherein said inter-layer prediction from samples 202r of the first layer 601 may comprise substituting one or more samples 102u of the yet un-refreshed sub-area 101u of the GDR picture 103 by an upsampled version of refreshed samples 202r of the first layer 601.

A further embodiment suggests that all samples 102u of the entire yet un-refreshed sub-area 101u of the GDR picture 103 may be substituted by an upsampled version of refreshed samples 202r of the first layer 601 so that pictures 1011, 1012, .... 101 n from the second sequence 100 of coded pictures in the second layer 602 may be instantly presentable to a user.

A further embodiment suggests that yet un-refreshed sub-areas 101u of the one or more subsequent pictures 1012, ..., 101 n of the second layer 100 may be refreshed by intra-layer prediction (e.g. inside the second layer 602) using the upsampled substitute samples 202r from the first layer 601 which are gradually updated to refreshed samples 102r of the second layer 602.

In another embodiment, combined motion vector candidates in the merge list that are influenced by TMVP or sub-block TMVP candidates are forbidden so that when samples 101u are substituted by upsampled BL samples 202r, the following EL pictures do not rely on incorrect motion vectors on decoder side.

In another embodiment, the same constraint is active for Decoder-side Motion Vector Refinement (DMVR) (i.e. motion vector refinement based on reference sample values), which would otherwise also lead to severe artifacts.

Thus, according to an embodiment, at least one of the following coding concepts is disabled for coding the yet un-refreshed sub-areas 101u of the one or more subsequent pictures 1012, ..., 101 n contained in the refresh period - RP:

• Temporal Motion Vector Prediction (TMVP)

• Advanced Temporal Motion Vector Prediction (ATMVP)

• TMVP-influenced candidates, e.g. motion vector candidates in the merge list which are influenced by TMVP or sub-block TMVP

• Decoder Side Motion Vector Refinement (DMVR)

According to a further embodiment, at least one of the following coding concepts is disabled for coding the yet un-refreshed sub-area 101u of the GDR picture 103:

• Temporal Motion Vector Prediction (TMVP)

• Advanced Temporal Motion Vector Prediction (ATMVP)

• TMVP-influenced candidates, e.g. motion vector candidates in the merge list which are influenced by TMVP or sub-block TMVP

• Decoder Side Motion Vector Refinement (DMVR)

and wherein DMVR is disabled for coding the yet un-refreshed sub-areas 101u of the one or more subsequent pictures 1012, .... 101 n contained in the refresh period RP.
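The tool restrictions listed above can be sketched as a simple filter that an encoder might apply when selecting coding tools for a block. The tool name strings and the function `allowed_tools` are hypothetical. The sketch also assumes, for simplicity, that all four listed tools are disabled together, whereas the embodiments only require at least one of them.

```python
# Hypothetical tool identifiers; assumed to be disabled together in this sketch.
RESTRICTED_TOOLS = frozenset({"TMVP", "ATMVP", "TMVP_INFLUENCED_MERGE", "DMVR"})


def allowed_tools(all_tools, in_unrefreshed_sub_area, restricted=RESTRICTED_TOOLS):
    """Return the coding tools an encoder may use for a block.

    Blocks in a yet un-refreshed sub-area must not use the restricted tools;
    blocks elsewhere are not limited by this particular constraint.
    """
    tools = set(all_tools)
    if in_unrefreshed_sub_area:
        return tools - set(restricted)
    return tools


tools = {"TMVP", "DMVR", "INTRA", "MERGE"}
print(sorted(allowed_tools(tools, in_unrefreshed_sub_area=True)))
print(sorted(allowed_tools(tools, in_unrefreshed_sub_area=False)))
```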

In another embodiment, the coded layer with GDR is coded independent of other layers and it is expressed in the bitstream that sample substitution can be carried out using an indicated other layer with adequate content for sample substitution. Thus, according to this embodiment, the second layer 602 may be coded independently from the first layer 601 or from any further layers, and wherein, if the second sequence 100 of pictures 1011, 1012, ..., 101 n is randomly accessed at the GDR picture 103, the signalization indicates that the yet un-refreshed sub-area 101 u of the GDR picture 103 of the second layer 602 is to be inter-layer predicted from samples 202r of the first layer 601 or of any predetermined (indicated) further layer with adequate content.

In another embodiment, the above restriction may take the form of a bitstream requirement depending on the identified refreshed and non-refreshed region.

CLAIMS

1. A video data stream comprising:

a sequence (100) of pictures (1011, 1012, .... 101 n) comprising at least one Gradual Decoder Refresh - GDR - coded picture (103) and one or more subsequent pictures (1012, .... 101 n) in a refresh period (RP),

a parameter set defining a plurality of picture configurations, which subdivide a picture area (101) into a first sub-area (101 r) and a second sub-area (101u) among which one corresponds to

a refreshed sub-area ( 101 r) comprising one or more refreshed picture regions (102r) and the other one corresponds to

an un-refreshed sub-area (101u) comprising one or more yet unrefreshed picture regions (102u),

wherein the video data stream comprises for each picture (1011, 1012, .... 101 n) within the refresh period (RP) a picture configuration identifier for identifying a corresponding one picture configuration out of the plurality of picture configurations.

2. The video data stream of claim 1,

wherein the sequence (100) of pictures (1011, 1012, .... 101 n) is coded in a manner so that

intra prediction does not cross a boundary between the first and second sub-areas (101 r, 101 u), and/or

temporal prediction of the refreshed sub-area (101 r) does not reference the yet un-refreshed sub-area (101 u), and/or

context model derivation does not cross a boundary between the first and second sub-areas (101 r, 101 u).

3. The video data stream of claim 1 or 2,

wherein the corresponding one picture configuration indicates the refreshed picture regions (102r) and the yet un-refreshed picture regions (102u) contained in a currently coded picture (1011, 1012, .... 101 n) of the sequence (100) of pictures.

4. The video data stream of any one of claims 1, 2 or 3,

wherein each picture configuration out of the plurality of picture configurations comprises a set of region indices for signaling which picture regions (102) are refreshed picture regions (102r) and which picture regions (102) are un-refreshed picture regions (102u).

5. The video data stream of any one of claims 1 to 4,

wherein the pictures (1011, 1012, ..., 101 n) contained in the sequence (100) of pictures are subdivided into one or more tiles (102), and wherein each picture configuration out of the plurality of picture configurations comprises a set of tile indices for signaling which picture tiles (102) are refreshed picture tiles (102r) and which picture tiles (102) are un-refreshed picture tiles (102u).

6. The video data stream of any one of claims 1 to 5,

wherein the pictures (1011, 1012, .... 101 n) contained in the sequence (100) of pictures are subdivided into picture tile rows (102), and wherein each picture configuration out of the plurality of picture configurations comprises at least one row index for signaling which picture tile rows (102) are refreshed picture tile rows (102r) and/or which picture tile rows (102) are un-refreshed picture tile rows (102u).

7. The video data stream of any one of claims 1 to 6,

wherein the pictures (1011, 1012, ..., 101 n) contained in the sequence (100) of pictures are subdivided into picture tile columns (102), and wherein each picture configuration out of the plurality of picture configurations comprises at least one column index for signaling which picture tile columns (102) are refreshed picture tile columns (102r) and/or which picture tile columns (102) are un-refreshed picture columns (102u).

8. The video data stream of any one of claims 1 to 7,

wherein the pictures (1011, 1012, .... 101 n) contained in the sequence (100) of pictures are subdivided into rows (102) of coding blocks, and wherein each picture configuration out of the plurality of picture configurations comprises at least one row coding block index for signaling which rows (102) of coding blocks are refreshed rows (102r) of coding blocks and/or which rows (102) of coding blocks are un-refreshed rows (102u) of coding blocks.

9. The video data stream of any one of claims 1 to 8,

wherein the pictures (1011, 1012, .... 101 n) contained in the sequence (100) of pictures are subdivided into columns (102) of coding blocks, and wherein each picture configuration out of the plurality of picture configurations comprises at least one column coding block index for signaling which columns (102) of coding blocks are refreshed columns (102r) of coding blocks and/or which columns (102) of coding blocks are unrefreshed columns (102u) of coding blocks.

10. The video data stream of any one of claims 1 to 9,

wherein the pictures (1011, 1012, ..., 101 n) contained in the sequence (100) of pictures are subdivided into diagonals (102) of coding blocks, and wherein each picture configuration out of the plurality of picture configurations comprises at least one diagonal coding block index for signaling which diagonals (102) of coding blocks are refreshed diagonals (102r) of coding blocks and/or which diagonals (102) of coding blocks are un-refreshed diagonals (102u) of coding blocks.

11. The video data stream of any one of claims 1 to 10,

wherein the pictures (1011, 1012, ..., 101 n) contained in the sequence (100) of pictures are subdivided into rows (102) of samples, and wherein each picture configuration out of the plurality of picture configurations comprises at least one sample row index for signaling which rows (102) of samples are refreshed rows (102r) of samples and/or which rows (102) of samples are un-refreshed rows (102u) of samples.

12. The video data stream of any one of claims 1 to 11 ,

wherein the pictures (1011, 1012, 101 n) contained in the sequence (100) of pictures are subdivided into columns (102) of samples, and wherein each picture configuration out of the plurality of picture configurations comprises at least one sample column index for signaling which columns (102) of samples are refreshed columns (102r) of samples and/or which columns (102) of samples are un-refreshed columns (102u) of samples.

13. The video data stream of any one of claims 1 to 12,

wherein the corresponding one picture configuration is signaled in a slice header and/or in an Access Unit Delimiter of the video data stream.
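
The signaling of claims 1 to 13 can be illustrated by a minimal sketch, under stated assumptions: a parameter set lists a plurality of picture configurations once, and each picture in the refresh period only carries an identifier selecting one of them. The class names, the field first_unrefreshed_column and the left-to-right refresh direction are hypothetical and chosen for illustration only; they are not syntax elements of the claimed data stream.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PictureConfiguration:
    # Hypothetical column index: tile columns with a smaller index are
    # treated as refreshed, all others as yet un-refreshed.
    first_unrefreshed_column: int


@dataclass(frozen=True)
class ParameterSet:
    num_tile_columns: int
    configurations: List[PictureConfiguration]


def refreshed_columns(ps: ParameterSet, config_id: int) -> List[bool]:
    """Return one flag per tile column; True marks a refreshed column."""
    cfg = ps.configurations[config_id]
    return [col < cfg.first_unrefreshed_column
            for col in range(ps.num_tile_columns)]


# Assumed refresh period of four pictures; picture i carries identifier i.
ps = ParameterSet(
    num_tile_columns=4,
    configurations=[PictureConfiguration(k) for k in range(1, 5)],
)
masks = [refreshed_columns(ps, config_id) for config_id in range(4)]
```

In this sketch the per-picture cost is a single identifier, while the geometry of the refreshed and un-refreshed sub-areas is carried only once in the parameter set.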

14. A decoder for decoding from a data stream at least one picture out of a sequence (100) of pictures (1011, 1012, ..., 101 n) comprising at least one Gradual Decoder Refresh - GDR - coded picture (103) and one or more subsequent pictures (1012, ..., 101 n) in a refresh period (RP),

wherein the decoder is configured to read from the data stream a parameter set defining a plurality of picture configurations, which subdivide a picture area (101) into a first sub-area (101 r) and a second sub-area (101u) among which one corresponds to

a refreshed sub-area (101 r) comprising one or more refreshed picture regions (102r) and the other one corresponds to

an un-refreshed sub-area (101u) comprising one or more yet un-refreshed picture regions (102u),

wherein the decoder is further configured to read from the data stream, for each picture (1011, 1012, ..., 101 n) within the refresh period (RP), a picture configuration identifier for identifying a corresponding one picture configuration out of the plurality of picture configurations for decoding the at least one picture (1011, 1012, ..., 101 n).

15. An encoder for encoding into a data stream at least one picture out of a sequence (100) of pictures (1011, 1012, .... 101 n) comprising at least one Gradual Decoder Refresh - GDR - coded picture (103) and one or more subsequent pictures (1012, .... 101 n) in a refresh period (RP),

wherein the encoder is configured to write into the data stream a parameter set defining a plurality of picture configurations, which subdivide a picture area (101) into a first sub-area ( 101 r) and a second sub-area (101u) among which one corresponds to a refreshed sub-area (101 r) comprising one or more refreshed picture regions (102r) and the other one corresponds to

an un-refreshed sub-area (101u) comprising one or more yet un-refreshed picture regions (102u),

wherein the encoder is further configured to set in the data stream, for each picture (1011 , 1012, 101 n) within the refresh period (RP), a picture configuration identifier for identifying a corresponding one picture configuration out of the plurality of picture configurations.

16. A method for decoding from a data stream at least one picture out of a sequence (100) of pictures (1011, 1012, ..., 101 n) comprising at least one Gradual Decoder Refresh - GDR - coded picture (103) and one or more subsequent pictures (1012, ..., 101 n) in a refresh period (RP), the method comprising steps of:

reading from the data stream a parameter set defining a plurality of picture configurations, which subdivide a picture area (101) into a first sub-area (101 r) and a second sub-area (101u) among which one corresponds to

a refreshed sub-area (101 r) comprising one or more refreshed picture regions (102r) and the other one corresponds to

an un-refreshed sub-area (101u) comprising one or more yet un-refreshed picture regions (102u),

and reading from the data stream, for each picture (1011, 1012, ..., 101 n) within the refresh period (RP), a picture configuration identifier for identifying a corresponding one picture configuration out of the plurality of picture configurations for decoding the at least one picture (1011, 1012, ..., 101 n).

17. A method for encoding into a data stream at least one picture out of a sequence (100) of pictures (1011, 1012, .... 101 n) comprising at least one Gradual Decoder Refresh - GDR - coded picture (103) and one or more subsequent pictures (1012, .... 101 n) in a refresh period (RP), the method comprising steps of:

writing into the data stream a parameter set defining a plurality of picture configurations, which subdivide a picture area (101) into a first sub-area (101 r) and a second sub-area (101u) among which one corresponds to

a refreshed sub-area (101 r) comprising one or more refreshed picture regions (102r) and the other one corresponds to

an un-refreshed sub-area (101u) comprising one or more yet unrefreshed picture regions (102u),

and setting in the data stream, for each picture (1011, 1012, 101n) within the refresh period (RP), a picture configuration identifier for identifying a corresponding one picture configuration out of the plurality of picture configurations.

18. A computer program for implementing the method of claims 16 or 17 when being executed on a computer or signal processor.

19. A video data stream comprising:

a sequence (100) of pictures (1011, 1012, ..., 101 n) comprising at least one Gradual Decoder Refresh - GDR - coded picture (103) and one or more subsequent pictures (1012, ..., 101 n) in a refresh period (RP),

wherein each picture of the sequence (100) of pictures (1011, 1012, 101 n) is sequentially coded into the video data stream in units of blocks (102) into which the respective picture is subdivided,

wherein the video data stream comprises an implicit signaling, wherein a refreshed sub-area (101 r) of a respective picture (1011, 1012, ..., 101 n) is implicitly signaled in the video data stream based on a block coding order, or

wherein the video data stream comprises, for each block (102), a syntax element indicating whether

e) the block (102) is a last block located in a first sub-area ( 101 r) of a respective picture and lastly coded, and/or

f) the block (102) is a first block located in a first sub-area (101 r) of a respective picture and firstly coded, and/or

g) the block (102) adjoins a border confining a first sub-area (101 r), and/or

h) the block (102) is located inside a first sub-area (101 r).

20. The video data stream of claim 19,

wherein a picture area (101) is subdivided into a first sub-area ( 101 r) and a second sub-area (101u) among which one corresponds to

a refreshed sub-area ( 101 r) comprising one or more refreshed picture regions (102r) and the other one corresponds to

an un-refreshed sub-area (101u) comprising one or more yet un-refreshed picture regions (102u).

21. The video data stream of claim 19 or 20,

wherein the implicit signaling signals that a first block (102) at a predetermined position in the block coding order is part of the refreshed sub-area (101 r).

22. The video data stream of any one of claims 19 to 21 ,

wherein the syntax element is for indicating a boundary between a refreshed sub-area (101 r) and a yet un-refreshed sub-area (101u) of a picture out of the sequence of pictures (1011, 1012, ..., 101 n) and/or for indicating which sub-area is a refreshed sub-area (101 r) and which sub-area is a yet un-refreshed sub-area (101u).

23. The video data stream of any one of claims 19 to 22, wherein

in case that the syntax element indicates that

b) the block (102) is a last block located in a refreshed sub-area (101 r) of a respective picture (101) and lastly coded,

the video data stream comprises a further syntax element (e.g. region_horizontal_flag) for indicating whether

a1) the block (102) is a lastly coded block of one or more rows of blocks (102) of the refreshed sub-area (101 r), or

a2) the block (102) is a lastly coded block of one or more columns of blocks (102) of the refreshed sub-area (101 r).

24. The video data stream of claim 23,

wherein the further syntax element is for indicating whether the last block (102) derives from a horizontal split or from a vertical split of a coding split tree according to which the respective picture (101) is subdivided into blocks (102).

25. The video data stream of any one of claims 19 to 24,

wherein the video data stream comprises an intra-prediction break indication for indicating that neighboring blocks (102) of a neighboring picture region (102u) are not available for prediction, e.g. if said neighboring picture region (102u) is contained in a yet un-refreshed sub-area (101u).

26. The video data stream of any one of claims 19 to 25,

wherein the refreshed sub-area ( 101 r) comprises one or more refreshed picture regions (102r) which are arranged in a grid that is aligned with the size of the blocks (102) into which the respective picture (101) is subdivided.

27. The video data stream of any one of claims 19 to 26,

wherein a picture region of the respective picture (101) is vertically subdivided into one or more slices (102), wherein the syntax element is signaled for each slice.

28. The video data stream of any one of claims 19 to 27,

wherein a picture region of the respective picture (101) is horizontally subdivided into one or more rows of blocks (102), wherein the syntax element is signaled

iii. only in the first row, or

iv. in every row.

29. The video data stream of any one of claims 19 to 28,

wherein a picture region of the respective picture (101) is vertically subdivided into one or more columns of blocks (102), wherein the syntax element is signaled

i. only in the first column, or

ii. in every column.
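
The block-wise signaling of claims 19 to 29 can be pictured with a minimal sketch, under stated assumptions: blocks are visited in coding order, the first sub-area is taken to be the refreshed sub-area and is coded first, and one flag per block indicates whether that block is the last, lastly coded block of the first sub-area. The function name and the treatment of a picture without any set flag (all blocks counted as refreshed) are hypothetical choices made for illustration.

```python
from typing import List


def derive_refreshed_mask(last_block_flags: List[bool]) -> List[bool]:
    """Map per-block 'last block of the first sub-area' flags, given in
    block coding order, to a per-block refreshed / un-refreshed mask."""
    mask = []
    inside_refreshed = True  # assumption: the refreshed sub-area is coded first
    for is_last in last_block_flags:
        mask.append(inside_refreshed)
        if is_last:
            # Every block coded after the flagged block is yet un-refreshed.
            inside_refreshed = False
    return mask


# Five blocks in coding order; the third block closes the refreshed sub-area.
flags = [False, False, True, False, False]
mask = derive_refreshed_mask(flags)
```

A decoder following this sketch needs no separate geometry description, since the boundary between the two sub-areas falls out of the coding order and the position of the flagged block.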

30. A decoder for decoding from a data stream at least one picture out of a sequence (100) of pictures (1011 , 1012, .... 101 n) comprising at least one Gradual Decoder Refresh - GDR - coded picture (103) and one or more subsequent pictures in a refresh period (RP),

wherein each picture of the sequence (100) of pictures (1011, 1012, ..., 101 n) is sequentially decoded from the video data stream in units of blocks (102) into which the respective picture is subdivided,

wherein the decoder is configured to implicitly derive from the data stream a refreshed sub-area ( 101 r) of the at least one picture based on a block coding order, or wherein the decoder is configured to read from the data stream, for each block (102), a syntax element indicating whether

a) the block (102) is a last block located in a first sub-area ( 101 r) of a respective picture and lastly coded, and/or

b) the block (102) is a first block located in a first sub-area ( 101 r) of a respective picture and firstly coded, and/or

c) the block (102) adjoins a border confining a first sub-area (101 r), and/or

d) the block (102) is located inside a first sub-area (101 r).

31. The decoder of claim 30,

wherein a picture area (101) is subdivided into a first sub-area ( 101 r) and a second sub-area (101u) among which one corresponds to

a refreshed sub-area ( 101 r) comprising one or more refreshed picture regions (102r) and the other one corresponds to

an un-refreshed sub-area (101u) comprising one or more yet unrefreshed picture regions (102u).

32. An encoder for encoding into a data stream at least one picture out of a sequence (100) of pictures (1011, 1012, .... 101 n) comprising at least one Gradual Decoder Refresh - GDR - coded picture (103) and one or more subsequent pictures in a refresh period (RP),

wherein each picture of the sequence (100) of pictures (1011, 1012, .... 101 n) is sequentially encoded into the video data stream in units of blocks (102) into which the respective picture is subdivided,

wherein the encoder is configured to write into the data stream, for each block (102), a syntax element indicating whether

a) the block (102) is a last block located in a first sub-area ( 101 r) of a respective picture and lastly coded, and/or

b) the block (102) is a first block located in a first sub-area ( 101 r) of a respective picture and firstly coded, and/or

c) the block (102) adjoins a border confining a first sub-area (101 r), and/or

d) the block (102) is located inside a first sub-area (101 r).

33. The encoder of claim 32,

wherein a picture area (101) is subdivided into a first sub-area ( 101 r) and a second sub-area (101u) among which one corresponds to

a refreshed sub-area ( 101 r) comprising one or more refreshed picture regions (102r) and the other one corresponds to

an un-refreshed sub-area (101u) comprising one or more yet un-refreshed picture regions (102u).

34. A method for decoding from a data stream at least one picture out of a sequence (100) of pictures (1011, 1012, ..., 101 n) comprising at least one Gradual Decoder Refresh - GDR - coded picture (103) and one or more subsequent pictures in a refresh period (RP),

wherein each picture of the sequence (100) of pictures (1011, 1012, .... 101 n) is sequentially decoded from the video data stream in units of blocks (102) into which the respective picture is subdivided, wherein the method comprises steps of:

implicitly deriving from the data stream a refreshed sub-area (101 r) of the at least one picture based on a block coding order, or

reading from the data stream, for each block (102), a syntax element (e.g. a flag) indicating whether

a) the block (102) is a last block located in a first sub-area ( 101 r) of a respective picture and lastly coded, and/or

b) the block (102) is a first block located in a first sub-area ( 101 r) of a respective picture and firstly coded, and/or

c) the block (102) adjoins a border confining a first sub-area (101 r), and/or

d) the block (102) is located inside a first sub-area (101 r).

35. A method for encoding into a data stream at least one picture out of a sequence (100) of pictures (1011, 1012, 101 n) comprising at least one Gradual Decoder Refresh - GDR - coded picture (103) and one or more subsequent pictures in a refresh period (RP),

wherein each picture of the sequence (100) of pictures (1011, 1012, 101n) is sequentially encoded into the video data stream in units of blocks (102) into which the respective picture is subdivided, wherein the method comprises steps of:

writing into the data stream, for each block (102), a syntax element indicating whether

a) the block (102) is a last block located in a first sub-area ( 101 r) of a respective picture and lastly coded, and/or

b) the block (102) is a first block located in a first sub-area ( 101 r) of a respective picture and firstly coded, and/or

c) the block (102) adjoins a border confining a first sub-area (101 r), and/or

d) the block (102) is located inside a first sub-area (101 r).

36. A computer program for implementing the method of claims 34 or 35 when being executed on a computer or signal processor.

37. A video data stream comprising:

at least one picture (101) being subdivided into tiles (102), and a tile-reordering flag, wherein

a) if the tile-reordering flag (e.g. sps_enforce_raster_scan_flag) in the data stream has a first state, it is signaled that tiles (102) of the picture (101) are to be coded using a first coding order which traverses the picture (101) tile by tile, and/or

b) if the tile-reordering flag in the data stream has a second state, it is signaled that tiles (102) of the picture (101) are to be coded using a second coding order which traverses the picture (101) along a raster scan order.

38. A decoder configured to decode a picture from a data stream, wherein:

c) if a tile-reordering flag (e.g. sps_enforce_raster_scan_flag) in the data stream has a first state, the decoder is configured to decode tiles (102) of the picture (101) from the data stream using a first decoding order which traverses the picture (101) tile by tile, and/or

d) if the tile-reordering flag in the data stream has a second state, the decoder is configured to decode the tiles (102) of the picture (101) from the data stream using a second decoding order which traverses the picture (101) along a raster scan order.

39. The decoder of claim 38,

wherein the decoder is configured to decode the picture (101) using a Gradual Decoder Refresh - GDR - approach, wherein the picture (101) is part of a sequence (100) of pictures (1011, 1012, ..., 101 n) which comprises at least one GDR coded picture (103) and one or more subsequent pictures,

wherein the picture (101) is block-wise decoded and partitioned into multiple packets (501a, 501b, ..., 501 n), wherein two or more packets comprise the same amount of blocks (102r) that are refreshed and/or the same amount of blocks (102u) that are yet un-refreshed.

40. An encoder configured to encode a picture into a data stream, wherein:

c) the encoder is configured to set a tile-reordering flag (e.g. sps_enforce_raster_scan_flag) in the data stream into a first state, indicating that tiles (102) of the picture (101) are to be coded using a first coding order which traverses the picture (101) tile by tile, and/or

d) the encoder is configured to set the tile-reordering flag in the data stream into a second state, indicating that tiles (102) of the picture (101) are to be coded using a second coding order which traverses the picture (101) along a raster scan order.
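
The two coding orders distinguished by the tile-reordering flag of claims 37 to 40 can be compared in a minimal sketch, under stated assumptions: the picture is a grid of equally sized blocks, tiles are uniform rectangles, and a boolean parameter stands in for the flag. Which flag state selects which order, as well as the function and parameter names, are hypothetical here.

```python
from typing import List, Tuple


def block_coding_order(width: int, height: int, tile_w: int, tile_h: int,
                       raster_scan: bool) -> List[Tuple[int, int]]:
    """Return the (x, y) block positions of a picture in coding order."""
    if raster_scan:
        # Second coding order: picture-wide raster scan that ignores tiles.
        return [(x, y) for y in range(height) for x in range(width)]
    order = []
    # First coding order: tile by tile, raster scan inside each tile.
    for ty in range(0, height, tile_h):
        for tx in range(0, width, tile_w):
            for y in range(ty, min(ty + tile_h, height)):
                for x in range(tx, min(tx + tile_w, width)):
                    order.append((x, y))
    return order


# A picture of 4 x 2 blocks split into two tiles of 2 x 2 blocks each.
tile_order = block_coding_order(4, 2, 2, 2, raster_scan=False)
raster_order = block_coding_order(4, 2, 2, 2, raster_scan=True)
```

With column-shaped refreshed regions, the raster scan order interleaves refreshed and un-refreshed blocks row by row, which is consistent with claim 39, where packets of the picture may each carry the same amount of refreshed blocks.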

41. A multi layered scalable video data stream (600) comprising:

a first sequence (200) of pictures (2011, 2012, ..., 201 n) in a first layer (601) and a second sequence (100) of pictures (1011, 1012, ..., 101 n) in a second layer (602), wherein the second sequence (100) of pictures (1011, 1012, ..., 101 n) in the second layer (602) comprises at least one Gradual Decoder Refresh - GDR - picture (103) as a start picture and one or more subsequent pictures (1012, ..., 101 n) in a refresh period (RP),

wherein the multi layered scalable video data stream (600) comprises a signalization carrying information about a possibility that a yet un-refreshed sub-area (101u) of the GDR picture (103) of the second layer (602) is to be inter-layer predicted from samples (202r) of the first layer (601), and information:

that in yet un-refreshed sub-areas (101u) of the one or more subsequent pictures (1012, ..., 101 n) contained in the refresh period (RP), motion vector prediction is disabled or motion vector prediction is realized non-temporally, or

that in a yet un-refreshed sub-area (101u) of the GDR picture (103) motion vector prediction is disabled or motion vector prediction is realized non-temporally.

42. The multi layered scalable video data stream (600) of claim 41 ,

wherein at least one of the following coding concepts is disabled for coding the yet un-refreshed sub-areas (101u) of the one or more subsequent pictures (1012, ..., 101 n) contained in the refresh period (RP):

• Temporal Motion Vector Prediction (TMVP)

• Advanced Temporal Motion Vector Prediction (ATMVP)

• TMVP-influenced candidates, e.g. motion vector candidates in the merge list which are influenced by TMVP or sub-block TMVP

• Decoder Side Motion Vector Refinement (DMVR)

43. The multi layered scalable video data stream (600) of claim 40 or 41

wherein at least one of the following coding concepts is disabled for coding the yet un-refreshed sub-area (101u) of the GDR picture (103):

• Temporal Motion Vector Prediction (TMVP)

• Advanced Temporal Motion Vector Prediction (ATMVP)

• TMVP-influenced candidates, e.g. motion vector candidates in the merge list which are influenced by TMVP or sub-block TMVP

• Decoder Side Motion Vector Refinement (DMVR)

and wherein DMVR is disabled for coding the yet un-refreshed sub-areas (101 u) of the one or more subsequent pictures (1012, ... , 101 n) contained in the refresh period (RP).

44. The multi layered scalable video data stream (600) of any one of claims 41 to 43, wherein said inter-layer prediction from samples (202r) of the first layer (601) comprises substituting one or more samples (102u) of the yet un-refreshed sub-area (101 u) of the GDR picture (103) by an upsampled version of samples (202r) of the first layer (601).

45. The multi layered scalable video data stream (600) of any one of claims 41 to 44, wherein all samples (102u) of the entire yet un-refreshed sub-area (101u) of the GDR picture (103) are substituted by an upsampled version of samples (202r) of the first layer (601) so that pictures (1011 , 1012, .... 101n) from the second sequence (100) of coded pictures in the second layer (602) are instantly presentable to a user.

46. The multi layered scalable video data stream (600) of any one of claims 41 to 45, wherein yet un-refreshed sub-areas (101u) of the one or more subsequent pictures (1012, ..., 101 n) of the second layer (100) are refreshed by intra-layer prediction using the upsampled substitute samples (202r) from the first layer (601) which are gradually updated to refreshed samples (102r) of the second layer (602).

47. The multi layered scalable video data stream (600) of any one of claims 41 to 46, wherein the second layer (602) is coded independently from the first layer (601) or from any further layers, and wherein, if the second sequence (100) of pictures (1011, 1012, ..., 101 n) is randomly accessed at the GDR picture (103), the signalization indicates that the yet un-refreshed sub-area (101u) of the GDR picture (103) of the second layer (602) is to be inter-layer predicted from samples (202r) of the first layer (601) or of any predetermined further layer with adequate content.
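
The substitution described in claims 44 and 45 can be illustrated by a minimal sketch, under stated assumptions: pictures are small 2-D lists of sample values, the first layer has half the resolution of the second layer, and upsampling is done by nearest-neighbour repetition. The nearest-neighbour filter and all function names are hypothetical simplifications; an actual codec would typically use a dedicated resampling filter.

```python
from typing import List

Picture = List[List[int]]


def upsample_nearest(base: Picture, factor: int) -> Picture:
    """Upsample a first-layer picture by repeating each sample (assumed filter)."""
    return [[base[y // factor][x // factor]
             for x in range(len(base[0]) * factor)]
            for y in range(len(base) * factor)]


def substitute_unrefreshed(enh: Picture, refreshed: List[List[bool]],
                           base: Picture, factor: int) -> Picture:
    """Keep refreshed second-layer samples and replace every yet un-refreshed
    sample by the co-located upsampled first-layer sample."""
    up = upsample_nearest(base, factor)
    return [[enh[y][x] if refreshed[y][x] else up[y][x]
             for x in range(len(enh[0]))]
            for y in range(len(enh))]


# GDR picture of 4 x 2 samples; only the two left columns are refreshed.
base = [[1, 2]]
enh = [[9, 9, 0, 0], [9, 9, 0, 0]]  # 0 marks samples not yet decodable
refreshed = [[True, True, False, False], [True, True, False, False]]
shown = substitute_unrefreshed(enh, refreshed, base, factor=2)
```

Because every un-refreshed sample receives a substitute, the resulting picture has no undefined samples, which corresponds to the instant presentability stated in claim 45.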

48. A decoder for decoding at least one picture from a multi layered scalable video data stream (600) comprising a first sequence (200) of pictures (2011, 2012, ..., 201 n) in a first layer (601) and a second sequence (100) of pictures (1011, 1012, ..., 101 n) in a second layer (602),

wherein the second sequence (100) of pictures (1011, 1012, .... 101 n) in the second layer (602) comprises at least one Gradual Decoder Refresh - GDR - picture (103) as a start picture and one or more subsequent pictures (1012, .... 101 n) in a refresh period (RP),

wherein the decoder is configured to read from the multi layered scalable video data stream (600) a signalization carrying information about a possibility that a yet unrefreshed sub-area (101u) of the GDR picture (103) of the second layer (602) is to be inter-layer predicted from samples (202r) of the first layer (601), and

wherein the decoder is further configured to, responsive to the signalization:

disable motion vector prediction or to realize motion vector prediction non-temporally in yet un-refreshed sub-areas (101u) of the one or more subsequent pictures (1012, .... 101 n) contained in the refresh period (RP), or to disable motion vector prediction or to realize motion vector prediction non-temporally in a yet un-refreshed sub-area (101 u) of the GDR picture (103).

49. An encoder for encoding at least one picture into a multi layered scalable video data stream (600) comprising a first sequence (200) of pictures (2011, 2012, ..., 201 n) in a first layer (601) and a second sequence of pictures (1011, 1012, ..., 101n) in a second layer (602),

wherein the second sequence (100) of pictures (1011 , 1012, ..., 101 n) in the second layer (602) comprises at least one Gradual Decoder Refresh - GDR - picture (103) as a start picture and one or more subsequent pictures (1012, .... 101 n) in a refresh period (RP),

wherein the encoder is configured to write into the multi layered scalable video data stream (600) a signalization carrying information about a possibility that a yet un-refreshed sub-area (101u) of the GDR picture (103) of the second layer (602) is to be inter-layer predicted from samples (202r) of the first layer (601), and information:

that in yet un-refreshed sub-areas (101u) of the one or more subsequent pictures (1012, ..., 101 n) contained in the refresh period (RP), motion vector prediction is disabled or motion vector prediction is realized non-temporally, or

that in a yet un-refreshed sub-area (101u) of the GDR picture (103) motion vector prediction is disabled or motion vector prediction is realized non-temporally.

50. A method for decoding at least one picture from a multi layered scalable video data stream (600) comprising a first sequence (200) of pictures (2011, 2012, ..., 201 n) in a first layer (601) and a second sequence (100) of pictures (1011, 1012, ..., 101 n) in a second layer (602),

reading from the multi layered scalable video data stream (600) a signalization carrying information about a possibility that a yet un-refreshed sub-area (101u) of the GDR picture (103) of the second layer (602) is to be inter-layer predicted from samples (202r) of the first layer (601), and

executing, responsive to the signalization, at least one of the following actions:

disable motion vector prediction or realize motion vector prediction non-temporally in yet un-refreshed sub-areas (101u) of the one or more subsequent pictures (1012, ..., 101 n) contained in the refresh period (RP), or

disable motion vector prediction or realize motion vector prediction non-temporally in a yet un-refreshed sub-area (101u) of the GDR picture (103).

51. A method for encoding at least one picture into a multi layered scalable video data stream (600) comprising a first sequence (200) of pictures (2011, 2012, ..., 201 n) in a first layer (601) and a second sequence (100) of pictures (1011, 1012, .... 101 n) in a second layer (602),

wherein the second sequence (100) of pictures (1011, 1012, ..., 101 n) in the second layer (602) comprises at least one Gradual Decoder Refresh - GDR - picture (103) as a start picture and one or more subsequent pictures (1012, ..., 101 n) in a refresh period (RP), wherein the method comprises steps of:

writing into the multi layered scalable video data stream (600) a signalization carrying information about a possibility that a yet un-refreshed sub-area (101u) of the GDR picture (103) of the second layer (602) is to be inter-layer predicted from samples (202r) of the first layer (601), and information:

that in yet un-refreshed sub-areas (101u) of the one or more subsequent pictures (1012, ..., 101 n) contained in the refresh period (RP), motion vector prediction is disabled or motion vector prediction is realized non-temporally, or

that in a yet un-refreshed sub-area (101u) of the GDR picture (103) motion vector prediction is disabled or motion vector prediction is realized non-temporally.

52. A computer program for implementing the method of claims 50 or 51 when being executed on a computer or signal processor.

53. A multi layered scalable video data stream (700) comprising:

a first sequence (200) of pictures (2011, 2012, .... 201n) in a first layer (701) and a second sequence (100) of pictures (1011, 1012, ..., 101 n) in a second layer (702), each of the first and second layers (701, 702) comprising a plurality of temporal sublayers (701a, 701b; 702a, 702b),

wherein the scalable video data stream (700) comprises a signalization indicating which temporal sublayers (702a, 702b) of the second layer (e.g. enhancement layer) (702) are coded by inter-layer prediction.

54. The multi layered scalable video data stream (700) of claim 53,

wherein the signalization comprises a predetermined temporal identifier from which the temporal sublayers (702a, 702b) of the second layer (702) are coded without inter-layer prediction.

55. The multi layered scalable video data stream (700) of claim 54,

wherein those temporal sublayers (702b) of the second layer (702) which comprise a temporal identifier having a value above the predetermined temporal identifier are coded without inter-layer prediction.

56. The multi layered scalable video data stream of claim 54 or 55,

wherein those temporal sublayers (702a) of the second layer (702) which comprise a temporal identifier having a value up to or below the predetermined temporal identifier are coded with inter-layer prediction.

57. The multi layered scalable video data stream (700) of any one of claims 53 to 56, wherein, if the signalization indicates that a temporal sublayer (702b) of the second layer (702) does not use a temporal sublayer (701b) of a lower layer (701) for inter-layer prediction, this temporal sublayer (701b) of the lower layer (701) is discardable from the multi layered scalable video data stream (700).

58. The multi layered scalable video data stream (700) of any one of claims 53 to 57, further comprising a syntax element for indicating whether the signalization is included in the multi layered scalable video data stream (700) or whether per-default all temporal sublayers (702a, 702b) of the second layer (702) depend on a lower layer (701).
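
The sublayer signalization of claims 53 to 58 can be illustrated by a minimal sketch, under stated assumptions: the signalization is reduced to a single predetermined temporal identifier, a missing signalization is modelled as None, and a temporal sublayer of the second layer is taken to reference only the lower-layer sublayer with the same temporal identifier. The function and parameter names are hypothetical.

```python
from typing import List, Optional


def uses_inter_layer_prediction(temporal_id: int,
                                max_tid_with_ilp: Optional[int]) -> bool:
    """Decide whether a second-layer temporal sublayer uses inter-layer prediction."""
    if max_tid_with_ilp is None:
        # Signalization absent: by default all sublayers depend on the lower layer.
        return True
    # Sublayers up to the predetermined identifier use inter-layer prediction,
    # sublayers above it are coded without inter-layer prediction.
    return temporal_id <= max_tid_with_ilp


def discardable_lower_sublayers(num_sublayers: int,
                                max_tid_with_ilp: Optional[int]) -> List[int]:
    """List temporal identifiers of lower-layer sublayers that the second
    layer does not reference (assumed same-identifier referencing)."""
    return [tid for tid in range(num_sublayers)
            if not uses_inter_layer_prediction(tid, max_tid_with_ilp)]


# Four temporal sublayers; inter-layer prediction up to temporal identifier 1.
droppable = discardable_lower_sublayers(4, max_tid_with_ilp=1)
```

In this example the lower-layer sublayers with temporal identifiers 2 and 3 are not referenced by the second layer and could therefore be dropped when only the second layer is of interest, in the sense of claim 57.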

59. A decoder for decoding at least one picture from a multi layered scalable video data stream (700) comprising a first sequence (200) of pictures (2011, 2012, ..., 201 n) in a first layer (701) and a second sequence (100) of pictures (1011, 1012, ..., 101 n) in a second layer (702), each of the first and second layers (701 , 702) comprising a plurality of temporal sublayers (701a, 701b; 702a, 702b),

wherein the decoder is configured to decode one or more of the temporal sublayers (701a, 701b; 702a, 702b) by using inter-layer prediction based on a signalization derived from the scalable video data stream (700), said signalization (e.g. vps_sub_layer_independent_flag[i][j]) indicating which temporal sublayers (702a, 702b) of the second layer (702) (e.g. enhancement layer) are to be coded by inter-layer prediction.

60. The decoder of claim 59,

wherein the signalization comprises a predetermined temporal identifier from which the temporal sublayers (702a, 702b) of the second layer (702) are coded without inter-layer prediction.

61. The decoder of claim 60,

62. The decoder of claim 60 or 61 ,

63. The decoder of any one of claims 60 to 62,

wherein, if the decoder derives from the signalization that a temporal sublayer (702b) of the second layer (702) does not use a temporal sublayer (701b) of a lower layer (701) for inter-layer prediction, the decoder is configured to discard this temporal sublayer (701b) of the lower layer (701) from decoding.

64. An encoder for encoding at least one picture into a multi layered scalable video data stream (700) comprising a first sequence (200) of pictures (2011, 2012, ..., 201 n) in a first layer (701) and a second sequence (100) of pictures (1011, 1012, ..., 101 n) in a second layer (702), each of the first and second layers (701, 702) comprising a plurality of temporal sublayers (701a, 701b; 702a, 702b),

wherein the encoder is configured to encode one or more of the temporal sublayers (701a, 701b; 702a, 702b) by using inter-layer prediction and to write a signalization into the scalable video data stream (700), said signalization indicating which temporal sublayers (702a, 702b) of the second layer (702) are coded by inter-layer prediction.

65. The encoder of claim 64,

wherein the signalization comprises a predetermined temporal identifier from which the temporal sublayers (702a, 702b) of the second layer (702) are coded without inter-layer prediction.

66. The encoder of claim 65,

67. The encoder of claim 65 or 66

68. The encoder of any one of claims 64 to 67,

wherein, if the encoder determines that a temporal sub-layer (702b) of the second layer (702) does not use a temporal sublayer (701b) of a lower layer (701) for inter-layer prediction, the encoder is configured to discard this temporal sublayer (701 b) of the lower layer (701) from encoding.

69. The encoder of any one of claims 64 to 68,

wherein the encoder is configured to encode a first predetermined row of consecutive pictures of the second sequence of pictures as being dependent, or to encode a second predetermined row of consecutive pictures of the second sequence of pictures as using inter-layer dependency.

70. A method for decoding at least one picture from a multi-layered scalable video data stream (700) comprising a first sequence (200) of pictures (201₁, 201₂, ..., 201ₙ) in a first layer (701) and a second sequence (100) of pictures (101₁, 101₂, ..., 101ₙ) in a second layer (702), each of the first and second layers (701, 702) comprising a plurality of temporal sublayers (701a, 701b; 702a, 702b), wherein the method comprises steps of:

decoding one or more of the temporal sublayers (702a, 702b) by using inter-layer prediction based on a signalization derived from the scalable video data stream (700), said signalization indicating which temporal sublayers (702a, 702b) of the second layer (702) are to be coded by inter-layer prediction.

71. The method for decoding of claim 70,

wherein the signalization comprises a predetermined temporal identifier from which the temporal sublayers (702a, 702b) of the second layer (702) are coded without inter-layer prediction.

72. The method for decoding of claim 71,

73. The method for decoding of claim 71 or 72,

74. A method for encoding at least one picture into a multi-layered scalable video data stream (700) comprising a first sequence (200) of pictures (201₁, 201₂, ..., 201ₙ) in a first layer (701) and a second sequence (100) of pictures (101₁, 101₂, ..., 101ₙ) in a second layer (702), each of the first and second layers (701, 702) comprising a plurality of temporal sublayers (701a, 701b; 702a, 702b), wherein the method comprises steps of:

encoding one or more of the temporal sublayers (702a, 702b) by using inter-layer prediction and writing a signalization into the scalable video data stream (700), said signalization indicating which temporal sublayers (702a, 702b) of the second layer (702) are coded by inter-layer prediction.

75. The method for encoding of claim 74,

wherein the signalization comprises a predetermined temporal identifier from which the temporal sublayers (702a, 702b) of the second layer (702) are coded without inter-layer prediction.

76. The method for encoding of claim 75,

77. The method for encoding of claim 75 or 76,

78. A computer program for implementing the method of claims 70 to 73 or 74 to 77 when being executed on a computer or signal processor.

Documents

Application Documents

#	Name	Date
1	202217016774-TRANSLATIOIN OF PRIOIRTY DOCUMENTS ETC. [24-03-2022(online)].pdf	2022-03-24
2	202217016774-STATEMENT OF UNDERTAKING (FORM 3) [24-03-2022(online)].pdf	2022-03-24
3	202217016774-POWER OF AUTHORITY [24-03-2022(online)].pdf	2022-03-24
4	202217016774-NOTIFICATION OF INT. APPLN. NO. & FILING DATE (PCT-RO-105-PCT Pamphlet) [24-03-2022(online)].pdf	2022-03-24
5	202217016774-FORM 1 [24-03-2022(online)].pdf	2022-03-24
6	202217016774-DRAWINGS [24-03-2022(online)].pdf	2022-03-24
7	202217016774-DECLARATION OF INVENTORSHIP (FORM 5) [24-03-2022(online)].pdf	2022-03-24
8	202217016774-COMPLETE SPECIFICATION [24-03-2022(online)].pdf	2022-03-24
9	202217016774-CLAIMS UNDER RULE 1 (PROVISIO) OF RULE 20 [24-03-2022(online)].pdf	2022-03-24
10	202217016774.pdf	2022-03-25
11	202217016774-MARKED COPIES OF AMENDEMENTS [30-03-2022(online)].pdf	2022-03-30
12	202217016774-FORM 18 [30-03-2022(online)].pdf	2022-03-30
13	202217016774-FORM 13 [30-03-2022(online)].pdf	2022-03-30
14	202217016774-Annexure [30-03-2022(online)].pdf	2022-03-30
15	202217016774-AMMENDED DOCUMENTS [30-03-2022(online)].pdf	2022-03-30
16	202217016774-Proof of Right [18-05-2022(online)].pdf	2022-05-18
17	202217016774-FER.pdf	2022-08-26
18	202217016774-FORM 3 [27-09-2022(online)].pdf	2022-09-27
19	202217016774-FORM 3 [20-01-2023(online)].pdf	2023-01-20
20	202217016774-OTHERS [22-02-2023(online)].pdf	2023-02-22
21	202217016774-FER_SER_REPLY [22-02-2023(online)].pdf	2023-02-22
22	202217016774-CLAIMS [22-02-2023(online)].pdf	2023-02-22
23	202217016774-Response to office action [16-06-2025(online)].pdf	2025-06-16
24	202217016774-US(14)-HearingNotice-(HearingDate-19-09-2025).pdf	2025-09-09
25	202217016774-US(14)-ExtendedHearingNotice-(HearingDate-09-10-2025)-1100.pdf	2025-09-10
26	202217016774-Correspondence to notify the Controller [01-10-2025(online)].pdf	2025-10-01
27	202217016774-FORM-26 [08-10-2025(online)].pdf	2025-10-08
28	202217016774-Written submissions and relevant documents [22-10-2025(online)].pdf	2025-10-22
29	202217016774-PETITION UNDER RULE 137 [22-10-2025(online)].pdf	2025-10-22
30	202217016774-PatentCertificate30-10-2025.pdf	2025-10-30
31	202217016774-IntimationOfGrant30-10-2025.pdf	2025-10-30

Search Strategy

1	SearchStrategyE_25-08-2022.pdf