
Neural Network Representation Formats

Abstract: Data stream (45) having a representation of a neural network (10) encoded thereinto, the data stream (45) comprising serialization parameter (102) indicating a coding order (104) at which neural network parameters (32), which define neuron interconnections (22, 24) of the neural network (10), are encoded into the data stream (45).


Patent Information

Application #
Filing Date
31 March 2022
Publication Number
28/2022
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
Parent Application

Applicants

FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
Hansastraße 27c 80686 München

Inventors

1. MATLAGE, Stefan
Grazer Damm 115 12157 Berlin
2. HAASE, Paul
c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI Einsteinufer 37 10587 Berlin
3. KIRCHHOFFER, Heiner
c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI Einsteinufer 37 10587 Berlin
4. MÜLLER, Karsten
c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI Einsteinufer 37 10587 Berlin
5. SAMEK, Wojciech
c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI Einsteinufer 37 10587 Berlin
6. WIEDEMANN, Simon
c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI Einsteinufer 37 10587 Berlin
7. MARPE, Detlev
c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI Einsteinufer 37 10587 Berlin
8. SCHIERL, Thomas
c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI Einsteinufer 37 10587 Berlin
9. SÁNCHEZ DE LA FUENTE, Yago
c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI Einsteinufer 37 10587 Berlin
10. SKUPIN, Robert
c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI Einsteinufer 37 10587 Berlin
11. WIEGAND, Thomas
c/o Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI Einsteinufer 37 10587 Berlin

Specification

Neural Network Representation Formats

Description

The present application relates to concepts for Neural Network Representation Formats.

Neural Networks (NN) have led to breakthroughs in many applications nowadays, for example:

• object detection or classification in image/video data

• speech/keyword recognition in audio

• speech synthesis

• optical character recognition

• language translation

• and so on

However, the applicability in certain usage scenarios is still hampered by the sheer amount of data that is needed to represent NNs. In most cases, this data comprises two types of parameters, weights and biases, that describe the connections between neurons. The weights are usually parameters that apply some type of linear transformation to the input values (e.g., a dot product or convolution), or in other words, weight the neuron’s inputs, and the biases are offsets that are added after the linear calculation, or in other words, offset the neuron’s aggregation of inbound weighted messages. More specifically, these weights, biases and further parameters that characterize each connection between two of the potentially very large number of neurons (up to tens of millions) in each layer (up to hundreds) of the NN occupy the major portion of the data associated with a particular NN. Also, these parameters typically consist of sizable floating-point data types. They are usually expressed as large tensors carrying all parameters of each layer. When applications require frequent transmission/updates of the involved NNs, the necessary data rate becomes a serious bottleneck. Therefore, efforts to reduce the coded size of NN representations by means of lossy compression of these tensors are a promising approach.

Typically, the parameter tensors are stored in container formats (ONNX (ONNX = Open Neural Network Exchange), Pytorch, TensorFlow, and the like) that carry all data (such as the above parameter matrices) and further properties (such as dimensions of the parameter tensors, type of layers, operations and so on) that are necessary to fully reconstruct the NN and execute it.

It would be advantageous to have a concept at hand which renders transmission/updates of machine learning predictors or, alternatively speaking, machine learning models such as neural networks more efficient, for instance in terms of conserving inference quality while concurrently reducing the coded size of NN representations, the computational inference complexity, or the complexity of describing or storing the NN representations; which enables more frequent transmission/updates of an NN than currently possible; or which even improves the inference quality for a certain task at hand and/or for a certain local input data statistic. Furthermore, it would be advantageous to provide a neural network representation, a derivation of such a neural network representation, and the usage of such a neural network representation in performing neural-network-based prediction, so that the usage of neural networks becomes more effective than currently.

Thus, it is the object of the present invention to provide a concept for efficient usage of neural networks and/or efficient transmission and/or updates of neural networks. This object is achieved by the subject-matter of the independent claims of the present application.

Further embodiments according to the invention are defined by the subject matter of the dependent claims of the present application.

It is a basic idea underlying a first aspect of the present application that a usage of neural networks (NN) is rendered highly efficient, if a serialization parameter is encoded/decoded into/from a data stream having a representation of the NN encoded thereinto. The serialization parameter indicates a coding order at which NN parameters, which define neuron interconnections of the NN, are encoded into the data stream. The neuron interconnections might represent connections between neurons of different NN layers of the NN. In other words, a NN parameter might define a connection between a first neuron associated with a first layer of the NN and a second neuron associated with a second layer of the NN. A decoder might use the coding order to assign NN parameters serially decoded from the data stream to the neuron interconnections.

In particular, using the serialization parameter turns out to efficiently divide a bitstring into meaningful consecutive subsets of the NN parameters. The serialization parameter might indicate a grouping of the NN parameters allowing an efficient execution of the NN. This might be done dependent on application scenarios for the NN. For different application scenarios, an encoder might traverse the NN parameters using different coding orders. Thus, the NN parameters can be encoded using individual coding orders dependent on the application scenario of the NN and the decoder can reconstruct the NN parameters accordingly while decoding, because of the information provided by the serialization parameter. The NN parameters might represent entries of one or more parameter matrices or tensors, wherein the parameter matrices or tensors might be used for inference procedures. It was found that the one or more parameter matrices or tensors of the NN can be efficiently reconstructed by a decoder based on the decoded NN parameters and the serialization parameter.

Thus, the serialization parameter allows the usage of different application-specific coding orders, allowing a flexible encoding and decoding with an improved efficiency. For instance, encoding parameters along different dimensions may benefit the resulting compression performance, since the entropy coder may be able to better capture dependencies among them. In another example, it may be desirable to group parameters according to certain application-specific criteria, e.g., which part of the input data they relate to or whether they can be jointly executed, so that they can be decoded/inferred in parallel. A further example is to encode the parameters following the General Matrix-Matrix (GEMM) product scan order, which supports efficient memory allocation of the decoded parameters when performing a dot product operation (Andrew Kerr, 2017).
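As a sketch of this first aspect, the following toy code shows how a serialization parameter could select among coding orders when flattening and reconstructing a parameter matrix. The scan-order codes and function names are purely illustrative assumptions, not taken from any standard.

```python
import numpy as np

# Hypothetical scan orders a serialization parameter might select:
# 0 = row-major, 1 = column-major (codes are illustrative only).
SCAN_ORDERS = {0: "C", 1: "F"}

def serialize(params: np.ndarray, scan_order: int) -> np.ndarray:
    """Flatten a parameter tensor along the signalled coding order."""
    return params.flatten(order=SCAN_ORDERS[scan_order])

def deserialize(stream: np.ndarray, shape: tuple, scan_order: int) -> np.ndarray:
    """Reassign serially decoded parameters to their tensor positions
    using the coding order signalled in the data stream."""
    return stream.reshape(shape, order=SCAN_ORDERS[scan_order])

w = np.arange(6).reshape(2, 3)          # toy 2x3 weight matrix
for order in SCAN_ORDERS:
    coded = serialize(w, order)
    # The decoder reconstructs the same tensor whatever order was used.
    assert np.array_equal(deserialize(coded, w.shape, order), w)
```

Which order compresses best depends on where the entropy coder finds dependencies; the serialization parameter merely tells the decoder which traversal the encoder chose.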

A further embodiment is directed to encoder-side chosen permutations of the data, e.g., in order to achieve energy compaction of the NN parameters to be coded, and to subsequently process/serialize/code the resulting permuted data according to the resulting order. The permutation may, thus, sort the parameters so that they increase or decrease steadily along the coding order.
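A minimal sketch of such an encoder-side permutation, assuming magnitude sorting as the energy-compacting criterion (an illustrative choice); the permutation itself would have to be signalled so the decoder can undo it:

```python
import numpy as np

def permute_for_coding(params: np.ndarray):
    """Encoder side: sort parameters by decreasing magnitude (one possible
    energy-compacting order) and return the permutation to be signalled."""
    perm = np.argsort(-np.abs(params), kind="stable")
    return params[perm], perm

def inverse_permute(coded: np.ndarray, perm: np.ndarray) -> np.ndarray:
    """Decoder side: undo the signalled permutation."""
    out = np.empty_like(coded)
    out[perm] = coded
    return out

w = np.array([0.1, -2.0, 0.5, 1.5])
coded, perm = permute_for_coding(w)
assert np.array_equal(inverse_permute(coded, perm), w)
```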

In accordance with a second aspect of the present application, the inventors of the present application realized that a usage of neural networks, NN, is rendered highly efficient, if a numerical computation representation parameter is encoded/decoded into/from a data stream having a representation of the NN encoded thereinto. The numerical computation representation parameter indicates a numerical representation, e.g. among floating point or fixed point representation, and a bit size at which NN parameters of the NN, which are encoded into the data stream, are to be represented when using the NN for inference. An encoder is configured to encode the NN parameters. A decoder is configured to decode the NN parameters and might be configured to use the numerical representation and bit size for representing the NN parameters decoded from the data stream, DS.

This embodiment is based on the idea that it may be advantageous to represent the NN parameters and the activation values, which result from a usage of the NN parameters at an inference using the NN, both with the same numerical representation and bit size. Based on the numerical computation representation parameter it is possible to efficiently compare the indicated numerical representation and bit size for the NN parameters with possible numerical representations and bit sizes for the activation values. This might be especially advantageous in case the numerical computation representation parameter indicates a fixed point representation as numerical representation, since then, if both the NN parameters and the activation values can be represented in the fixed point representation, inference can be performed efficiently due to fixed-point arithmetic.
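The benefit of a shared representation can be sketched as follows: with weights and activations in one common signed fixed-point format, the inner product runs in integer arithmetic and only the result is rescaled. Bit size and number of fractional bits are illustrative choices, not values from any standard.

```python
import numpy as np

def to_fixed_point(x: np.ndarray, frac_bits: int, bit_size: int = 16) -> np.ndarray:
    """Quantize values to a signed fixed-point format with `frac_bits`
    fractional bits, as a numerical computation representation parameter
    might signal for both NN parameters and activations."""
    lo, hi = -(1 << (bit_size - 1)), (1 << (bit_size - 1)) - 1
    return np.clip(np.round(x * (1 << frac_bits)), lo, hi).astype(np.int32)

w = np.array([0.5, 0.25])   # toy weights
a = np.array([1.0, 2.0])    # toy activations
frac = 8
# Integer dot product; the accumulator carries 2*frac fractional bits.
acc = int(np.dot(to_fixed_point(w, frac), to_fixed_point(a, frac)))
result = acc / float(1 << (2 * frac))
assert abs(result - np.dot(w, a)) < 1e-3
```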

In accordance with a third aspect of the present application, the inventors of the present application realized that a usage of neural networks is rendered highly efficient, if a NN layer type parameter is encoded/decoded into/from a data stream having a representation of the NN encoded thereinto. The NN layer type parameter indicates a NN layer type, e.g., convolutional layer type or fully connected layer type, of a predetermined NN layer of the NN. The data stream is structured into one or more individually accessible portions, each individually accessible portion representing a corresponding NN layer of the NN. The predetermined NN layer represents one of the NN layers of the neural network. Optionally, for each of two or more predetermined NN layers of the NN, the NN layer type parameter is encoded/decoded into/from a data stream, wherein the NN layer type parameter can differ between at least some predetermined NN layers.

This embodiment is based on the idea that it may be useful for the data stream to comprise the NN layer type parameter for each NN layer, in order to, for instance, understand the meaning of the dimensions of a parameter tensor/matrix. Moreover, different layers may be treated differently while encoding in order to better capture the dependencies in the data and lead to a higher coding efficiency, e.g., by using different sets or modes of context models, information that may be crucial for the decoder to know prior to decoding.
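As an illustration of why the layer type parameter matters for interpreting tensor dimensions, the following sketch uses hypothetical type codes and dimension conventions (not taken from any standard):

```python
# Hypothetical layer-type codes (values illustrative only).
LAYER_CONV, LAYER_FC = 0, 1

def tensor_dims_meaning(layer_type: int, dims: tuple) -> dict:
    """The NN layer type parameter lets a decoder interpret the
    dimensions of the decoded parameter tensor."""
    if layer_type == LAYER_CONV and len(dims) == 4:
        # Assumed convolutional convention: (out_ch, in_ch, k_h, k_w).
        return dict(zip(("out_ch", "in_ch", "k_h", "k_w"), dims))
    if layer_type == LAYER_FC and len(dims) == 2:
        # Assumed fully connected convention: (out_features, in_features).
        return dict(zip(("out_features", "in_features"), dims))
    raise ValueError("unsupported layer type / dimensionality")

assert tensor_dims_meaning(LAYER_CONV, (64, 3, 5, 5))["k_h"] == 5
assert tensor_dims_meaning(LAYER_FC, (10, 128))["in_features"] == 128
```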

Similarly, it may be advantageous to encode/decode into/from a data stream a type parameter indicating a parameter type of the NN parameters. The type parameter may indicate whether the NN parameters represent weights or biases. The data stream is structured into one or more individually accessible portions, each individually accessible portion representing a corresponding NN layer of the NN. An individually accessible portion representing a corresponding predetermined NN layer might be further structured into individually accessible sub-portions. Each individually accessible sub-portion is completely traversed by a coding order before a subsequent individually accessible sub-portion is traversed by the coding order. Into each individually accessible sub-portion, for example, NN parameters and a type parameter are encoded and can be decoded. NN parameters of a first individually accessible sub-portion may be of a different parameter type or of the same parameter type as NN parameters of a second individually accessible sub-portion. Different types of NN parameters associated with the same NN layer might be encoded/decoded into/from different individually accessible sub-portions associated with the same individually accessible portion. The distinction between the parameter types may be beneficial for encoding/decoding when, for instance, different types of dependencies can be used for each type of parameter, or if parallel decoding is desired. It is, for example, possible to encode/decode different types of NN parameters associated with the same NN layer in parallel. This enables a higher efficiency in encoding/decoding of the NN parameters and may also benefit the resulting compression performance, since the entropy coder may be able to better capture dependencies among the NN parameters.
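A toy sketch of sub-portions carrying a type parameter; the type codes and the container layout are illustrative assumptions, chosen only to show how a decoder can pick out one parameter type independently of the others:

```python
import numpy as np

# Hypothetical type codes for the parameter type signalled per sub-portion.
TYPE_WEIGHTS, TYPE_BIAS = 0, 1

def encode_layer(weights: np.ndarray, bias: np.ndarray) -> list:
    """Place each parameter type in its own individually accessible
    sub-portion, each fully traversed before the next one starts."""
    return [
        {"type": TYPE_WEIGHTS, "payload": weights.flatten().tolist()},
        {"type": TYPE_BIAS, "payload": bias.flatten().tolist()},
    ]

def decode_sub_portion(layer: list, wanted_type: int) -> list:
    """A decoder can pick out one sub-portion by its type parameter,
    e.g., to decode weights and biases in parallel."""
    for sp in layer:
        if sp["type"] == wanted_type:
            return sp["payload"]
    raise KeyError("no sub-portion of the requested type")

layer = encode_layer(np.array([[1.0, 2.0]]), np.array([0.5]))
assert decode_sub_portion(layer, TYPE_BIAS) == [0.5]
```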

In accordance with a fourth aspect of the present application, the inventors of the present application realized that a transmission/updating of neural networks is rendered highly efficient, if a pointer is encoded/decoded into/from a data stream having a representation of the NN encoded thereinto. This is due to the fact that the data stream is structured into individually accessible portions and, for each of one or more predetermined individually accessible portions, a pointer points to a beginning of the respective predetermined individually accessible portion. Not all individually accessible portions need to be predetermined individually accessible portions, but it is possible that all individually accessible portions represent predetermined individually accessible portions. The one or more predetermined individually accessible portions might be set by default or dependent on an application of the NN encoded into the data stream. The pointer indicates, for example, the beginning of the respective predetermined individually accessible portion as a data stream position in bytes or as an offset, e.g., a byte offset with respect to a beginning of the data stream or with respect to a beginning of a portion corresponding to a NN layer, to which portion the respective predetermined individually accessible portion belongs. The pointer might be encoded/decoded into/from a header portion of the data stream. According to an embodiment, for each of the one or more predetermined individually accessible portions, the pointer is encoded/decoded into/from a header portion of the data stream, in case of the respective predetermined individually accessible portion representing a corresponding NN layer of the neural network, or the pointer is encoded/decoded into/from a parameter set portion of a portion corresponding to a NN layer, in case of the respective predetermined individually accessible portion representing a NN portion of a NN layer of the NN.
A NN portion of a NN layer of the NN might represent a baseline section of the respective NN layer or an advanced section of the respective layer. With the pointer it is possible to efficiently access the predetermined individually accessible portions of the data stream, enabling, for example, to parallelize the layer processing or to package the data stream into respective container formats. The pointer allows easier, faster and more adequate access to the predetermined individually accessible portions in order to facilitate applications that require parallel or partial decoding and execution of NNs.
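A minimal sketch of this pointer mechanism, assuming 32-bit byte offsets collected in a header (the layout is an illustrative assumption, not a standardized format):

```python
import struct

def pack_portions(payloads: list) -> bytes:
    """Write a header with byte-offset pointers to the beginning of each
    individually accessible portion, followed by the portions themselves."""
    header_size = 4 * len(payloads)          # one 32-bit pointer per portion
    pointers, offset = [], header_size
    for p in payloads:
        pointers.append(offset)
        offset += len(p)
    header = b"".join(struct.pack(">I", ptr) for ptr in pointers)
    return header + b"".join(payloads)

def read_portion(stream: bytes, index: int, count: int) -> bytes:
    """Random access: jump directly to portion `index` via its pointer,
    without parsing the preceding portions."""
    start = struct.unpack_from(">I", stream, 4 * index)[0]
    end = struct.unpack_from(">I", stream, 4 * (index + 1))[0] \
        if index + 1 < count else len(stream)
    return stream[start:end]

ds = pack_portions([b"layer0", b"layer1", b"layer2"])
assert read_portion(ds, 1, 3) == b"layer1"
```

Because each portion is reachable directly, layers can be handed to parallel decoders or repackaged into container formats without scanning the whole stream.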

In accordance with a fifth aspect of the present application, the inventors of the present application realized that a transmission/updating of neural networks is rendered highly efficient, if a start code, a pointer and/or a data stream length parameter is encoded/decoded into/from an individually accessible sub-portion of a data stream having a representation of the NN encoded thereinto. The data stream is structured into one or more individually accessible portions, each individually accessible portion representing a corresponding NN layer of the neural network. Additionally, the data stream is, within one or more predetermined individually accessible portions, further structured into individually accessible sub-portions, each individually accessible sub-portion representing a corresponding NN portion of the respective NN layer of the neural network. An apparatus is configured to encode/decode into/from the data stream, for each of the one or more predetermined individually accessible sub-portions, a start code at which the respective predetermined individually accessible sub-portion begins, and/or a pointer pointing to a beginning of the respective predetermined individually accessible sub-portion, and/or a data stream length parameter indicating a data stream length of the respective predetermined individually accessible sub-portion for skipping the respective predetermined individually accessible sub-portion in parsing the DS. The start code, the pointer and/or the data stream length parameter enable an efficient access to the predetermined individually accessible sub-portions. This is especially beneficial for applications that may rely on grouping NN parameter within a NN layer in a specific configurable fashion as it can be beneficial to have the NN parameter decoded/processed/inferred partially or in parallel. 
Therefore, an individually accessible sub-portion wise access to an individually accessible portion can help to access desired data in parallel or leave out unnecessary data portions. It was found that it is sufficient to indicate an individually accessible sub-portion using a start code. This is based on the finding that the amount of data per NN layer, i.e., per individually accessible portion, is usually smaller than in the case where NN layers are to be detected by start codes within the whole data stream. Nevertheless, it is also advantageous to use the pointer and/or the data stream length parameter to improve the access to an individually accessible sub-portion.

According to an embodiment, the one or more individually accessible sub-portions within an individually accessible portion of the data stream are indicated by a pointer indicating a data stream position in bytes in a parameter set portion of the individually accessible portion. The data stream length parameter might indicate a run length of individually accessible sub-portions. The data stream length parameter might be encoded/decoded into/from a header portion of the data stream or into/from the parameter set portion of the individually accessible portion. The data stream length parameter might be used in order to facilitate cutting out the respective individually accessible sub-portion for the purpose of packaging the one or more individually accessible sub-portions in appropriate containers. According to an embodiment, an apparatus for decoding the data stream is configured to use, for one or more predetermined individually accessible sub-portions, the start code and/or the pointer and/or the data stream length parameter for accessing the data stream.
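The start-code and length-parameter mechanism can be sketched as follows; the start-code value and field widths are illustrative assumptions, chosen only to show how a parser skips sub-portions without decoding them:

```python
import struct

START_CODE = b"\x00\x00\x01"   # illustrative start-code prefix

def pack_sub_portions(payloads: list) -> bytes:
    """Prefix each individually accessible sub-portion with a start code
    and a length parameter so a parser can skip it without decoding."""
    out = b""
    for p in payloads:
        out += START_CODE + struct.pack(">I", len(p)) + p
    return out

def skip_to(stream: bytes, index: int) -> bytes:
    """Use the length parameters to skip ahead to sub-portion `index`."""
    pos = 0
    for _ in range(index):
        assert stream[pos:pos + 3] == START_CODE
        length = struct.unpack_from(">I", stream, pos + 3)[0]
        pos += 3 + 4 + length              # start code + length field + payload
    assert stream[pos:pos + 3] == START_CODE
    length = struct.unpack_from(">I", stream, pos + 3)[0]
    return stream[pos + 7:pos + 7 + length]

ds = pack_sub_portions([b"weights", b"bias"])
assert skip_to(ds, 1) == b"bias"
```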

In accordance with a sixth aspect of the present application, the inventors of the present application realized that a usage of neural networks is rendered highly efficient, if a processing option parameter is encoded/decoded into/from a data stream having a representation of the NN encoded thereinto. The data stream is structured into individually accessible portions and for each of one or more predetermined individually accessible portions a processing option parameter indicates one or more processing options which have to be used or which may optionally be used when using the neural network for inference. The processing option parameter might indicate one processing option out of various processing options that also determine if and how a client would access the individually accessible portions (P) and/or the individually accessible sub-portions (SP), like, for each of P and/or SP, a parallel processing capability of the respective P or SP and/or a sample wise parallel processing capability of the respective P or SP and/or a channel wise parallel processing capability of the respective P or SP and/or a classification category wise parallel processing capability of the respective P or SP and/or other processing options. The processing option parameter allows a client appropriate decision making and thus a highly efficient usage of the NN.

In accordance with a seventh aspect of the present application, the inventors of the present application realized that a transmission/updating of neural networks is rendered highly efficient, if a reconstruction rule for dequantizing NN parameters depends on a NN portion the NN parameters belong to. The NN parameters, which NN parameters represent a neural network, are encoded into a data stream in a manner quantized onto quantization indices. An apparatus for decoding is configured to dequantize the quantization indices to reconstruct the NN parameters, e.g., using the reconstruction rule. The NN parameters are encoded into the data stream so that NN parameters in different NN portions of the NN are quantized differently, and the data stream indicates, for each of the NN portions, a reconstruction rule for dequantizing NN parameters relating to the respective NN portion. The apparatus for decoding is configured to use, for each of the NN portions, the reconstruction rule indicated by the data stream for the respective NN portion to dequantize the NN parameter in the respective NN portion. The NN portions, for example, comprise one or more NN layers of the NN and/or portions of an NN layer into which portions a predetermined NN layer of the NN is subdivided.

According to an embodiment, a first reconstruction rule for dequantizing NN parameters relating to a first NN portion is encoded into the data stream in a manner delta-coded relative to a second reconstruction rule for dequantizing NN parameters relating to a second NN portion. The first NN portion might comprise first NN layers and the second NN portion might comprise second NN layers, wherein the first NN layers differ from the second NN layers. Alternatively, the first NN portion might comprise first NN layers and the second NN portion might comprise portions of one of the first NN layers. In this alternative case, a reconstruction rule, e.g., the second reconstruction rule, related to NN parameters in a portion of a predetermined NN layer is delta-coded relative to a reconstruction rule, e.g., the first reconstruction rule, related to the predetermined NN layer. This delta-coding of the reconstruction rules might allow signalling the reconstruction rules with only a few bits and can result in an efficient transmission/updating of neural networks.
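One possible reading of such delta-coding, sketched here for per-portion quantization step sizes (the scheme is an illustrative assumption, not the claimed syntax):

```python
def encode_step_sizes(steps: list) -> list:
    """Delta-code per-portion quantization step sizes: the first value is
    sent as-is, each further one relative to its predecessor, so nearly
    constant step sizes cost only a few bits (sketch)."""
    deltas = [steps[0]]
    deltas += [b - a for a, b in zip(steps, steps[1:])]
    return deltas

def decode_step_sizes(deltas: list) -> list:
    """Decoder side: accumulate the deltas back into absolute step sizes."""
    out, acc = [], 0.0
    for d in deltas:
        acc += d
        out.append(acc)
    return out

steps = [0.25, 0.25, 0.125, 0.125]
assert decode_step_sizes(encode_step_sizes(steps)) == steps
```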

In accordance with an eighth aspect of the present application, the inventors of the present application realized that a transmission/updating of neural networks is rendered highly efficient, if a reconstruction rule for dequantizing NN parameters depends on a magnitude of quantization indices associated with the NN parameters. The NN parameters, which NN parameters represent a neural network, are encoded into a data stream in a manner quantized onto quantization indices. An apparatus for decoding is configured to dequantize the quantization indices to reconstruct the NN parameters, e.g., using the reconstruction rule. The data stream comprises, for indicating the reconstruction rule for dequantizing the NN parameters, a quantization step size parameter indicating a quantization step size, and a parameter set defining a quantization-index-to-reconstruction-level mapping. The reconstruction rule for NN parameters in a predetermined NN portion is defined by the quantization step size for quantization indices within a predetermined index interval, and the quantization-index-to-reconstruction-level mapping for quantization indices outside the predetermined index interval. For each NN parameter, a respective NN parameter associated with a quantization index within the predetermined index interval, for example, is reconstructed by multiplying the respective quantization index with the quantization step size and a respective NN parameter corresponding to a quantization index outside the predetermined index interval, for example, is reconstructed by mapping the respective quantization index onto a reconstruction level using the quantization-index-to-reconstruction-level mapping. The decoder might be configured to determine the quantization-index-to-reconstruction-level mapping based on the parameter set in the data stream. 
According to an embodiment, the parameter set defines the quantization-index-to-reconstruction-level mapping by pointing to a quantization-index-to-reconstruction-level mapping out of a set of quantization-index-to-reconstruction-level mappings, wherein the set of quantization-index-to-reconstruction-level mappings might not be part of the data stream, e.g., it might be stored at the encoder side and the decoder side. Defining the reconstruction rule based on a magnitude of quantization indices can result in a signalling of the reconstruction rule with few bits.
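The hybrid reconstruction rule of this aspect can be sketched as follows; the interval bounds, step size and codebook values are illustrative choices, not values from the claims:

```python
import numpy as np

def dequantize(indices, step_size, codebook, interval=(-4, 4)):
    """Reconstruct NN parameters: indices inside the predetermined index
    interval are multiplied by the quantization step size; indices outside
    it are mapped through a signalled quantization-index-to-reconstruction-
    level codebook."""
    lo, hi = interval
    out = []
    for q in indices:
        if lo <= q <= hi:
            out.append(q * step_size)      # uniform part of the rule
        else:
            out.append(codebook[q])        # codebook part of the rule
    return np.array(out)

codebook = {5: 1.5, -5: -1.5, 6: 3.0}
params = dequantize([0, 2, -3, 5, 6], step_size=0.25, codebook=codebook)
assert params.tolist() == [0.0, 0.5, -0.75, 1.5, 3.0]
```

Small, frequent indices thus need only one step-size parameter, while rare large-magnitude indices get individually signalled reconstruction levels.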

In accordance with a ninth aspect of the present application, the inventors of the present application realized that a transmission/updating of neural networks is rendered highly efficient, if an identification parameter is encoded/decoded into/from individually accessible portions of a data stream having a representation of the NN encoded thereinto. The data stream is structured into individually accessible portions and, for each of one or more predetermined individually accessible portions, an identification parameter for identifying the respective predetermined individually accessible portion is encoded/decoded into/from the data stream. The identification parameter might indicate a version of the predetermined individually accessible portion. This is especially advantageous in scenarios such as distributed learning, where many clients individually further train a NN and send relative NN updates back to a central entity. The identification parameter can be used to identify the NN of individual clients through a versioning scheme. Thereby, the central entity can identify the NN that an NN update is built upon. Additionally, or alternatively, the identification parameter might indicate whether the predetermined individually accessible portion is associated with a baseline part of the NN or with an advanced/enhanced/complete part of the NN. This is, for example, advantageous in use cases, such as scalable NNs, where a baseline part of an NN can be executed, for instance, in order to generate preliminary results, before the complete or enhanced NN is carried out to receive full results. Further, transmission errors or involuntary changes of a parameter tensor reconstructable based on NN parameters representing the NN are easily recognizable using the identification parameter. 
The identification parameter thus allows checking the integrity of each predetermined individually accessible portion and makes operations more error robust, as a portion can be verified based on the NN characteristics.
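One way such an identification parameter could be formed, sketched with a version number plus a short hash-based digest (the scheme is an illustrative assumption, not the claimed syntax):

```python
import hashlib

def identification_parameter(payload: bytes, version: int) -> dict:
    """Form an identification parameter for an individually accessible
    portion: a version number for the versioning scheme plus a short
    digest of the payload for integrity checks (illustrative sketch)."""
    return {"version": version,
            "digest": hashlib.sha256(payload).hexdigest()[:8]}

portion = b"\x01\x02\x03"
ident = identification_parameter(portion, version=2)
# A central entity can identify which NN version an update is built upon
# and detect involuntary changes of the transmitted parameters.
assert ident == identification_parameter(portion, version=2)
assert ident != identification_parameter(b"\x01\x02\x04", version=2)
```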

In accordance with a tenth aspect of the present application, the inventors of the present application realized that a transmission/updating of neural networks is rendered highly efficient, if different versions of the NN are encoded/decoded into/from a data stream using delta-coding or using a compensation scheme. The data stream has a representation of an NN encoded thereinto in a layered manner so that different versions of the NN are encoded into the data stream. The data stream is structured into one or more individually accessible portions, each individually accessible portion relating to a corresponding version of the NN. The data stream has, for example, a first version of the NN encoded into a first portion delta-coded relative to a second version of the NN encoded into a second portion. Additionally, or alternatively, the data stream has, for example, a first version of the NN encoded into a first portion in form of one or more compensating NN portions each of which is to be, for performing an inference based on the first version of the NN, executed in addition to an execution of a corresponding NN portion of a second version of the NN encoded into a second portion, and wherein outputs of the respective compensating NN portion and corresponding NN portion are to be summed up. With these encoded versions of the NN in the data stream, a client, e.g., a decoder, can match its processing capabilities or may be able to do inference on the first version, e.g., a baseline, first before processing the second version, e.g., a more complex advanced NN. Furthermore, by applying/using the delta-coding and/or the compensation scheme, the different versions of the NN can be encoded into the DS with few bits.
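A toy sketch of both variants of this aspect: delta-coding an updated version relative to a base version, and a compensation scheme whose output is summed with the base portion's output. The toy "layer" is a linear elementwise map, which is why the summed outputs match exactly here.

```python
import numpy as np

def encode_update(base: np.ndarray, new: np.ndarray) -> np.ndarray:
    """Delta-code a new NN version relative to an already transmitted one;
    small, sparse deltas typically compress far better than full weights."""
    return new - base

def apply_update(base: np.ndarray, delta: np.ndarray) -> np.ndarray:
    """Decoder-side reconstruction of the updated version."""
    return base + delta

base = np.array([1.0, -0.5, 0.25])
new = np.array([1.1, -0.5, 0.30])
delta = encode_update(base, new)
assert np.allclose(apply_update(base, delta), new)

# Compensation scheme: execute the compensating portion alongside the
# corresponding base portion and sum their outputs.
x = np.array([2.0, 1.0, 4.0])
base_out = base * x            # toy elementwise "layer" of the base version
comp_out = delta * x           # compensating NN portion
assert np.allclose(base_out + comp_out, new * x)
```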

In accordance with an eleventh aspect of the present application, the inventors of the present application realized that a usage of neural networks is rendered highly efficient, if supplemental data is encoded/decoded into/from individually accessible portions of a data stream having a representation of the NN encoded thereinto. The data stream is structured into individually accessible portions and comprises, for each of one or more predetermined individually accessible portions, supplemental data for supplementing the representation of the NN. This supplemental data is usually not necessary for decoding/reconstruction/inference of the NN; however, it can be essential from an application point of view. Therefore, it is advantageous to mark this supplemental data as irrelevant for the decoding of the NN for the purpose of sole inference, so that clients, e.g., decoders, which do not require the supplemental data, are able to skip this part of the data.

In accordance with a twelfth aspect of the present application, the inventors of the present application realized that a usage of neural networks is rendered highly efficient, if hierarchical control data is encoded/decoded into/from a data stream having a representation of the NN encoded thereinto. The data stream comprises hierarchical control data structured into a sequence of control data portions, wherein the control data portions provide information on the NN at increasing levels of detail along the sequence of control data portions. It is advantageous to structure the control data hierarchically, since a decoder might only need the control data up to a certain level of detail and can thus skip the control data providing more details. Depending on the use case and its knowledge of the environment, different levels of control data may be required, and the aforementioned scheme of presenting such control data enables an efficient access to the needed control data for different use cases.
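The hierarchical control data can be sketched as a sequence of portions that a client reads only up to the needed level of detail; the level numbering and the fields are illustrative assumptions:

```python
# Illustrative hierarchical control data: coarse information first,
# increasingly detailed portions later in the sequence.
control_data = [
    {"level": 0, "info": {"network_id": "nn-01"}},
    {"level": 1, "info": {"num_layers": 2}},
    {"level": 2, "info": {"layer_shapes": [[64, 128], [128, 10]]}},
]

def read_control_data(sequence: list, max_level: int) -> dict:
    """Stop parsing once the needed level of detail is reached and skip
    the remaining, more detailed control data portions."""
    merged = {}
    for portion in sequence:
        if portion["level"] > max_level:
            break
        merged.update(portion["info"])
    return merged

assert read_control_data(control_data, 1) == {"network_id": "nn-01",
                                              "num_layers": 2}
```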

Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. An embodiment relates to a computer program having a program code for performing such a method when running on a computer.

Implementations of the present invention are the subject of the dependent claims. Preferred embodiments of the present application are described below with respect to the figures. The drawings are not necessarily to scale; emphasis instead generally being placed upon illustrating the principles of the invention. In the following description, various embodiments of the invention are described with reference to the following drawings, in which:

Fig. 1 shows an example of an encoding/decoding pipeline for encoding/decoding a neural network;

Fig. 2 shows a neural network which might be encoded/decoded according to one of the embodiments;

Fig. 3 shows a serialization of parameter tensors of layers of a neural network, according to an embodiment;

Fig. 4 shows the usage of a serialization parameter for indicating how neural network parameters are serialized, according to an embodiment;

Fig. 5 shows an example for a single-output-channel convolutional layer;

Fig. 6 shows an example for a fully-connected layer;

Fig. 7 shows a set of n coding orders at which neural network parameters might be encoded, according to an embodiment;

Fig. 8 shows context-adaptive arithmetic coding of individually accessible portions or sub-portions, according to an embodiment;

Fig. 9 shows the usage of a numerical computation representation parameter, according to an embodiment;

Fig. 10 shows the usage of a neural network layer type parameter indicating a neural network layer type of a neural network layer of the neural network, according to an embodiment;

Fig. 11 shows a general embodiment of a data stream with pointers pointing to beginnings of individually accessible portions, according to an embodiment;

Fig. 12 shows a detailed embodiment of a data stream with pointers pointing to beginnings of individually accessible portions, according to an embodiment;

Fig. 13 shows the usage of start codes and/or pointers and/or a data stream length parameter to enable an access to individually accessible sub-portions, according to an embodiment;

Fig. 14a shows a sub-layer access using pointer, according to an embodiment;

Fig. 14b shows a sub-layer access using start codes, according to an embodiment;

Fig. 15 shows exemplary types of random access as possible processing options for individually accessible portions, according to an embodiment;

Fig. 16 shows the usage of a processing option parameter, according to an embodiment;

Fig. 17 shows the usage of a neural network portion dependent reconstruction rule, according to an embodiment;

Fig. 18 shows a determination of a reconstruction rule based on quantization indices representing quantized neural network parameter, according to an embodiment;

Fig. 19 shows the usage of an identification parameter, according to an embodiment;

Fig. 20 shows an encoding/decoding of different versions of a neural network, according to an embodiment;

Fig. 21 shows a delta-coding of two versions of a neural network, wherein the two versions differ in their weights and/or biases, according to an embodiment;

Fig. 22 shows an alternative delta-coding of two versions of a neural network, wherein the two versions differ in their number of neurons or neuron interconnections, according to an embodiment;

Fig. 23 shows an encoding of different versions of a neural network using compensating neural network portions, according to an embodiment;

Fig. 24a shows an embodiment of a data stream with supplemental data, according to an embodiment;

Fig. 24b shows an alternative embodiment of a data stream with supplemental data, according to an embodiment; and

Fig. 25 shows an embodiment of a data stream with a sequence of control data portions.

Equal or equivalent elements or elements with equal or equivalent functionality are denoted in the following description by equal or equivalent reference numerals even if occurring in different figures.

In the following description, a plurality of details is set forth to provide a more thorough explanation of embodiments of the present invention. However, it will be apparent to those skilled in the art that embodiments of the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form rather than in detail in order to avoid obscuring embodiments of the present invention. In addition, features of the different embodiments described hereinafter may be combined with each other, unless specifically noted otherwise.

The following description of embodiments of the present application starts with a brief introduction and outline of embodiments of the present application in order to explain their advantages and how same achieve these advantages.

It was found that in the current activities on coded representations of NNs, such as the ongoing MPEG activity on NN compression, it can be beneficial to separate a model bitstream representing parameter tensors of multiple layers into smaller sub-bitstreams that contain the coded representation of the parameter tensors of individual layers, i.e. layer bitstreams. This can help in general when such model bitstreams need to be stored/loaded in the context of a container format, or in application scenarios that feature parallel decoding/execution of layers of the NN.
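The separation into layer bitstreams can be illustrated with a minimal sketch of a length-prefixed container; the framing and function names below are hypothetical illustrations, not the actual MPEG NN compression bitstream syntax:

```python
import struct

def pack_layer_bitstreams(layer_bitstreams):
    """Concatenate per-layer sub-bitstreams, each preceded by a 4-byte
    length field (a data stream length parameter) so that a client can
    skip layers it does not need."""
    out = bytearray()
    for bs in layer_bitstreams:
        out += struct.pack(">I", len(bs))  # big-endian length prefix
        out += bs
    return bytes(out)

def extract_layer(model_bitstream, layer_index):
    """Randomly access one layer bitstream by skipping over the others
    using only the length fields, without decoding the skipped layers."""
    pos = 0
    for i in range(layer_index + 1):
        (length,) = struct.unpack_from(">I", model_bitstream, pos)
        pos += 4
        if i == layer_index:
            return model_bitstream[pos:pos + length]
        pos += length  # skip this layer's payload
    raise IndexError(layer_index)
```

For example, `extract_layer(pack_layer_bitstreams([bs0, bs1, bs2]), 1)` returns `bs1` after reading only two length fields, which mirrors how pointers or length parameters enable skipping individually accessible portions while parsing.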

In the following, various examples are described which may assist in achieving an effective compression of a neural network, NN, and/or in improving access to data representing the NN, thus resulting in effective transmission/updating of the NN.

In order to ease the understanding of the following examples of the present application, the description starts with a presentation of possible encoders and decoders into which the subsequently outlined examples of the present application could be built.

Figure 1 shows a simple sketch example of an encoding/decoding pipeline according to DeepCABAC and illustrates the inner operations of such a compression scheme. First, the weights 32, e.g., the weights 321 to 326, of the connections 22, e.g., the connections 221 to 226, between neurons 14, 20 and/or 18, e.g., between predecessor neurons 141 to 143 and intermediate neurons 201 and 202, are formed into tensors, which are shown as matrices 30 in the example (step 1 in Figure 1). In step 1 of Figure 1, for example, the weights 32 associated with a first layer of a neural network 10, NN, are formed into the matrix 30. According to the embodiment shown in Fig. 1, the columns of the matrix 30 are associated with the predecessor neurons 141 to 143 and the rows of the matrix 30 are associated with the intermediate neurons 201 and 202, but it is clear that the formed matrix can alternatively represent a transposed version of the illustrated matrix 30.

Then, each NN parameter, e.g., the weights 32, is encoded, e.g., quantized and entropy coded, e.g. using context-adaptive arithmetic coding 600, as shown in steps 2 and 3, following a particular scanning order, e.g., row-major order (left to right, top to bottom). As will be outlined in more detail below, it is also possible to use a different scanning order, i.e. coding order. Steps 2 and 3 are performed by an encoder 40, i.e. an apparatus for encoding. The decoder 50, i.e. an apparatus for decoding, follows the same process in reverse processing order. That is, it firstly decodes the list of integer representations of the encoded values, as shown in step 4, and then reshapes the list into its tensor representation 30’, as shown in step 5. Finally, the tensor 30’ is loaded into the network architecture 10’, i.e. a reconstructed NN, as shown in step 6. The reconstructed tensor 30’ comprises reconstructed NN parameters, i.e. decoded NN parameters 32’.
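The serialization and reshaping steps (steps 1 and 5 of Figure 1) can be sketched as the following round trip in row-major coding order; this is an illustrative helper, not the DeepCABAC codec itself:

```python
def serialize_row_major(matrix):
    """Flatten a 2-D weight matrix in row-major coding order
    (left to right, top to bottom), as in step 1 of the pipeline."""
    return [w for row in matrix for w in row]

def reshape(flat, rows, cols):
    """Inverse of serialize_row_major: rebuild the tensor shape from the
    decoded list of values, as in step 5 of the pipeline."""
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

# One row per intermediate neuron, one column per predecessor neuron:
weights = [[0.1, -0.2, 0.3],
           [0.4, 0.5, -0.6]]
flat = serialize_row_major(weights)
assert reshape(flat, 2, 3) == weights  # lossless round trip
```

A different coding order, as signaled by the serialization parameter, would simply correspond to a different traversal in `serialize_row_major` and its inverse.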

The NN 10 shown in Fig. 1 is only a simple NN with few neurons 14, 20 and 18. A neuron might, in the following, also be understood as a node, element, model element or dimension. Furthermore, the reference sign 10 might indicate a machine learning (ML) predictor or, in other words, a machine learning model such as a neural network.

With reference to Fig. 2, a neural network is described in more detail. In particular, Fig. 2 shows an ML predictor 10 comprising an input interface 12 with input nodes or elements 14 and an output interface 16 with output nodes or elements 18. The input nodes/elements 14 receive the input data. In other words, the input data is applied thereonto. For instance, they may receive a picture with, for instance, each element 14 being associated with a pixel of the picture. Alternatively, the input data applied onto elements 14 may be a signal such as a one-dimensional signal, e.g. an audio signal, a sensor signal or the like. Even alternatively, the input data may represent a certain data set such as medical file data or the like. The number of input elements 14 may be any number and depends on the type of input data, for instance. The number of output nodes 18 may be one, as shown in Fig. 1, or larger than one, as shown in Fig. 2. Each output node or element 18 may be associated with a certain inference or prediction task. In particular, upon the ML predictor 10 being applied onto a certain input applied onto the ML predictor’s 10 input interface 12, the ML predictor 10 outputs at the output interface 16 the inference or prediction result, wherein the activation, i.e. an activation value, resulting at each output node 18 may be indicative, for instance, of an answer to a certain question on the input data, such as whether or not, or how likely, the input data has a certain characteristic, e.g. whether a picture having been input contains a certain object such as a car, a person, a face or the like.

Insofar, the input applied onto the input interface may also be interpreted as an activation, namely an activation applied onto each input node or element 14.

Between the input nodes 14 and output node(s) 18, the ML predictor 10 comprises further elements or nodes 20 which are connected, via connections 22, to predecessor nodes so as to receive activations from these predecessor nodes, and, via one or more further connections 24, to successor nodes in order to forward to the successor nodes the activation, i.e. an activation value, of node 20.
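The propagation of activations over the connections 22 and 24 can be sketched as a weighted sum per node; the ReLU nonlinearity below is merely an illustrative assumption, as the embodiments do not prescribe a particular activation function:

```python
def forward(inputs, weight_matrix, biases):
    """Compute the activations of successor nodes: each node forms a
    weighted sum of the activations received from its predecessor nodes
    over its incoming connections, adds a bias, and applies a
    nonlinearity (here ReLU, an illustrative choice)."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weight_matrix, biases)]

# Two successor nodes fed by three predecessor nodes:
activations = forward([1.0, 0.5, -0.5],
                      [[0.2, 0.4, 0.0],    # weights into node 1
                       [-0.3, 0.1, 0.6]],  # weights into node 2
                      [0.1, 0.0])          # one bias per node
```

The weights and biases appearing here are exactly the neural network parameters 32 that the data stream 45 carries in quantized, entropy-coded form.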

Claims

1. Data stream (45) having a representation of a neural network (10) encoded thereinto, the data stream (45) comprising a serialization parameter (102) indicating a coding order (104) at which neural network parameters (32), which define neuron interconnections (22, 24) of the neural network (10), are encoded into the data stream (45).

2. Data stream (45) of claim 1, wherein the neural network parameters (32) are coded into the data stream (45) using context-adaptive arithmetic coding (600).

3. Data stream (45) of claim 1 or claim 2, wherein the data stream (45) is structured into one or more individually accessible portions (200), each individually accessible portion (200) representing a corresponding neural network layer (210, 30) of the neural network (10), wherein the serialization parameter (102) indicates the coding order (104) at which neural network parameters, which define neuron interconnections (22, 24) of the neural network within a predetermined neural network layer (210, 30), are encoded into the data stream (45).

4. Data stream (45) of any previous claim 1 to 3, wherein the serialization parameter (102) is an n-ary parameter which indicates the coding order (104) out of a set (108) of n coding orders (104).

5. Data stream (45) of claim 4, wherein the set (108) of n coding orders (104) comprises

first predetermined coding orders (1061) which differ in an order at which the predetermined coding orders traverse dimensions (34) of a tensor (30) describing a predetermined neural network layer (210, 30) of the neural network (10); and/or

second predetermined coding orders (1062) which differ in a number (107) of times at which the predetermined coding orders traverse a predetermined neural network layer (210, 30) of the neural network for sake of scalable coding of the neural network; and/or

third predetermined coding orders (1063) which differ in an order at which the predetermined coding orders traverse neural network layers (210, 30) of the neural network; and/or

fourth predetermined coding orders (1064) which differ in an order at which neurons (14, 18, 20) of a neural network layer (210, 30) of the neural network are traversed.

6. Data stream (45) of any previous claim 1 to 5, wherein the serialization parameter (102) is indicative of a permutation using which the coding order (104) permutes neurons (14, 18, 20) of a neural network layer (210, 30) relative to a default order.

7. Data stream (45) of claim 6, wherein the permutation orders the neurons (14, 18, 20) of the neural network layer (210, 30) in a manner so that the neural network parameters (32) monotonically increase along the coding order (104) or monotonically decrease along the coding order (104).

8. Data stream (45) of claim 6, wherein the permutation orders the neurons (14, 18, 20) of the neural network layer (210, 30) in a manner so that, among predetermined coding orders signalable by the serialization parameter (102), a bitrate for coding the neural network parameters (32) into the data stream (45) is lowest for the permutation indicated by the serialization parameter (102).

9. Data stream (45) of any previous claim 1 to 8, wherein the neural network parameters (32) comprise weights and biases.

10. Data stream (45) of any previous claim 1 to 9, wherein the data stream (45) is structured into individually accessible sub-portions (43, 44, 240), each sub-portion (43, 44, 240) representing a corresponding neural network portion of the neural network (10), so that each sub-portion (43, 44, 240) is completely traversed by the coding order (104) before a subsequent sub-portion is traversed by the coding order (104).

11. Data stream (45) of any of claims 3 to 10, wherein the neural network parameters (32) are coded into the data stream (45) using context-adaptive arithmetic coding (600) and using context initialization at a start of any individually accessible portion (200) or sub-portion (43, 44, 240).

12. Data stream (45) of any of claims 3 to 11, wherein the data stream (45) comprises start codes (242) at which each individually accessible portion (200) or sub-portion (43, 44, 240) begins, and/or pointers (220, 244) pointing to beginnings of each individually accessible portion or sub-portion, and/or data stream length parameters indicating data stream lengths (246) of each individually accessible portion or sub-portion for skipping the respective individually accessible portion or sub-portion in parsing the data stream (45).

13. Data stream (45) of any of the previous claims 1 to 12, further comprising a numerical computation representation parameter (120) indicating a numerical representation and bit size at which the neural network parameters (32) are to be represented when using the neural network (10) for inference.

14. Data stream (45) having a representation of a neural network (10) encoded thereinto, the data stream (45) comprising a numerical computation representation parameter (120) indicating a numerical representation and bit size at which neural network parameters (32) of the neural network, which are encoded into the data stream, are to be represented when using the neural network (10) for inference.

15. Data stream (45) of any of the previous claims 1 to 14, wherein the data stream (45) is structured into individually accessible sub-portions (43, 44, 240), each individually accessible sub-portion representing a corresponding neural network portion of the neural network, so that each individually accessible sub-portion is completely traversed by the coding order (104) before a subsequent individually accessible sub-portion is traversed by the coding order (104), wherein the data stream (45) comprises for a predetermined individually accessible sub-portion a type parameter indicating a parameter type of the neural network parameters (32) encoded into the predetermined individually accessible sub-portion.

16. Data stream (45) of claim 15, wherein the type parameter discriminates, at least, between neural network weights and neural network biases.

17. Data stream (45) of any of the previous claims 1 to 16, wherein the data stream (45) is structured into one or more individually accessible portions (200), each individually accessible portion representing a corresponding neural network layer (210, 30) of the neural network, wherein the data stream (45) further comprises for a predetermined neural network layer a neural network layer type parameter (130) indicating a neural network layer type of the predetermined neural network layer of the neural network.

18. Data stream (45) having a representation of a neural network (10) encoded thereinto, wherein the data stream (45) is structured into one or more individually accessible portions (200), each individually accessible portion representing a corresponding neural network layer (210, 30) of the neural network, wherein the data stream (45) further comprises, for a predetermined neural network layer, a neural network layer type parameter (130) indicating a neural network layer type of the predetermined neural network layer of the neural network.

19. Data stream (45) of claim 17 or claim 18, wherein the neural network layer type parameter (130) discriminates, at least, between a fully-connected and a convolutional layer type.

20. Data stream (45) of any of the previous claims 1 to 19, wherein the data stream (45) is structured into individually accessible portions (200), each individually accessible portion representing a corresponding neural network portion of the neural network, wherein the data stream (45) comprises for each of one or more predetermined individually accessible portions (200) a pointer (220, 244) pointing to a beginning of each individually accessible portion.

21. Data stream (45) having a representation of a neural network (10) encoded thereinto, wherein the data stream (45) is structured into individually accessible portions (200), each individually accessible portion representing a corresponding neural network portion of the neural network, wherein the data stream (45) comprises for each of one or more predetermined individually accessible portions (200) a pointer (220, 244) pointing to a beginning of the respective predetermined individually accessible portion.

22. Data stream (45) of claim 20 or claim 21, wherein each individually accessible portion represents

a corresponding neural network layer (210) of the neural network or

a neural network portion (43, 44, 240) of a neural network layer (210) of the neural network.

23. Data stream (45) of any of claims 1 to 22, having a representation of a neural network (10) encoded thereinto, wherein the data stream (45) is structured into one or more individually accessible portions (200), each individually accessible portion representing a corresponding neural network layer (210, 30) of the neural network, wherein the data stream (45) is, within a predetermined portion, further structured into individually accessible sub-portions (43, 44, 240), each sub-portion (43, 44, 240) representing a corresponding neural network portion of the respective neural network layer (210, 30) of the neural network, wherein the data stream (45) comprises for each of one or more predetermined individually accessible sub-portions (43, 44, 240)

a start code (242) at which the respective predetermined individually accessible sub-portion begins, and/or

a pointer (244) pointing to a beginning of the respective predetermined individually accessible sub-portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective predetermined individually accessible sub-portion for skipping the respective predetermined individually accessible sub-portion in parsing the data stream (45).

24. Data stream (45) of claim 23, wherein the data stream (45) has the representation of the neural network encoded thereinto using context-adaptive arithmetic coding (600) and using context initialization at a start of each individually accessible portion and each individually accessible sub-portion.

25. Data stream (45) having a representation of a neural network (10) encoded thereinto, wherein the data stream (45) is structured into one or more individually accessible portions (200), each individually accessible portion representing a corresponding neural network layer (210, 30) of the neural network, wherein the data stream (45) is, within a predetermined portion, further structured into individually accessible sub-portions (43, 44, 240), each sub-portion (43, 44, 240) representing a corresponding neural network portion of the respective neural network layer (210, 30) of the neural network, wherein the data stream (45) comprises for each of one or more predetermined individually accessible sub-portions (43, 44, 240)

a start code (242) at which the respective predetermined individually accessible sub-portion begins, and/or

a pointer (244) pointing to a beginning of the respective predetermined individually accessible sub-portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective predetermined individually accessible sub-portion for skipping the respective predetermined individually accessible sub-portion in parsing the data stream (45).

26. Data stream (45) of claim 25, wherein the data stream (45) has the representation of the neural network encoded thereinto using context-adaptive arithmetic coding (600) and using context initialization at a start of each individually accessible portion and each individually accessible sub-portion.

27. Data stream (45) of any previous claim 1 to 26, wherein the data stream (45) is structured into individually accessible portions (200), each individually accessible portion representing a corresponding neural network portion of the neural network, wherein the data stream (45) comprises for each of one or more predetermined individually accessible portions (200) a processing option parameter (250) indicating one or more processing options (252) which have to be used or which may optionally be used when using the neural network (10) for inference.

28. Data stream (45) of claim 27, wherein the processing option parameter (250) indicates the one or more available processing options (252) out of a set of predetermined processing options (252) including

parallel processing capability of the respective predetermined individually accessible portion; and/or

sample wise parallel processing capability (2522) of the respective predetermined individually accessible portion; and/or

channel wise parallel processing capability (2521) of the respective predetermined individually accessible portion; and/or

classification category wise parallel processing capability of the respective predetermined individually accessible portion; and/or

dependency of the neural network portion represented by the respective predetermined individually accessible portion on a computation result gained from another individually accessible portion of the data stream (45) relating to the same neural network portion but belonging to another version of versions (330) of the neural network which are encoded into the data stream (45) in a layered manner.

29. Data stream (45) having a representation of a neural network (10) encoded thereinto, wherein the data stream (45) is structured into individually accessible portions (200), each individually accessible portion representing a corresponding neural network portion of the neural network, wherein the data stream (45) comprises for each of one or more predetermined individually accessible portions (200) a processing option parameter (250) indicating one or more processing options (252) which have to be used or which may optionally be used when using the neural network (10) for inference.

30. Data stream (45) of claim 29, wherein the processing option parameter (250) indicates the one or more available processing options (252) out of a set of predetermined processing options (252) including

parallel processing capability of the respective predetermined individually accessible portion; and/or

sample wise parallel processing capability (2522) of the respective predetermined individually accessible portion; and/or

channel wise parallel processing capability (2521) of the respective predetermined individually accessible portion; and/or

classification category wise parallel processing capability of the respective predetermined individually accessible portion; and/or

dependency of the neural network portion represented by the respective predetermined individually accessible portion on a computation result gained from another individually accessible portion of the data stream (45) relating to the same neural network portion but belonging to another version of versions (330) of the neural network which are encoded into the data stream (45) in a layered manner.

31. Data stream (45) of one of claims 1 to 30, having neural network parameters (32) encoded thereinto, which represent a neural network,

wherein the neural network parameters (32) are encoded into the data stream (45) in a manner quantized (260) onto quantization indices (32"), and

wherein the neural network parameters (32) are encoded into the data stream (45) so that neural network parameters (32) in different neural network portions of the neural network are quantized (260) differently, and the data stream (45) indicates, for each of the neural network portions, a reconstruction rule (270) for dequantizing neural network parameters (32) relating to the respective neural network portion.

32. Data stream (45) having neural network parameters (32) encoded thereinto, which represent a neural network,

wherein the neural network parameters (32) are encoded into the data stream (45) in a manner quantized (260) onto quantization indices (32”), and

wherein the neural network parameters (32) are encoded into the data stream (45) so that neural network parameters (32) in different neural network portions of the neural network are quantized (260) differently, and the data stream (45) indicates, for each of the neural network portions, a reconstruction rule (270) for dequantizing neural network parameters (32) relating to the respective neural network portion.

33. Data stream (45) of claim 31 or claim 32, wherein the neural network portions comprise neural network layers (210, 30) of the neural network and/or layer portions into which a predetermined neural network layer (210, 30) of the neural network is subdivided.

34. Data stream (45) of any previous claim 31 to 33, wherein the data stream (45) has a first reconstruction rule (2701, 270a1) for dequantizing neural network parameters (32) relating to a first neural network portion encoded thereinto in a manner delta-coded relative to a second reconstruction rule (2702, 270a2) for dequantizing neural network parameters (32) relating to a second neural network portion.

35. Data stream (45) of claim 34, wherein

the data stream (45) comprises, for indicating the first reconstruction rule (2701, 270a1), a first exponent value and, for indicating the second reconstruction rule (2702, 270a2), a second exponent value,

the first reconstruction rule (2701, 270a1) is defined by a first quantization step size defined by an exponentiation of a predetermined basis and a first exponent defined by the first exponent value, and

the second reconstruction rule (2702, 270a2) is defined by a second quantization step size defined by an exponentiation of the predetermined basis and a second exponent defined by a sum over the first and second exponent values.

36. Data stream (45) of claim 35, wherein the data stream (45) further indicates the predetermined basis.

37. Data stream (45) of any previous claim 31 to 34, wherein

the data stream (45) comprises, for indicating a first reconstruction rule (2701, 270a1) for dequantizing neural network parameters (32) relating to a first neural network portion, a first exponent value and, for indicating a second reconstruction rule (2702, 270a2) for dequantizing neural network parameters (32) relating to a second neural network portion, a second exponent value,

the first reconstruction rule (2701, 270a1) is defined by a first quantization step size defined by an exponentiation of a predetermined basis and a first exponent defined by a sum over the first exponent value and a predetermined exponent value, and

the second reconstruction rule (2702, 270a2) is defined by a second quantization step size defined by an exponentiation of the predetermined basis and a second exponent defined by a sum over the second exponent value and the predetermined exponent value.

38. Data stream (45) of claim 37, wherein the data stream (45) further indicates the predetermined basis.

39. Data stream (45) of claim 38, wherein the data stream (45) indicates the predetermined basis at a neural network scope.

40. Data stream (45) of any previous claim 37 to 39, wherein the data stream (45) further indicates the predetermined exponent value.

41. Data stream (45) of claim 40, wherein the data stream (45) indicates the predetermined exponent value at a neural network layer (210, 30) scope.

42. Data stream (45) of claim 40 or claim 41, wherein the data stream (45) further indicates the predetermined basis and the data stream (45) indicates the predetermined exponent value at a scope finer than a scope at which the predetermined basis is indicated by the data stream (45).

43. Data stream (45) of any of previous claims 35 to 42, wherein the data stream (45) has the predetermined basis encoded thereinto in a non-integer format and the first and second exponent values in integer format.

44. Data stream (45) of any of claims 34 to 43, wherein

the data stream (45) comprises, for indicating the first reconstruction rule (2701, 270a1), a first parameter set (264) defining a first quantization-index-to-reconstruction-level mapping (265), and for indicating the second reconstruction rule (2702, 270a2), a second parameter set (264) defining a second quantization-index-to-reconstruction-level mapping (265),

the first reconstruction rule (2701, 270a1) is defined by the first quantization-index-to-reconstruction-level mapping (265), and

the second reconstruction rule (2702, 270a2) is defined by an extension of the first quantization-index-to-reconstruction-level mapping (265) by the second quantization-index-to-reconstruction-level mapping (265) in a predetermined manner.

45. Data stream (45) of any of claims 34 to 44, wherein

the data stream (45) comprises, for indicating the first reconstruction rule (2701, 270a1), a first parameter set (264) defining a first quantization-index-to-reconstruction-level mapping (265), and for indicating the second reconstruction rule (2702, 270a2), a second parameter set (264) defining a second quantization-index-to-reconstruction-level mapping (265),

the first reconstruction rule (2701, 270a1) is defined by an extension of a predetermined quantization-index-to-reconstruction-level mapping (265) by the first quantization-index-to-reconstruction-level mapping (265) in a predetermined manner, and

the second reconstruction rule (2702, 270a2) is defined by an extension of the predetermined quantization-index-to-reconstruction-level mapping (265) by the second quantization-index-to-reconstruction-level mapping (265) in the predetermined manner.

46. Data stream (45) of claim 45, wherein the data stream (45) further indicates the predetermined quantization-index-to-reconstruction-level mapping (265).

47. Data stream (45) of claim 46, wherein the data stream (45) indicates the predetermined quantization-index-to-reconstruction-level mapping (265) at a neural network scope or at a neural network layer (210, 30) scope.

48. Data stream (45) of any of previous claims 44 to 47, wherein, according to the predetermined manner,

a mapping of each index value (32”), according to the quantization-index-to-reconstruction-level mapping to be extended, onto a first reconstruction level is superseded by, if present, a mapping of the respective index value (32”), according to the quantization-index-to-reconstruction-level mapping extending the quantization-index-to-reconstruction-level mapping to be extended, onto a second reconstruction level, and/or

for any index value (32”), for which according to the quantization-index-to-reconstruction-level mapping to be extended, no reconstruction level is defined onto which the respective index value (32”) should be mapped, and which is, according to the quantization-index-to-reconstruction-level mapping extending the quantization-index-to-reconstruction-level mapping to be extended, mapped onto a corresponding reconstruction level, the mapping from the respective index value (32”) onto the corresponding reconstruction level is adopted, and/or

for any index value (32”), for which according to the quantization-index-to-reconstruction-level mapping extending the quantization-index-to-reconstruction-level mapping to be extended, no reconstruction level is defined onto which the respective index value (32”) should be mapped, and which is, according to the quantization-index-to-reconstruction-level mapping to be extended, mapped onto a corresponding reconstruction level, the mapping from the respective index value (32”) onto the corresponding reconstruction level is adopted.
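As a non-normative illustration of the "predetermined manner" of claim 48: entries of the extending mapping supersede colliding entries of the mapping to be extended, and entries present in only one of the two mappings are adopted. The following sketch (function name and index/level values are assumptions, not part of the claims) expresses this as a dictionary merge:

```python
def extend_mapping(base, ext):
    """Extend a quantization-index-to-reconstruction-level mapping.

    Sketch only: entries of the extending mapping `ext` supersede
    entries of the base mapping; entries found in only one mapping
    are adopted unchanged.
    """
    merged = dict(base)   # adopt all mappings of the base
    merged.update(ext)    # extending entries supersede or are adopted
    return merged

base = {0: 0.0, 1: 0.5, 2: 1.0}   # hypothetical index -> level table
ext = {2: 0.9, 3: 2.0}            # hypothetical extending table
extend_mapping(base, ext)         # {0: 0.0, 1: 0.5, 2: 0.9, 3: 2.0}
```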

49. Data stream (45) of any previous claim 31 to 48, wherein

the data stream (45) comprises, for indicating the reconstruction rule (270) of a predetermined neural network portion,

a quantization step size parameter (262) indicating a quantization step size (263), and

a parameter set (264) defining a quantization-index-to-reconstruction-level mapping (265),

wherein the reconstruction rule (270) of the predetermined neural network portion is defined by

the quantization step size (263) for quantization indices (32”) within a predetermined index interval (268), and

the quantization-index-to-reconstruction-level mapping (265) for quantization indices (32”) outside the predetermined index interval (268).

50. Data stream (45) having neural network parameters (32) encoded thereinto, which represent a neural network,

wherein the neural network parameters (32) are encoded into the data stream (45) in a manner quantized (260) onto quantization indices (32”),

wherein the data stream (45) comprises, for indicating a reconstruction rule (270) for dequantizing (280) the neural network parameters (32),

a quantization step size parameter (262) indicating a quantization step size (263), and

a parameter set (264) defining a quantization-index-to-reconstruction-level mapping (265),

wherein the reconstruction rule (270) is defined by

the quantization step size (263) for quantization indices (32”) within a predetermined index interval (268), and

the quantization-index-to-reconstruction-level mapping (265) for quantization indices (32”) outside the predetermined index interval (268).

51. Data stream (45) of claim 49 or claim 50, wherein the predetermined index interval (268) includes zero.

52. Data stream (45) of claim 51, wherein the predetermined index interval (268) extends up to a predetermined magnitude threshold value and quantization indices (32”) exceeding the predetermined magnitude threshold value represent escape codes which signal that the quantization-index-to-reconstruction-level mapping (265) is to be used for dequantization (280).

53. Data stream (45) of any of previous claims 49 to 52, wherein the parameter set (264) defines the quantization-index-to-reconstruction-level mapping (265) by way of a list of reconstruction levels associated with quantization indices (32”) outside the predetermined index interval (268).
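The reconstruction rule of claims 49 to 53 can be illustrated by a minimal dequantizer sketch (function name, byte layout, and numeric values are assumptions): indices within the magnitude-threshold interval are scaled by the uniform quantization step size, while indices beyond it act as escape codes indexing a signalled list of reconstruction levels:

```python
def dequantize(index, step_size, escape_levels, threshold):
    """Sketch of the two-regime reconstruction rule: step size inside
    the predetermined index interval, explicit level list outside it."""
    if abs(index) <= threshold:
        # inside the predetermined index interval (268): uniform grid
        return index * step_size
    # escape code: excess magnitude selects an entry of the level list
    level = escape_levels[abs(index) - threshold - 1]
    return level if index > 0 else -level

dequantize(2, 0.25, [3.0, 5.0], 4)   # inside interval -> 0.5
dequantize(5, 0.25, [3.0, 5.0], 4)   # escape code -> 3.0
```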

54. Data stream (45) of any of previous claims 31 to 53, wherein the neural network portions comprise one or more sub-portions of a neural network layer (210, 30) of the neural network and/or one or more neural network layers of the neural network.

55. Data stream (45) of any of previous claims 31 to 54, wherein the data stream (45) is structured into individually accessible portions (200), each individually accessible portion having the neural network parameters (32) for a corresponding neural network portion encoded thereinto.

56. Data stream (45) of claim 55, wherein the individually accessible portions (200) are encoded using context-adaptive arithmetic coding (600) and using context initialization at a start of each individually accessible portion.
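The effect of the context initialization in claim 56 can be sketched with a toy adaptive probability model (this is not a CABAC implementation; class and function names are assumptions): re-initialising the model at the start of each individually accessible portion makes the portions decodable independently of one another:

```python
class AdaptiveBitModel:
    """Toy adaptive binary probability model (illustrative only)."""
    def __init__(self):
        self.ones = 1
        self.total = 2
    def prob_one(self):
        return self.ones / self.total
    def update(self, bit):
        self.ones += bit
        self.total += 1

def model_probabilities(portions):
    """Return the model probability used for each bit; a fresh model
    per portion mimics context initialization at each portion start."""
    probs = []
    for bits in portions:
        model = AdaptiveBitModel()   # context initialization
        for b in bits:
            probs.append(model.prob_one())
            model.update(b)
    return probs
```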

57. Data stream (45) of claim 55 or claim 56, wherein the data stream (45) comprises for each individually accessible portion

a start code (242) at which the respective individually accessible portion begins, and/or

a pointer (220, 244) pointing to a beginning of the respective individually accessible portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective individually accessible portion for skipping the respective individually accessible portion in parsing the data stream (45).
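The skipping mechanism of claim 57 can be illustrated with a length-prefixed layout (the 4-byte big-endian length field is an assumption for the sketch, not the claimed format): the data stream length lets a parser jump over a portion without decoding its payload, and the computed offsets play the role of pointers:

```python
import struct

def iter_portions(buf):
    """Sketch: iterate over length-prefixed individually accessible
    portions, yielding (offset, length) without decoding payloads."""
    pos = 0
    while pos < len(buf):
        (length,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        yield pos, length          # offset acts like a pointer (244)
        pos += length              # skip the portion using its length

stream = struct.pack(">I", 3) + b"abc" + struct.pack(">I", 2) + b"de"
list(iter_portions(stream))        # [(4, 3), (11, 2)]
```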

58. Data stream (45) of any previous claim 55 to 57, wherein the data stream (45) indicates, for each of the neural network portions, the reconstruction rule (270) for dequantizing (280) neural network parameters (32) relating to the respective neural network portion in

a main header portion (47) of the data stream (45) relating to the neural network as a whole,

a neural network layer (210, 30) related header portion (110) of the data stream (45) relating to the neural network layer (210) the respective neural network portion is part of, or

a neural network portion specific header portion of the data stream (45) relating to the respective neural network portion.

59. Data stream (45) of any previous claim 1 to 58, having a representation of a neural network (10) encoded thereinto, wherein the data stream (45) is structured into individually accessible portions (200), each portion representing a corresponding neural network portion of the neural network, wherein the data stream (45) comprises for each of one or more predetermined individually accessible portions (200) an identification parameter (310) for identifying the respective predetermined individually accessible portion.

60. Data stream (45) having a representation of a neural network (10) encoded thereinto, wherein the data stream (45) is structured into individually accessible portions (200), each portion representing a corresponding neural network portion of the neural network, wherein the data stream (45) comprises for each of one or more predetermined individually accessible portions (200) an identification parameter (310) for identifying the respective predetermined individually accessible portion.

61. Data stream (45) of claim 59 or claim 60, wherein the identification parameter (310) is related to the respective predetermined individually accessible portion via a hash function or error detection code or error correction code.

62. Data stream (45) of any of previous claims 59 to 61, further comprising a higher-level identification parameter (310) for identifying a collection of more than one predetermined individually accessible portion.

63. Data stream (45) of claim 62, wherein the higher-level identification parameter (310) is related to the identification parameters (310) of the more than one predetermined individually accessible portion via a hash function or error detection code or error correction code.
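Claims 61 to 63 relate identification parameters to portions, and a higher-level identifier to a collection of them, via a hash function. A minimal sketch (SHA-256 and the concatenation scheme are arbitrary choices for illustration, not mandated by the claims):

```python
import hashlib

def portion_id(payload: bytes) -> str:
    """Identification parameter (310) derived from a portion's payload
    via a hash function (sketch; SHA-256 chosen arbitrarily)."""
    return hashlib.sha256(payload).hexdigest()

def collection_id(portion_ids) -> str:
    """Higher-level identification parameter derived from the
    individual identification parameters of the collection."""
    return hashlib.sha256("".join(portion_ids).encode()).hexdigest()
```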

64. Data stream (45) of any of previous claims 59 to 63, wherein the individually accessible portions (200) are encoded using context-adaptive arithmetic coding (600) and using context initialization at a start of each individually accessible portion.

65. Data stream (45) of any of previous claims 59 to 64, wherein the data stream (45) comprises for each individually accessible portion

a start code (242) at which the respective individually accessible portion begins, and/or

a pointer (220, 244) pointing to a beginning of the respective individually accessible portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective individually accessible portion for skipping the respective individually accessible portion in parsing the data stream.

66. Data stream (45) of any of previous claims 59 to 65, wherein the neural network portions comprise one or more sub-portions of a neural network layer (210, 30) of the neural network and/or one or more neural network layers of the neural network.

67. Data stream (45) of any previous claim 1 to 66, having a representation of a neural network (10) encoded thereinto in a layered manner so that different versions (330) of the neural network are encoded into the data stream (45), wherein the data stream (45) is structured into one or more individually accessible portions (200), each portion relating to a corresponding version (330) of the neural network, wherein the data stream (45) has a first version (3302) of the neural network encoded into a first portion

delta-coded relative to a second version (3301) of the neural network encoded into a second portion, and/or

in form of one or more compensating neural network portions (332) each of which is to be, for performing an inference based on the first version (3302) of the neural network,

executed in addition to an execution of a corresponding neural network portion (334) of a second version (3301) of the neural network encoded into a second portion, and

wherein outputs of the respective compensating neural network portion (332) and corresponding neural network portion (334) are to be summed up.
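The second alternative of claim 67 (compensating neural network portions) can be sketched as follows; the layer functions are placeholders, and only the summation of the two outputs reflects the claimed behaviour:

```python
def compensated_inference(x, base_portion, comp_portion):
    """Sketch: for the first (higher) version, execute the compensating
    portion (332) in addition to the corresponding base portion (334)
    and sum their outputs element-wise."""
    base_out = base_portion(x)
    comp_out = comp_portion(x)
    return [a + b for a, b in zip(base_out, comp_out)]

base = lambda x: [2.0 * v for v in x]   # hypothetical base-version layer
comp = lambda x: [0.5 * v for v in x]   # hypothetical compensating layer
compensated_inference([1.0, -2.0], base, comp)   # [2.5, -5.0]
```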

68. Data stream (45) having a representation of a neural network (10) encoded thereinto in a layered manner so that different versions (330) of the neural network are encoded into the data stream (45), wherein the data stream (45) is structured into one or more individually accessible portions (200), each portion relating to a corresponding version of the neural network, wherein the data stream (45) has a first version (3302) of the neural network encoded into a first portion

delta-coded relative to a second version (3301) of the neural network encoded into a second portion, and/or

in form of one or more compensating neural network portions (332) each of which is to be, for performing an inference based on the first version (3302) of the neural network,

executed in addition to an execution of a corresponding neural network portion (334) of a second version (3301) of the neural network encoded into a second portion, and

wherein outputs of the respective compensating neural network portion (332) and corresponding neural network portion (334) are to be summed up.

69. Data stream (45) of claim 67 or claim 68, wherein the data stream (45) has the first version (3302) of the neural network encoded into a first portion delta-coded relative to the second version (3301) of the neural network encoded into the second portion in terms of

weight and/or bias differences, and/or

additional neurons (14, 18, 20) or neuron interconnections (22, 24).
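The delta coding of claim 69 in terms of weight and/or bias differences can be sketched as simple element-wise reconstruction (names and values are illustrative assumptions):

```python
def delta_decode(base_weights, weight_deltas):
    """Sketch: reconstruct the first version's weights by adding the
    signalled weight differences to the second (base) version."""
    return [b + d for b, d in zip(base_weights, weight_deltas)]

delta_decode([0.5, -1.0, 2.0], [0.25, 0.0, -0.5])   # [0.75, -1.0, 1.5]
```

Coding only the differences pays off when consecutive versions (e.g., a fine-tuned update of a base network) have many near-identical parameters.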

70. Data stream (45) of any previous claim 67 to 69, wherein the individually accessible portions (200) are encoded using context-adaptive arithmetic coding (600) and using context initialization at a start of each individually accessible portion.

71. Data stream (45) of any previous claim 67 to 70, wherein the data stream (45) comprises for each individually accessible portion

a start code (242) at which the respective individually accessible portion begins, and/or

a pointer (220, 244) pointing to a beginning of the respective individually accessible portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective individually accessible portion for skipping the respective individually accessible portion in parsing the data stream (45).

72. Data stream (45) of any previous claim 67 to 71, wherein the data stream (45) comprises for each of one or more predetermined individually accessible portions (200) an identification parameter (310) for identifying the respective predetermined individually accessible portion.

73. Data stream (45) of any previous claim 1 to 72, having a representation of a neural network (10) encoded thereinto, wherein the data stream (45) is structured into individually accessible portions (200), each portion representing a corresponding neural network portion of the neural network, wherein the data stream (45) comprises for each of one or more predetermined individually accessible portions (200) supplemental data (350) for supplementing the representation of the neural network.

74. Data stream (45) having a representation of a neural network (10) encoded thereinto, wherein the data stream (45) is structured into individually accessible portions (200), each portion representing a corresponding neural network portion of the neural network, wherein the data stream (45) comprises for each of one or more predetermined individually accessible portions (200) supplemental data (350) for supplementing the representation of the neural network.

75. Data stream (45) of claim 73 or claim 74, wherein the data stream (45) indicates the supplemental data (350) as being dispensable for inference based on the neural network.

76. Data stream (45) of any previous claim 73 to 75, wherein the data stream (45) has the supplemental data (350) for supplementing the representation of the neural network for the one or more predetermined individually accessible portions (200) coded into further individually accessible portions (200) so that the data stream (45) comprises for each of the one or more predetermined individually accessible portions (200) a corresponding further predetermined individually accessible portion relating to the neural network portion to which the respective predetermined individually accessible portion corresponds.

77. Data stream (45) of any previous claim 73 to 76, wherein the neural network portions comprise neural network layers (210, 30) of the neural network and/or layer portions into which a predetermined neural network layer of the neural network is subdivided.

78. Data stream (45) of any previous claim 73 to 77, wherein the individually accessible portions (200) are encoded using context-adaptive arithmetic coding (600) and using context initialization at a start of each individually accessible portion.

79. Data stream (45) of any previous claim 73 to 78, wherein the data stream (45) comprises for each individually accessible portion

a start code (242) at which the respective individually accessible portion begins, and/or

a pointer (220, 244) pointing to a beginning of the respective individually accessible portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective individually accessible portion for skipping the respective individually accessible portion in parsing the data stream (45).

80. Data stream (45) of any previous claim 73 to 79, wherein the supplemental data (350) relates to

relevance scores of neural network parameters (32), and/or

perturbation robustness of neural network parameters (32).

81. Data stream (45) of any previous claim 1 to 80, having a representation of a neural network (10) encoded thereinto, wherein the data stream (45) comprises hierarchical control data (400) structured into a sequence (410) of control data portions (420), wherein the control data portions (420) provide information on the neural network at increasing levels of detail along the sequence of control data portions (420).

82. Data stream (45) having a representation of a neural network (10) encoded thereinto, wherein the data stream (45) comprises hierarchical control data (400) structured into a sequence (410) of control data portions (420), wherein the control data portions (420) provide information on the neural network at increasing levels of detail along the sequence of control data portions (420).

83. Data stream (45) of claim 81 or claim 82, wherein at least some of the control data portions (420) provide information on the neural network which is partially redundant.

84. Data stream (45) of any previous claim 81 to 83, wherein a first control data portion provides the information on the neural network by way of indicating a default neural network type implying default settings and a second control data portion comprises a parameter to indicate each of the default settings.
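The default-plus-override scheme of claim 84 can be sketched with nested dictionaries (the network-type name, the setting keys, and the portion format are hypothetical): an early control data portion names a default network type implying default settings, and a later portion may state each setting explicitly, overriding the defaults:

```python
# Hypothetical table of default settings implied by a network type
DEFAULTS = {"examplenet": {"layers": 8, "activation": "relu"}}

def resolve_settings(control_portions):
    """Sketch: walk the sequence of control data portions; defaults
    implied by a named type are applied first, explicit settings in
    later portions supersede them."""
    settings = {}
    for portion in control_portions:
        if "default_type" in portion:
            settings.update(DEFAULTS[portion["default_type"]])
        settings.update(portion.get("settings", {}))
    return settings
```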

85. Apparatus for encoding a representation of a neural network (10) into a data stream (45), wherein the apparatus is configured to provide the data stream (45) with a serialization parameter (102) indicating a coding order (104) at which neural network parameters (32), which define neuron interconnections (22, 24) of the neural network, are encoded into the data stream (45).

86. Apparatus of claim 85, wherein the apparatus is configured to encode, into the data stream (45), the neural network parameters (32) using context-adaptive arithmetic encoding.

87. Apparatus of claim 85 or claim 86, wherein the apparatus is configured to structure the data stream (45) into one or more individually accessible portions (200), each individually accessible portion representing a corresponding neural network layer (210, 30) of the neural network, and

encode, into the data stream (45), neural network parameters, which define neuron interconnections (22, 24) of the neural network within a predetermined neural network layer, according to the coding order (104) to be indicated by the serialization parameter (102).

88. Apparatus of any previous claim 85 to 87, wherein the serialization parameter (102) is an n-ary parameter which indicates the coding order (104) out of a set (108) of n coding orders (104).

89. Apparatus of claim 88, wherein the set (108) of n coding orders (104) comprises

first predetermined coding orders (1061) which differ in an order at which the predetermined coding orders traverse dimensions (34) of a tensor (30) describing a predetermined neural network layer (210, 30) of the neural network; and/or

second predetermined coding orders (1062) which differ in a number (107) of times at which the predetermined coding orders traverse a predetermined neural network layer of the neural network for the sake of scalable coding of the neural network; and/or

third predetermined coding orders (1063) which differ in an order at which the predetermined coding orders traverse neural network layers of the neural network; and/or

fourth predetermined coding orders (1064) which differ in an order at which neurons (14, 18, 20) of a neural network layer (210, 30) of the neural network are traversed.
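As a non-normative illustration of claims 88 and 89, an n-ary serialization parameter can select which dimension of a weight tensor the coding order traverses fastest (here a 2-D tensor and two orders; the parameter values are assumptions):

```python
def serialize(tensor, order):
    """Sketch: serialize a 2-D weight tensor in the coding order
    selected by the serialization parameter `order`."""
    rows, cols = len(tensor), len(tensor[0])
    if order == 0:   # traverse the column dimension fastest (row-major)
        return [tensor[r][c] for r in range(rows) for c in range(cols)]
    else:            # traverse the row dimension fastest (column-major)
        return [tensor[r][c] for c in range(cols) for r in range(rows)]

w = [[1, 2], [3, 4]]
serialize(w, 0)   # [1, 2, 3, 4]
serialize(w, 1)   # [1, 3, 2, 4]
```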

90. Apparatus of any previous claim 85 to 89, wherein the serialization parameter (102) is indicative of a permutation using which the coding order (104) permutes neurons (14, 18, 20) of a neural network layer (210, 30) relative to a default order.

91. Apparatus of claim 90, wherein the permutation orders the neurons (14, 18, 20) of the neural network layer (210, 30) in a manner so that the neural network parameters (32) monotonically increase along the coding order (104) or monotonically decrease along the coding order (104).

92. Apparatus of claim 90, wherein the permutation orders the neurons (14, 18, 20) of the neural network layer (210, 30) in a manner so that, among predetermined coding orders signalable by the serialization parameter (102), a bitrate for coding the neural network parameters (32) into the data stream (45) is lowest for the permutation indicated by the serialization parameter (102).
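The monotonic permutation of claim 91 amounts to an argsort of the parameters; a sketch (the decoder would invert the permutation signalled by the serialization parameter):

```python
def monotonic_permutation(params):
    """Sketch: permutation reordering neurons so that their parameters
    increase monotonically along the coding order."""
    return sorted(range(len(params)), key=lambda i: params[i])

p = monotonic_permutation([0.7, -0.2, 0.1])   # [1, 2, 0]
[[0.7, -0.2, 0.1][i] for i in p]              # [-0.2, 0.1, 0.7]
```

A monotone sequence of parameters tends to have small, same-signed differences, which is one reason such a permutation can lower the coded bitrate (claim 92).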

93. Apparatus of any previous claim 85 to 92, wherein the neural network parameters (32) comprise weights and biases.

94. Apparatus of any previous claim 85 to 93, wherein the apparatus is configured to structure the data stream into individually accessible sub-portions (43, 44, 240), each sub-portion (43, 44, 240) representing a corresponding neural network portion of the neural network, so that each sub-portion (43, 44, 240) is completely traversed by the coding order (104) before a subsequent sub-portion is traversed by the coding order (104).

95. Apparatus of any of claims 87 to 94, wherein the neural network parameters (32) are encoded into the data stream using context-adaptive arithmetic encoding and using context initialization at a start of any individually accessible portion (200) or sub-portion (43, 44, 240).

96. Apparatus of any of claims 87 to 95, wherein the apparatus is configured to encode, into the data stream, start codes (242) at which each individually accessible portion (200) or sub-portion (43, 44, 240) begins, and/or pointers (220, 244) pointing to beginnings of each individually accessible portion or sub-portion, and/or data stream length parameters indicating data stream lengths (246) of each individually accessible portion or sub-portion for skipping the respective individually accessible portion or sub-portion in parsing the data stream.

97. Apparatus of any of the previous claims 85 to 96, wherein the apparatus is configured to encode, into the data stream, a numerical computation representation parameter (120) indicating a numerical representation and bit size at which the neural network parameters (32) are to be represented when using the neural network (10) for inference.

98. Apparatus for encoding a representation of a neural network (10) into a data stream (45), wherein the apparatus is configured to provide the data stream (45) with a numerical computation representation parameter (120) indicating a numerical representation and bit size at which neural network parameters (32) of the neural network, which are encoded into the data stream (45), are to be represented when using the neural network (10) for inference.

99. Apparatus of any of the previous claims 85 to 98, wherein the apparatus is configured to structure the data stream (45) into individually accessible sub-portions (43, 44, 240), each individually accessible sub-portion representing a corresponding neural network portion of the neural network, so that each individually accessible sub-portion is completely traversed by the coding order (104) before a subsequent individually accessible sub-portion is traversed by the coding order (104), wherein the apparatus is configured to encode, into the data stream (45), for a predetermined individually accessible sub-portion the neural network parameter and a type parameter indicating a parameter type of the neural network parameter encoded into the predetermined individually accessible sub-portion.

100. Apparatus of claim 99, wherein the type parameter discriminates, at least, between neural network weights and neural network biases.

101. Apparatus of any of the previous claims 85 to 100, wherein the apparatus is configured to

structure the data stream (45) into one or more individually accessible portions (200), each individually accessible portion representing a corresponding neural network layer (210, 30) of the neural network, and

encode, into the data stream (45), for a predetermined neural network layer, a neural network layer type parameter (130) indicating a neural network layer type of the predetermined neural network layer of the neural network.

102. Apparatus for encoding a representation of a neural network (10) into a data stream (45), so that the data stream (45) is structured into one or more individually accessible portions (200), each individually accessible portion representing a corresponding neural network layer (210, 30) of the neural network, wherein the apparatus is configured to provide the data stream (45) with, for a predetermined neural network layer, a neural network layer type parameter (130) indicating a neural network layer type of the predetermined neural network layer of the neural network.

103. Apparatus of any of claims 101 and 102, wherein the neural network layer type parameter (130) discriminates, at least, between a fully-connected and a convolutional layer type.

104. Apparatus of any of the previous claims 85 to 103, wherein the apparatus is configured to

structure the data stream (45) into individually accessible portions (200), each individually accessible portion representing a corresponding neural network portion of the neural network, and

encode, into the data stream (45), for each of one or more predetermined individually accessible portions, a pointer (220, 244) pointing to a beginning of each individually accessible portion.

105. Apparatus for encoding a representation of a neural network (10) into a data stream (45), so that the data stream (45) is structured into one or more individually accessible portions (200), each portion representing a corresponding neural network layer (210, 30) of the neural network, wherein the apparatus is configured to provide the data stream (45) with, for each of one or more predetermined individually accessible portions, a pointer (220, 244) pointing to a beginning of the respective predetermined individually accessible portion.

106. Apparatus of any of previous claims 104 and 105, wherein each individually accessible portion represents

a corresponding neural network layer (210) of the neural network or

a neural network portion (43, 44, 240) of a neural network layer (210) of the neural network.

107. Apparatus of any of claims 85 to 106, wherein the apparatus is configured to encode a representation of a neural network (10) into the data stream (45), so that the data stream (45) is structured into one or more individually accessible portions (200), each individually accessible portion representing a corresponding neural network layer (210, 30) of the neural network, and so that the data stream (45) is, within a predetermined portion, further structured into individually accessible sub-portions (43, 44, 240), each sub-portion (43, 44, 240) representing a corresponding neural network portion of the respective neural network layer of the neural network, wherein the apparatus is configured to provide the data stream (45) with, for each of one or more predetermined individually accessible sub-portions (43, 44, 240)

a start code (242) at which the respective predetermined individually accessible sub-portion begins, and/or

a pointer (244) pointing to a beginning of the respective predetermined individually accessible sub-portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective predetermined individually accessible sub-portion for skipping the respective predetermined individually accessible sub-portion in parsing the data stream.

108. Apparatus of claim 107, wherein the apparatus is configured to encode, into the data stream (45), the representation of the neural network using context-adaptive arithmetic encoding and using context initialization at a start of each individually accessible portion and each individually accessible sub-portion.

109. Apparatus for encoding a representation of a neural network (10) into a data stream (45), so that the data stream (45) is structured into one or more individually accessible portions (200), each individually accessible portion representing a corresponding neural network layer (210, 30) of the neural network, and so that the data stream (45) is, within a predetermined portion, further structured into individually accessible sub-portions (43, 44, 240), each sub-portion (43, 44, 240) representing a corresponding neural network portion of the respective neural network layer of the neural network, wherein the apparatus is configured to provide the data stream (45) with, for each of one or more predetermined individually accessible sub-portions (43, 44, 240)

a start code (242) at which the respective predetermined individually accessible sub-portion begins, and/or

a pointer (244) pointing to a beginning of the respective predetermined individually accessible sub-portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective predetermined individually accessible sub-portion for skipping the respective predetermined individually accessible sub-portion in parsing the data stream (45).

110. Apparatus of claim 109, wherein the apparatus is configured to encode, into the data stream (45), the representation of the neural network using context-adaptive arithmetic encoding and using context initialization at a start of each individually accessible portion and each individually accessible sub-portion.

111. Apparatus of any previous claim 85 to 110, wherein the apparatus is configured to encode a representation of a neural network (10) into a data stream, so that the data stream (45) is structured into individually accessible portions (200), each individually accessible portion representing a corresponding neural network portion of the neural network, wherein the apparatus is configured to provide the data stream (45) with, for each of one or more predetermined individually accessible portions, a processing option parameter (250) indicating one or more processing options (252) which have to be used or which may optionally be used when using the neural network (10) for inference.

112. Apparatus of claim 111, wherein the processing option parameter (250) indicates the one or more available processing options (252) out of a set of predetermined processing options (252) including

parallel processing capability of the respective predetermined individually accessible portion; and/or

sample wise parallel processing capability (2522) of the respective predetermined individually accessible portion; and/or

channel wise parallel processing capability (2521) of the respective predetermined individually accessible portion; and/or

classification category wise parallel processing capability of the respective predetermined individually accessible portion; and/or

dependency of the neural network portion represented by the respective predetermined individually accessible portion on a computation result gained from another individually accessible portion of the data stream (45) relating to the same neural network portion but belonging to another version of versions (330) of the neural network which are encoded into the data stream (45) in a layered manner.

113. Apparatus for encoding a representation of a neural network (10) into a data stream (45), so that the data stream (45) is structured into individually accessible portions (200), each individually accessible portion representing a corresponding neural network portion of the neural network, wherein the apparatus is configured to provide the data stream (45) with, for each of one or more predetermined individually accessible portions, a processing option parameter (250) indicating one or more processing options (252) which have to be used or which may optionally be used when using the neural network (10) for inference.

114. Apparatus of claim 113, wherein the processing option parameter (250) indicates the one or more available processing options (252) out of a set of predetermined processing options (252) including

parallel processing capability of the respective predetermined individually accessible portion; and/or

sample wise parallel processing capability (2522) of the respective predetermined individually accessible portion; and/or

channel wise parallel processing capability (2521) of the respective predetermined individually accessible portion; and/or

classification category wise parallel processing capability of the respective predetermined individually accessible portion; and/or

dependency of the neural network portion represented by the respective predetermined individually accessible portion on a computation result gained from another individually accessible portion of the data stream (45) relating to the same neural network portion but belonging to another version of versions (330) of the neural network which are encoded into the data stream (45) in a layered manner.

115. Apparatus of one of claims 85 to 114, wherein the apparatus is configured to encode neural network parameters (32), which represent a neural network, into a data stream (45), so that the neural network parameters (32) are encoded into the data stream (45) in a manner quantized (260) onto quantization indices (32”), and the neural network parameters (32) are encoded into the data stream (45) so that neural network parameters (32) in different neural network portions of the neural network are quantized (260) differently, wherein the apparatus is configured to provide the data stream (45) indicating, for each of the neural network portions, a reconstruction rule (270) for dequantizing (280) neural network parameters (32) relating to the respective neural network portion.

116. Apparatus for encoding neural network parameters (32), which represent a neural network, into a data stream (45), so that the neural network parameters (32) are encoded into the data stream (45) in a manner quantized (260) onto quantization indices (32”), and the neural network parameters (32) are encoded into the data stream (45) so that neural network parameters (32) in different neural network portions of the neural network are quantized (260) differently, wherein the apparatus is configured to provide the data stream (45) indicating, for each of the neural network portions, a reconstruction rule (270) for dequantizing (280) neural network parameters (32) relating to the respective neural network portion.

117. Apparatus of claim 115 or claim 116, wherein the neural network portions comprise neural network layers (210, 30) of the neural network and/or layer portions into which a predetermined neural network layer of the neural network is subdivided.

118. Apparatus of any previous claim 115 to 117, wherein the apparatus is configured to encode, into the data stream (45), a first reconstruction rule (2701, 270a1) for dequantizing (280) neural network parameters (32) relating to a first neural network portion, in a manner delta-encoded relative to a second reconstruction rule (2702, 270a2) for dequantizing (280) neural network parameters (32) relating to a second neural network portion.

119. Apparatus of claim 118, wherein

the apparatus is configured to encode, into the data stream (45), for indicating the first reconstruction rule (2701, 270a1), a first exponent value and, for indicating the second reconstruction rule (2702, 270a2), a second exponent value,

the first reconstruction rule (2701, 270a1) is defined by a first quantization step size (263) defined by an exponentiation of a predetermined basis and a first exponent defined by the first exponent value, and

the second reconstruction rule (2702, 270a2) is defined by a second quantization step size (263) defined by an exponentiation of the predetermined basis and a second exponent defined by a sum over the first and second exponent values.

120. Apparatus of claim 119, wherein the data stream further indicates the predetermined basis.
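Purely for illustration (not part of the claims), the step-size derivation described in claims 119 and 120 can be sketched in a few lines: the first step size is an exponentiation of the predetermined basis with the first exponent value, and the second exponent is delta-coded, i.e. reconstructed as the sum of both signalled exponent values. All names are hypothetical.

```python
def step_sizes(base: float, first_exp: int, second_exp: int):
    """Return the two quantization step sizes (263) of claims 119/120.

    `base`, `first_exp` and `second_exp` stand for the predetermined
    basis and the two signalled exponent values; the names are
    illustrative, not taken from the claims.
    """
    first_step = base ** first_exp                  # first reconstruction rule
    second_step = base ** (first_exp + second_exp)  # delta-coded second rule
    return first_step, second_step

first, second = step_sizes(2.0, -3, 1)
# first  = 2**-3 = 0.125
# second = 2**-2 = 0.25
```

Delta-coding the second exponent against the first keeps the signalled integer values small when neighbouring neural network portions use similar step sizes.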

121. Apparatus of any previous claim 115 to 118, wherein

the apparatus is configured to encode, into the data stream, for indicating a first reconstruction rule (2701, 270a1) for dequantizing (280) neural network parameters (32) relating to a first neural network portion, a first exponent value and, for indicating a second reconstruction rule (2702, 270a2) for dequantizing (280) neural network parameters (32) relating to a second neural network portion, a second exponent value,

the first reconstruction rule (2701, 270a1) is defined by a first quantization step size (263) defined by an exponentiation of a predetermined basis and a first exponent defined by a sum over the first exponent value and a predetermined exponent value, and

the second reconstruction rule (2702, 270a2) is defined by a second quantization step size (263) defined by an exponentiation of the predetermined basis and a second exponent defined by a sum over the second exponent value and the predetermined exponent value.

122. Apparatus of claim 121, wherein the data stream further indicates the predetermined basis.

123. Apparatus of claim 122, wherein the data stream indicates the predetermined basis at a neural network scope.

124. Apparatus of any previous claim 121 to 123, wherein the data stream further indicates the predetermined exponent value.

125. Apparatus of claim 124, wherein the data stream indicates the predetermined exponent value at a neural network layer (210, 30) scope.

126. Apparatus of claim 124 or claim 125, wherein the data stream further indicates the predetermined basis and the data stream indicates the predetermined exponent value at a scope finer than a scope at which the predetermined basis is indicated by the data stream.

127. Apparatus of any of previous claims 119 to 126, wherein the apparatus is configured to encode, into the data stream, the predetermined basis in a non-integer format and the first and second exponent values in integer format.

128. Apparatus of any of claims 118 to 127, wherein

the apparatus is configured to encode, into the data stream, for indicating the first reconstruction rule (2701, 270a1), a first parameter set (264) defining a first quantization-index-to-reconstruction-level mapping (265), and for indicating the second reconstruction rule (2702, 270a2), a second parameter set (264) defining a second quantization-index-to-reconstruction-level mapping (265),

the first reconstruction rule (2701, 270a1) is defined by the first quantization-index-to-reconstruction-level mapping (265), and

the second reconstruction rule (2702, 270a2) is defined by an extension of the first quantization-index-to-reconstruction-level mapping (265) by the second quantization-index-to-reconstruction-level mapping (265) in a predetermined manner.

129. Apparatus of any of claims 118 to 128, wherein

the apparatus is configured to encode, into the data stream, for indicating the first reconstruction rule (2701, 270a1), a first parameter set (264) defining a first quantization-index-to-reconstruction-level mapping (265), and for indicating the second reconstruction rule (2702, 270a2), a second parameter set (264) defining a second quantization-index-to-reconstruction-level mapping (265),

the first reconstruction rule (2701, 270a1) is defined by an extension of a predetermined quantization-index-to-reconstruction-level mapping (265) by the first quantization-index-to-reconstruction-level mapping (265) in a predetermined manner, and

the second reconstruction rule (2702, 270a2) is defined by an extension of the predetermined quantization-index-to-reconstruction-level mapping (265) by the second quantization-index-to-reconstruction-level mapping (265) in the predetermined manner.

130. Apparatus of claim 129, wherein the data stream further indicates the predetermined quantization-index-to-reconstruction-level mapping (265).

131. Apparatus of claim 130, wherein the data stream indicates the predetermined quantization-index-to-reconstruction-level mapping (265) at a neural network scope or at a neural network layer (210, 30) scope.

132. Apparatus of any of previous claims 128 to 131, wherein, according to the predetermined manner,

a mapping of each index value (32”), according to the quantization-index-to-reconstruction-level mapping to be extended, onto a first reconstruction level is superseded by, if present, a mapping of the respective index value (32”), according to the quantization-index-to-reconstruction-level mapping extending the quantization-index-to-reconstruction-level mapping to be extended, onto a second reconstruction level, and/or

for any index value (32”), for which according to the quantization-index-to-reconstruction-level mapping to be extended, no reconstruction level is defined onto which the respective index value (32”) should be mapped, and which is, according to the quantization-index-to-reconstruction-level mapping extending the quantization-index-to-reconstruction-level mapping to be extended, mapped onto a corresponding reconstruction level, the mapping from the respective index value (32”) onto the corresponding reconstruction level is adopted, and/or

for any index value (32”), for which according to the quantization-index-to-reconstruction-level mapping extending the quantization-index-to-reconstruction-level mapping to be extended, no reconstruction level is defined onto which the respective index value (32”) should be mapped, and which is, according to the quantization-index-to-reconstruction-level mapping to be extended, mapped onto a corresponding reconstruction level, the mapping from the respective index value (32”) onto the corresponding reconstruction level is adopted.
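Purely for illustration (not part of the claims), the "predetermined manner" of extending one quantization-index-to-reconstruction-level mapping by another, as set out in claim 132, amounts to a dictionary merge: colliding entries of the extending mapping supersede those of the mapping to be extended, and entries present in only one mapping are adopted unchanged. All names are hypothetical.

```python
def extend_mapping(base_map: dict, extension: dict) -> dict:
    """Extend a quantization-index-to-reconstruction-level mapping (265).

    Entries of `extension` supersede colliding entries of `base_map`
    (the mapping to be extended); entries defined in only one of the two
    mappings are adopted unchanged, per claim 132.
    """
    merged = dict(base_map)   # adopt all entries of the mapping to be extended
    merged.update(extension)  # superseding / newly adopted entries
    return merged

base = {-1: -0.5, 0: 0.0, 1: 0.5}
ext = {1: 0.75, 2: 1.5}
extend_mapping(base, ext)
# {-1: -0.5, 0: 0.0, 1: 0.75, 2: 1.5}
```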

133. Apparatus of any previous claim 115 to 132, wherein

the apparatus is configured to encode, into the data stream, for indicating the reconstruction rule (270) of a predetermined neural network portion,

a quantization step size parameter (262) indicating a quantization step size (263), and

a parameter set (264) defining a quantization-index-to-reconstruction-level mapping (265),

wherein the reconstruction rule (270) of the predetermined neural network portion is defined by

the quantization step size (263) for quantization indices (32”) within a predetermined index interval (268), and

the quantization-index-to-reconstruction-level mapping (265) for quantization indices (32”) outside the predetermined index interval (268).

134. Apparatus for encoding neural network parameters (32), which represent a neural network, into a data stream (45), so that the neural network parameters (32) are encoded into the data stream (45) in a manner quantized (260) onto quantization indices (32”), wherein the apparatus is configured to provide the data stream (45) with, for indicating a reconstruction rule (270) for dequantizing (280) the neural network parameters (32),

a quantization step size parameter (262) indicating a quantization step size (263), and

a parameter set (264) defining a quantization-index-to-reconstruction-level mapping (265),

wherein the reconstruction rule (270) is defined by

the quantization step size (263) for quantization indices (32”) within a predetermined index interval (268), and

the quantization-index-to-reconstruction-level mapping (265) for quantization indices (32”) outside the predetermined index interval (268).

135. Apparatus of claim 133 or claim 134, wherein the predetermined index interval (268) includes zero.

136. Apparatus of claim 135, wherein the predetermined index interval (268) extends up to a predetermined magnitude threshold value and quantization indices (32”) exceeding the predetermined magnitude threshold value represent escape codes which signal that the quantization-index-to-reconstruction-level mapping (265) is to be used for dequantization (280).

137. Apparatus of any of previous claims 133 to 136, wherein the parameter set (264) defines the quantization-index-to-reconstruction-level mapping (265) by way of a list of reconstruction levels associated with quantization indices (32”) outside the predetermined index interval (268).
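Purely for illustration (not part of the claims), the hybrid dequantization of claims 133 to 137 can be sketched as follows: indices inside the predetermined index interval (which includes zero and extends up to a magnitude threshold) are reconstructed uniformly with the quantization step size, while larger-magnitude indices act as escape codes and are looked up in the signalled quantization-index-to-reconstruction-level mapping. All names are hypothetical.

```python
def dequantize(index: int, step: float, threshold: int, codebook: dict) -> float:
    """Dequantize one quantization index (32'') per claims 133-137.

    |index| <= threshold lies in the predetermined index interval (268)
    and is reconstructed uniformly with the quantization step size (263);
    larger magnitudes are escape codes resolved via the
    quantization-index-to-reconstruction-level mapping (265).
    """
    if abs(index) <= threshold:
        return index * step  # uniform reconstruction inside the interval
    return codebook[index]   # escape code: use the signalled mapping

codebook = {3: 1.9, -3: -2.1}  # levels for out-of-interval indices
dequantize(2, 0.5, 2, codebook)  # -> 1.0
dequantize(3, 0.5, 2, codebook)  # -> 1.9
```

The escape-code split lets frequent small parameters use a cheap uniform rule while rare outliers receive explicitly signalled reconstruction levels.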

138. Apparatus of any of previous claims 115 to 137, wherein the neural network portions comprise one or more sub-portions of a neural network layer (210, 30) of the neural network and/or one or more neural network layers of the neural network.

139. Apparatus of any of previous claims 115 to 138, wherein the apparatus is configured to structure the data stream (45) into individually accessible portions (200), and encode into each individually accessible portion the neural network parameters (32) for a corresponding neural network portion.

140. Apparatus of claim 139, wherein the apparatus is configured to encode, into the data stream, the individually accessible portions (200) using context-adaptive arithmetic encoding and using context initialization at a start of each individually accessible portion.

141. Apparatus of claim 139 or claim 140, wherein the apparatus is configured to encode, into the data stream, for each individually accessible portion

a start code (242) at which the respective individually accessible portion begins, and/or

a pointer (220, 244) pointing to a beginning of the respective individually accessible portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective individually accessible portion for skipping the respective individually accessible portion in parsing the data stream.
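Purely for illustration (not part of the claims), skipping an individually accessible portion by means of its data stream length parameter, as in claim 141, can be sketched with a hypothetical framing in which each portion starts with a 4-byte big-endian length field followed by the portion payload; the framing is an assumption for this sketch, not taken from the claims.

```python
import io

def skip_portion(stream):
    """Skip one individually accessible portion (200) while parsing.

    Assumed (hypothetical) framing: a 4-byte big-endian data stream
    length parameter (246), then the portion payload. The parser reads
    the length and seeks past the payload without decoding it.
    """
    length = int.from_bytes(stream.read(4), "big")  # length parameter (246)
    stream.seek(length, io.SEEK_CUR)                # skip the payload
    return stream.read(4)                           # next portion's length field

buf = io.BytesIO(len(b"abc").to_bytes(4, "big") + b"abc" + (2).to_bytes(4, "big"))
skip_portion(buf)  # -> b'\x00\x00\x00\x02'
```

A start code or pointer achieves the same random access from the other direction: instead of skipping forward, the parser jumps straight to a portion's beginning.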

142. Apparatus of any previous claim 139 to 141, wherein the apparatus is configured to encode, into the data stream, for each of the neural network portions, an indication of the reconstruction rule (270) for dequantizing (280) neural network parameters (32) relating to the respective neural network portion in

a main header portion (47) of the data stream relating to the neural network as a whole,

a neural network layer (210, 30) related header portion (110) of the data stream relating to the neural network layer the respective neural network portion is part of, or

a neural network portion specific header portion of the data stream relating to the respective neural network portion.

143. Apparatus of any previous claim 85 to 142, wherein the apparatus is configured to encode a representation of a neural network (10) into a data stream (45), so that the data stream (45) is structured into individually accessible portions (200), each portion representing a corresponding neural network portion of the neural network, wherein the apparatus is configured to provide the data stream (45) with, for each of one or more predetermined individually accessible portions, an identification parameter (310) for identifying the respective predetermined individually accessible portion.

144. Apparatus for encoding a representation of a neural network (10) into a data stream (45), so that the data stream (45) is structured into individually accessible portions (200), each portion representing a corresponding neural network portion of the neural network, wherein the apparatus is configured to provide the data stream (45) with, for each of one or more predetermined individually accessible portions, an identification parameter (310) for identifying the respective predetermined individually accessible portion.

145. Apparatus of claim 143 or claim 144, wherein the identification parameter (310) is related to the respective predetermined individually accessible portion via a hash function or error detection code or error correction code.

146. Apparatus of any of previous claims 143 to 145, wherein the apparatus is configured to encode, into the data stream (45), a higher-level identification parameter (310) for identifying a collection of more than one predetermined individually accessible portion.

147. Apparatus of claim 146, wherein the higher-level identification parameter (310) is related to the identification parameters (310) of the more than one predetermined individually accessible portion via a hash function or error detection code or error correction code.
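Purely for illustration (not part of the claims), the identification parameters of claims 145 to 147 can be sketched with a concrete hash function. SHA-256 is an illustrative choice only; the claims merely require some hash function, error detection code or error correction code. All names are hypothetical.

```python
import hashlib

def portion_id(portion_payload: bytes) -> str:
    """Identification parameter (310) of one individually accessible
    portion, here derived via a hash function (claim 145)."""
    return hashlib.sha256(portion_payload).hexdigest()

def collection_id(portion_ids) -> str:
    """Higher-level identification parameter (310) for a collection of
    portions (claims 146/147): derived from the portions' individual
    identification parameters, here by hashing their concatenation."""
    return hashlib.sha256("".join(portion_ids).encode()).hexdigest()

ids = [portion_id(b"layer-0"), portion_id(b"layer-1")]
top = collection_id(ids)  # identifies the whole collection at once
```

Such a two-level scheme lets a decoder verify a whole model with one comparison and, on mismatch, drill down to the individual portion that changed.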

148. Apparatus of any of previous claims 143 to 147, wherein the apparatus is configured to encode, into the data stream, the individually accessible portions (200) using context-adaptive arithmetic encoding and using context initialization at a start of each individually accessible portion.

149. Apparatus of any of previous claims 143 to 148, wherein the apparatus is configured to encode, into the data stream, for each individually accessible portion

a start code (242) at which the respective individually accessible portion begins, and/or

a pointer (220, 244) pointing to a beginning of the respective individually accessible portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective individually accessible portion for skipping the respective individually accessible portion in parsing the data stream.

150. Apparatus of any of previous claims 143 to 149, wherein the neural network portions comprise one or more sub-portions of a neural network layer (210, 30) of the neural network and/or one or more neural network layers (210, 30) of the neural network.

151. Apparatus of any previous claim 85 to 150, wherein the apparatus is configured to encode a representation of a neural network (10) into a data stream (45) in a layered manner so that different versions (330) of the neural network are encoded into the data stream (45), and so that the data stream (45) is structured into one or more individually accessible portions (200), each portion relating to a corresponding version of the neural network, wherein the apparatus is configured to encode a first version (3302) of the neural network into a first portion

delta-coded relative to a second version (3301) of the neural network encoded into a second portion, and/or

in form of one or more compensating neural network portions (332) each of which is to be, for performing an inference based on the first version (3302) of the neural network,

executed in addition to an execution of a corresponding neural network portion (334) of a second version (3301) of the neural network encoded into a second portion, and

wherein outputs of the respective compensating neural network portion (332) and corresponding neural network portion (334) are to be summed up.

152. Apparatus for encoding a representation of a neural network (10) into a data stream (45) in a layered manner so that different versions (330) of the neural network are encoded into the data stream (45), and so that the data stream (45) is structured into one or more individually accessible portions (200), each portion relating to a corresponding version of the neural network, wherein the apparatus is configured to encode a first version (3302) of the neural network into a first portion

delta-coded relative to a second version (3301) of the neural network encoded into a second portion, and/or

in form of one or more compensating neural network portions (332) each of which is to be, for performing an inference based on the first version (3302) of the neural network,

executed in addition to an execution of a corresponding neural network portion (334) of a second version (3301) of the neural network encoded into a second portion, and

wherein outputs of the respective compensating neural network portion (332) and corresponding neural network portion (334) are to be summed up.

153. Apparatus of claim 151 or claim 152,

wherein the apparatus is configured to encode, into a second portion of the data stream, the second version (3301) of the neural network; and

wherein the apparatus is configured to encode, into a first portion of the data stream, the first version (3302) of the neural network delta-coded relative to the second version (3301) of the neural network encoded into the second portion in terms of

weight and/or bias differences, and/or

additional neurons (14, 18, 20) or neuron interconnections (22, 24).
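Purely for illustration (not part of the claims), the compensating-portion mechanism of claims 151 to 153 can be sketched with toy linear layers: the corresponding portion of the base version and the compensating portion of the enhancement version are both executed on the same input, and their outputs are summed. The linear layers are illustrative stand-ins for real neural network portions; all names are hypothetical.

```python
def linear(x, w):
    """Toy fully-connected computation: dot product of input and weights."""
    return sum(xi * wi for xi, wi in zip(x, w))

def enhanced_inference(x, base_w, comp_w):
    """Run the corresponding neural network portion (334) of the base
    version and the compensating portion (332) of the enhanced version
    on the same input and sum their outputs (claims 151/152)."""
    return linear(x, base_w) + linear(x, comp_w)

enhanced_inference([1.0, 2.0], [1.0, 0.0], [0.5, 0.5])  # -> 2.5
```

Because the base portion's output is reused unchanged, a decoder holding the base version only needs the (typically small) compensating portion to perform inference with the enhanced version.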

154. Apparatus of any previous claim 151 to 153, wherein the apparatus is configured to encode, into the data stream, the individually accessible portions (200) using context-adaptive arithmetic coding (600) and using context initialization at a start of each individually accessible portion.

155. Apparatus of any previous claim 151 to 154, wherein the apparatus is configured to encode, into the data stream, for each individually accessible portion

a start code (242) at which the respective individually accessible portion begins, and/or

a pointer (220, 244) pointing to a beginning of the respective individually accessible portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective individually accessible portion for skipping the respective individually accessible portion in parsing the data stream.

156. Apparatus of any previous claim 151 to 155, wherein the apparatus is configured to encode, into the data stream, for each of one or more predetermined individually accessible portions (200) an identification parameter (310) for identifying the respective predetermined individually accessible portion.

157. Apparatus of any previous claim 85 to 156, wherein the apparatus is configured to encode a representation of a neural network (10) into a data stream (45), so that the data stream (45) is structured into individually accessible portions (200), each portion representing a corresponding neural network portion of the neural network, wherein the apparatus is configured to provide the data stream (45) with, for each of one or more predetermined individually accessible portions (200), supplemental data (350) for supplementing the representation of the neural network.

158. Apparatus for encoding a representation of a neural network (10) into a data stream (45), so that the data stream (45) is structured into individually accessible portions (200), each portion representing a corresponding neural network portion of the neural network, wherein the apparatus is configured to provide the data stream (45) with, for each of one or more predetermined individually accessible portions (200), supplemental data (350) for supplementing the representation of the neural network.

159. Apparatus of claim 157 or claim 158, wherein the data stream (45) indicates the supplemental data (350) as being dispensable for inference based on the neural network.

160. Apparatus of any previous claim 157 to 159, wherein the apparatus is configured to encode the supplemental data (350) for supplementing the representation of the neural network for the one or more predetermined individually accessible portions (200) into further individually accessible portions (200) so that the data stream comprises for each of the one or more predetermined individually accessible portions (200) a corresponding further predetermined individually accessible portion relating to the neural network portion to which the respective predetermined individually accessible portion corresponds.

161. Apparatus of any previous claim 157 to 160, wherein the neural network portions comprise neural network layers (210, 30) of the neural network and/or layer portions into which a predetermined neural network layer (210, 30) of the neural network is subdivided.

162. Apparatus of any previous claim 157 to 161, wherein the apparatus is configured to encode the individually accessible portions (200) using context-adaptive arithmetic encoding and using context initialization at a start of each individually accessible portion.

163. Apparatus of any previous claim 157 to 162, wherein the apparatus is configured to encode, into the data stream, for each individually accessible portion

a start code (242) at which the respective individually accessible portion begins, and/or

a pointer (220, 244) pointing to a beginning of the respective individually accessible portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective individually accessible portion for skipping the respective individually accessible portion in parsing the data stream.

164. Apparatus of any previous claim 157 to 163, wherein the supplemental data (350) relates to

relevance scores of neural network parameters (32), and/or

perturbation robustness of neural network parameters (32).

165. Apparatus of any previous claim 85 to 164, for encoding a representation of a neural network (10) into a data stream (45), wherein the apparatus is configured to provide the data stream (45) with hierarchical control data (400) structured into a sequence (410) of control data portions (420), wherein the control data portions provide information on the neural network in increasing detail along the sequence of control data portions.

166. Apparatus for encoding a representation of a neural network (10) into a data stream (45), wherein the apparatus is configured to provide the data stream (45) with hierarchical control data (400) structured into a sequence (410) of control data portions (420), wherein the control data portions provide information on the neural network in increasing detail along the sequence of control data portions.

167. Apparatus of claim 165 or claim 166, wherein at least some of the control data portions (420) provide information on the neural network which is partially redundant.

168. Apparatus of any previous claim 165 to 167, wherein a first control data portion provides the information on the neural network by way of indicating a default neural network type implying default settings and a second control data portion comprises a parameter to indicate each of the default settings.

169. Apparatus for decoding a representation of a neural network (10) from a data stream (45), wherein the apparatus is configured to decode from the data stream (45) a serialization parameter (102) indicating a coding order (104) at which neural network parameters (32), which define neuron interconnections (22, 24) of the neural network, are encoded into the data stream (45).

170. Apparatus of claim 169, wherein the apparatus is configured to decode, from the data stream (45), the neural network parameters (32) using context-adaptive arithmetic decoding.

171. Apparatus of claim 169 or claim 170, wherein the data stream is structured into one or more individually accessible portions (200), each individually accessible portion representing a corresponding neural network layer (210, 30) of the neural network, and wherein the apparatus is configured to decode serially, from the data stream (45), neural network parameters, which define neuron interconnections (22, 24) of the neural network within a predetermined neural network layer, and

use the coding order (104) to assign neural network parameters serially decoded from the data stream (45) to the neuron interconnections (22, 24).

172. Apparatus of any previous claim 169 to 171, wherein the serialization parameter (102) is an n-ary parameter which indicates the coding order (104) out of a set (108) of n coding orders (104).

173. Apparatus of claim 172, wherein the set (108) of n coding orders (104) comprises

first predetermined coding orders (1061) which differ in an order at which the predetermined coding orders traverse dimensions (34) of a tensor (30) describing a predetermined neural network layer (210, 30) of the neural network; and/or

second predetermined coding orders (1062) which differ in a number (107) of times at which the predetermined coding orders traverse a predetermined neural network layer (210, 30) of the neural network for the sake of scalable coding of the neural network; and/or

third predetermined coding orders (1063) which differ in an order at which the predetermined coding orders traverse neural network layers of the neural network; and/or

fourth predetermined coding orders (1064) which differ in an order at which neurons (14, 18, 20) of a neural network layer of the neural network are traversed.
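Purely for illustration (not part of the claims), the "first predetermined coding orders" of claim 173, which differ in the order in which the dimensions of a layer's weight tensor are traversed, can be sketched for a 2-D tensor as row-major versus column-major serialization. The order names are hypothetical.

```python
def serialize(tensor, order):
    """Serialize a 2-D weight tensor (30) according to a coding order (104).

    Two coding orders that differ in the order the tensor dimensions (34)
    are traversed: row-major and column-major. Names are illustrative.
    """
    rows, cols = len(tensor), len(tensor[0])
    if order == "row-major":
        return [tensor[r][c] for r in range(rows) for c in range(cols)]
    if order == "col-major":
        return [tensor[r][c] for c in range(cols) for r in range(rows)]
    raise ValueError(f"unknown coding order: {order}")

t = [[1, 2, 3],
     [4, 5, 6]]
serialize(t, "row-major")  # [1, 2, 3, 4, 5, 6]
serialize(t, "col-major")  # [1, 4, 2, 5, 3, 6]
```

A serialization parameter signalling which order was used lets the decoder assign the serially decoded parameters back to the correct neuron interconnections.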

174. Apparatus of any previous claim 169 to 173, wherein the serialization parameter (102) is indicative of a permutation using which the coding order (104) permutes neurons (14, 18, 20) of a neural network layer (210, 30) relative to a default order.

175. Apparatus of claim 174, wherein the permutation orders the neurons (14, 18, 20) of the neural network layer (210, 30) in a manner so that the neural network parameters (32) monotonically increase along the coding order (104) or monotonically decrease along the coding order (104).

176. Apparatus of claim 174, wherein the permutation orders the neurons (14, 18, 20) of the neural network layer (210, 30) in a manner so that, among predetermined coding orders signalable by the serialization parameter (102), a bitrate for coding the neural network parameters (32) into the data stream (45) is lowest for the permutation indicated by the serialization parameter (102).
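Purely for illustration (not part of the claims), the neuron permutation of claim 175, which reorders the neurons of a layer so that the neural network parameters increase monotonically along the coding order, can be sketched via a sort; the decoder would invert this permutation using the signalled serialization parameter. All names are hypothetical.

```python
def monotone_permutation(params):
    """Derive the permutation of claim 175: reorder a layer's neurons so
    that their neural network parameters (32) increase monotonically
    along the coding order (104). Returns the permutation and the
    reordered parameters; names are illustrative."""
    perm = sorted(range(len(params)), key=lambda i: params[i])
    reordered = [params[i] for i in perm]
    return perm, reordered

perm, reordered = monotone_permutation([0.7, -0.2, 0.1])
# perm = [1, 2, 0], reordered = [-0.2, 0.1, 0.7]
```

A monotone (or otherwise low-entropy) parameter sequence is cheaper to entropy-code, which is also the rationale behind the bitrate-minimizing permutation of claim 176.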

177. Apparatus of any previous claim 169 to 176, wherein the neural network parameters (32) comprise weights and biases.

178. Apparatus of any previous claim 169 to 177, wherein the apparatus is configured to decode, from the data stream, individually accessible sub-portions (43, 44, 240) into which the individually accessible portions (200) of the data stream are structured, each sub-portion (43, 44, 240) representing a corresponding neural network portion of the neural network, so that each sub-portion (43, 44, 240) is completely traversed by the coding order (104) before a subsequent sub-portion is traversed by the coding order (104).

179. Apparatus of any of claims 171 to 178, wherein the neural network parameters (32) are decoded from the data stream using context-adaptive arithmetic decoding and using context initialization at a start of any individually accessible portion (200) or sub-portion (43, 44, 240).

180. Apparatus of any of claims 171 to 179, wherein the apparatus is configured to decode, from the data stream, start codes (242) at which each individually accessible portion (200) or sub-portion (43, 44, 240) begins, and/or pointers (220, 244) pointing to beginnings of each individually accessible portion or sub-portion, and/or data stream length parameters indicating data stream lengths (246) of each individually accessible portion or sub-portion for skipping the respective individually accessible portion or sub-portion in parsing the data stream.

181. Apparatus of any of the previous claims 169 to 180, wherein the apparatus is configured to decode, from the data stream, a numerical computation representation parameter (120) indicating a numerical representation and bit size at which the neural network parameters (32) are to be represented when using the neural network (10) for inference.

182. Apparatus for decoding a representation of a neural network (10) from a data stream (45), wherein the apparatus is configured to decode from the data stream (45) a numerical computation representation parameter (120) indicating a numerical representation and bit size at which neural network parameters (32) of the neural network, which are encoded into the data stream (45), are to be represented when using the neural network (10) for inference, and to use the numerical representation and bit size for representing the neural network parameters (32) decoded from the data stream (45).

183. Apparatus of any of the previous claims 169 to 182, wherein the data stream (45) is structured into individually accessible sub-portions (43, 44, 240), each individually accessible sub-portion representing a corresponding neural network portion of the neural network, so that each individually accessible sub-portion is completely traversed by the coding order (104) before a subsequent individually accessible sub-portion is traversed by the coding order (104), wherein the apparatus is configured to decode, from the data stream (45), for a predetermined individually accessible sub-portion the neural network parameter and a type parameter indicating a parameter type of the neural network parameter decoded from the predetermined individually accessible sub-portion.

184. Apparatus of claim 183, wherein the type parameter discriminates, at least, between neural network weights and neural network biases.

185. Apparatus of any of the previous claims 169 to 184, wherein the data stream (45) is structured into one or more individually accessible portions (200), each individually accessible portion representing a corresponding neural network layer (210, 30) of the neural network, and

wherein the apparatus is configured to decode, from the data stream (45), for a predetermined neural network layer, a neural network layer type parameter (130) indicating a neural network layer type of the predetermined neural network layer of the neural network.

186. Apparatus for decoding a representation of a neural network (10) from a data stream (45), wherein the data stream (45) is structured into one or more individually accessible portions (200), each portion representing a corresponding neural network layer (210, 30) of the neural network, wherein the apparatus is configured to decode from the data stream (45), for a predetermined neural network layer (210, 30), a neural network layer type parameter (130) indicating a neural network layer type of the predetermined neural network layer of the neural network.

187. Apparatus of any of claims 185 and 186, wherein the neural network layer type parameter (130) discriminates, at least, between a fully-connected and a convolutional layer type.

188. Apparatus of any of the previous claims 169 to 187, wherein the data stream (45) is structured into individually accessible portions (200), each individually accessible portion representing a corresponding neural network portion of the neural network, and wherein the apparatus is configured to decode, from the data stream (45), for each of one or more predetermined individually accessible portions (200), a pointer (220, 244) pointing to a beginning of each individually accessible portion.

189. Apparatus for decoding a representation of a neural network (10) from a data stream (45), wherein the data stream (45) is structured into one or more individually accessible portions (200), each portion representing a corresponding neural network layer (210, 30) of the neural network, wherein the apparatus is configured to decode from the data stream (45), for each of one or more predetermined individually accessible portions, a pointer (220, 244) pointing to a beginning of the respective predetermined individually accessible portion.

190. Apparatus of claim 188 or claim 189, wherein each individually accessible portion represents

a corresponding neural network layer (210) of the neural network or

a neural network portion (43, 44, 240) of a neural network layer (210) of the neural network.

191. Apparatus of any of claims 169 to 190, wherein the apparatus is configured to decode a representation of a neural network (10) from the data stream (45), wherein the data stream (45) is structured into one or more individually accessible portions (200), each individually accessible portion representing a corresponding neural network layer (210, 30) of the neural network, and wherein the data stream (45) is, within a predetermined portion, further structured into individually accessible sub-portions (43, 44, 240), each sub-portion (43, 44, 240) representing a corresponding neural network portion of the respective neural network layer (210, 30) of the neural network, wherein the apparatus is configured to decode from the data stream (45), for each of one or more predetermined individually accessible sub-portions (43, 44, 240)

a start code (242) at which the respective predetermined individually accessible sub-portion begins, and/or

a pointer (244) pointing to a beginning of the respective predetermined individually accessible sub-portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective predetermined individually accessible sub-portion for skipping the respective predetermined individually accessible sub-portion in parsing the data stream (45).

192. Apparatus of claim 191, wherein the apparatus is configured to decode, from the data stream (45), the representation of the neural network using context-adaptive arithmetic decoding and using context initialization at a start of each individually accessible portion and each individually accessible sub-portion.

193. Apparatus for decoding a representation of a neural network (10) from a data stream (45), wherein the data stream (45) is structured into one or more individually accessible portions (200), each individually accessible portion representing a corresponding neural network layer (210, 30) of the neural network, and wherein the data stream (45) is, within a predetermined portion, further structured into individually accessible sub-portions (43, 44, 240), each sub-portion (43, 44, 240) representing a corresponding neural network portion of the respective neural network layer (210, 30) of the neural network, wherein the apparatus is configured to decode from the data stream (45), for each of one or more predetermined individually accessible sub-portions (43, 44, 240)

a start code (242) at which the respective predetermined individually accessible sub-portion begins, and/or

a pointer (244) pointing to a beginning of the respective predetermined individually accessible sub-portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective predetermined individually accessible sub-portion for skipping the respective predetermined individually accessible sub-portion in parsing the data stream (45).
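As a non-normative illustration of the skip-by-length mechanism recited in claims 191 and 193, the sketch below indexes individually accessible (sub-)portions of a stream without decoding their payloads. The one-byte length-prefixed framing is purely hypothetical; only the idea of using a data stream length parameter to jump past a portion is taken from the claims.

```python
def index_portions(stream: bytes) -> list[tuple[int, int]]:
    """Collect (offset, length) pointers to individually accessible
    (sub-)portions of a toy [1-byte length][payload] framed stream.
    The framing is an assumption for illustration only."""
    pointers, pos = [], 0
    while pos < len(stream):
        length = stream[pos]                 # data stream length parameter
        pointers.append((pos + 1, length))   # pointer to portion start
        pos += 1 + length                    # skip the portion without parsing it
    return pointers

# two portions: 3 payload bytes, then 2 payload bytes
print(index_portions(bytes([3, 1, 2, 3, 2, 9, 9])))  # -> [(1, 3), (5, 2)]
```

A parser can hand each recorded pointer to a separate worker, which is exactly what individually accessible portions enable.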

194. Apparatus of claim 193, wherein the apparatus is configured to decode, from the data stream (45), the representation of the neural network using context-adaptive arithmetic decoding and using context initialization at a start of each individually accessible portion and each individually accessible sub-portion.

195. Apparatus of any previous claim 169 to 194, wherein the apparatus is configured to decode a representation of a neural network (10) from a data stream (45), wherein the data stream (45) is structured into individually accessible portions (200), each individually accessible portion representing a corresponding neural network portion of the neural network, wherein the apparatus is configured to decode from the data stream (45), for each of one or more predetermined individually accessible portions (200), a processing option parameter (250) indicating one or more processing options (252) which have to be used or which may optionally be used when using the neural network (10) for inference.

196. Apparatus of claim 195, wherein the processing option parameter (250) indicates the one or more available processing options (252) out of a set of predetermined processing options (252) including

parallel processing capability of the respective predetermined individually accessible portion; and/or

sample wise parallel processing capability (2522) of the respective predetermined individually accessible portion; and/or

channel wise parallel processing capability (2521) of the respective predetermined individually accessible portion; and/or

classification category wise parallel processing capability of the respective predetermined individually accessible portion; and/or

dependency of the neural network portion represented by the respective predetermined individually accessible portion on a computation result gained from another individually accessible portion of the data stream (45) relating to the same neural network portion but belonging to another version of versions (330) of the neural network which are encoded into the data stream (45) in a layered manner.

197. Apparatus for decoding a representation of a neural network (10) from a data stream (45), wherein the data stream (45) is structured into individually accessible portions (200), each individually accessible portion representing a corresponding neural network portion of the neural network, wherein the apparatus is configured to decode from the data stream (45), for each of one or more predetermined individually accessible portions, a processing option parameter (250) indicating one or more processing options (252) which have to be used or which may optionally be used when using the neural network (10) for inference.

198. Apparatus of claim 197, wherein the processing option parameter (250) indicates the one or more available processing options (252) out of a set of predetermined processing options (252) including

parallel processing capability of the respective predetermined individually accessible portion; and/or

sample wise parallel processing capability (2522) of the respective predetermined individually accessible portion; and/or

channel wise parallel processing capability (2521) of the respective predetermined individually accessible portion; and/or

classification category wise parallel processing capability of the respective predetermined individually accessible portion; and/or

dependency of the neural network portion represented by the respective predetermined individually accessible portion on a computation result gained from another individually accessible portion of the data stream (45) relating to the same neural network portion but belonging to another version of versions (330) of the neural network which are encoded into the data stream (45) in a layered manner.

199. Apparatus of one of claims 169 to 198, wherein the apparatus is configured to decode neural network parameters (32), which represent a neural network, from a data stream (45), wherein the neural network parameters (32) are encoded into the data stream (45) in a manner quantized (260) onto quantization indices (32”), and the neural network parameters (32) are encoded into the data stream (45) so that neural network parameters (32) in different neural network portions of the neural network are quantized (260) differently, wherein the apparatus is configured to decode from the data stream (45), for each of the neural network portions, a reconstruction rule (270) for dequantizing (280) neural network parameters (32) relating to the respective neural network portion.

200. Apparatus for decoding neural network parameters (32), which represent a neural network, from a data stream (45), wherein the neural network parameters (32) are encoded into the data stream (45) in a manner quantized (260) onto quantization indices (32”), and the neural network parameters (32) are encoded into the data stream (45) so that neural network parameters (32) in different neural network portions of the neural network are quantized (260) differently, wherein the apparatus is configured to decode from the data stream (45), for each of the neural network portions, a reconstruction rule (270) for dequantizing (280) neural network parameters (32) relating to the respective neural network portion.

201. Apparatus of claim 199 or claim 200, wherein the neural network portions comprise neural network layers (210, 30) of the neural network and/or layer portions into which a predetermined neural network layer of the neural network is subdivided.

202. Apparatus of any previous claim 199 to 201, wherein the apparatus is configured to decode, from the data stream (45), a first reconstruction rule (2701, 270a1) for dequantizing (280) neural network parameters (32) relating to a first neural network portion, in a manner delta-decoded relative to a second reconstruction rule (2702, 270a2) for dequantizing (280) neural network parameters (32) relating to a second neural network portion.

203. Apparatus of claim 202, wherein

the apparatus is configured to decode, from the data stream (45), for indicating the first reconstruction rule (2701, 270a1), a first exponent value and, for indicating the second reconstruction rule (2702, 270a2), a second exponent value,

the first reconstruction rule (2701, 270a1) is defined by a first quantization step size (263) defined by an exponentiation of a predetermined basis and a first exponent defined by the first exponent value, and

the second reconstruction rule (2702, 270a2) is defined by a second quantization step size (263) defined by an exponentiation of the predetermined basis and a second exponent defined by a sum over the first and second exponent values.

204. Apparatus of claim 203, wherein the data stream (45) further indicates the predetermined basis.
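The exponent-based, delta-coded step-size derivation of claims 203 and 204 can be sketched as follows. This is a non-normative illustration; the function name and example values are hypothetical, and only the rule "second exponent = sum of first and second exponent values" is taken from the claims.

```python
def step_sizes(base: float, e1: int, e2: int) -> tuple[float, float]:
    """Derive the two quantization step sizes of claims 203/204.

    The first step size is base**e1; the second exponent is
    delta-decoded as the sum of the first and second exponent values,
    so only the small difference e2 needs to be signalled."""
    step1 = base ** e1            # first reconstruction rule
    step2 = base ** (e1 + e2)     # second rule, delta-decoded exponent
    return step1, step2

# e.g. base = 2.0, e1 = -3, e2 = -1
print(step_sizes(2.0, -3, -1))    # -> (0.125, 0.0625)
```

Signalling the basis once (claim 204) and per-portion integer exponents keeps the per-portion overhead to a single small integer.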

205. Apparatus of any previous claim 199 to 202, wherein

the apparatus is configured to decode, from the data stream (45), for indicating a first reconstruction rule (2701, 270a1) for dequantizing (280) neural network parameters (32) relating to a first neural network portion, a first exponent value and, for indicating a second reconstruction rule (2702, 270a2) for dequantizing (280) neural network parameters (32) relating to a second neural network portion, a second exponent value,

the first reconstruction rule (2701, 270a1) is defined by a first quantization step size (263) defined by an exponentiation of a predetermined basis and a first exponent defined by a sum over the first exponent value and a predetermined exponent value, and

the second reconstruction rule (2702, 270a2) is defined by a second quantization step size (263) defined by an exponentiation of the predetermined basis and a second exponent defined by a sum over the second exponent value and the predetermined exponent value.

206. Apparatus of claim 205, wherein the data stream further indicates the predetermined basis.

207. Apparatus of claim 206, wherein the data stream indicates the predetermined basis at a neural network scope.

208. Apparatus of any previous claim 205 to 207, wherein the data stream further indicates the predetermined exponent value.

209. Apparatus of claim 208, wherein the data stream indicates the predetermined exponent value at a neural network layer (210, 30) scope.

210. Apparatus of claim 208 or claim 209, wherein the data stream further indicates the predetermined basis and the data stream indicates the predetermined exponent value at a scope finer than a scope at which the predetermined basis is indicated by the data stream.

211. Apparatus of any of previous claims 203 to 210, wherein the apparatus is configured to decode, from the data stream, the predetermined basis in a non-integer format and the first and second exponent values in integer format.

212. Apparatus of any of claims 202 to 211, wherein

the apparatus is configured to decode, from the data stream, for indicating the first reconstruction rule (2701, 270a1), a first parameter set (264) defining a first quantization-index-to-reconstruction-level mapping (265), and for indicating the second reconstruction rule (2702, 270a2), a second parameter set (264) defining a second quantization-index-to-reconstruction-level mapping (265),

the first reconstruction rule (2701, 270a1) is defined by the first quantization-index-to-reconstruction-level mapping (265), and

the second reconstruction rule (2702, 270a2) is defined by an extension of the first quantization-index-to-reconstruction-level mapping (265) by the second quantization-index-to-reconstruction-level mapping (265) in a predetermined manner.

213. Apparatus of any of claims 202 to 212, wherein

the apparatus is configured to decode, from the data stream, for indicating the first reconstruction rule (2701, 270a1), a first parameter set (264) defining a first quantization-index-to-reconstruction-level mapping (265), and for indicating the second reconstruction rule (2702, 270a2), a second parameter set (264) defining a second quantization-index-to-reconstruction-level mapping (265),

the first reconstruction rule (2701, 270a1) is defined by an extension of a predetermined quantization-index-to-reconstruction-level mapping (265) by the first quantization-index-to-reconstruction-level mapping (265) in a predetermined manner, and

the second reconstruction rule (2702, 270a2) is defined by an extension of the predetermined quantization-index-to-reconstruction-level mapping (265) by the second quantization-index-to-reconstruction-level mapping (265) in the predetermined manner.

214. Apparatus of claim 213, wherein the data stream further indicates the predetermined quantization-index-to-reconstruction-level mapping (265).

215. Apparatus of claim 214, wherein the data stream indicates the predetermined quantization-index-to-reconstruction-level mapping (265) at a neural network scope or at a neural network layer (210, 30) scope.

216. Apparatus of any of previous claims 212 to 215, wherein, according to the predetermined manner,

a mapping of each index value (32”), according to the quantization-index-to-reconstruction-level mapping to be extended, onto a first reconstruction level is superseded by, if present, a mapping of the respective index value (32”), according to the quantization-index-to-reconstruction-level mapping extending the quantization-index-to-reconstruction-level mapping to be extended, onto a second reconstruction level, and/or

for any index value (32”), for which, according to the quantization-index-to-reconstruction-level mapping to be extended, no reconstruction level is defined onto which the respective index value (32”) should be mapped, and which is, according to the quantization-index-to-reconstruction-level mapping extending the quantization-index-to-reconstruction-level mapping to be extended, mapped onto a corresponding reconstruction level, the mapping from the respective index value (32”) onto the corresponding reconstruction level is adopted, and/or

for any index value (32”), for which, according to the quantization-index-to-reconstruction-level mapping extending the quantization-index-to-reconstruction-level mapping to be extended, no reconstruction level is defined onto which the respective index value (32”) should be mapped, and which is, according to the quantization-index-to-reconstruction-level mapping to be extended, mapped onto a corresponding reconstruction level, the mapping from the respective index value (32”) onto the corresponding reconstruction level is adopted.
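The "predetermined manner" of claim 216 amounts to a map merge in which the extending mapping wins on collisions and entries present in only one mapping are adopted. A minimal sketch, with hypothetical names and example values:

```python
def extend_mapping(base_map: dict, ext_map: dict) -> dict:
    """Extend a quantization-index-to-reconstruction-level mapping per
    the rules of claim 216: entries of the extending mapping supersede
    colliding entries of the mapping to be extended, and entries found
    in only one of the two mappings are adopted unchanged."""
    merged = dict(base_map)   # adopt all entries of the mapping to be extended
    merged.update(ext_map)    # extending entries supersede on collision
    return merged

# base maps index 1 -> 0.5; the extension redefines 1 and adds index 2
print(extend_mapping({0: 0.0, 1: 0.5}, {1: 0.75, 2: 1.5}))
# -> {0: 0.0, 1: 0.75, 2: 1.5}
```

This lets a second portion refine a first portion's codebook by signalling only the changed or added reconstruction levels.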

217. Apparatus of any previous claim 199 to 216, wherein

the apparatus is configured to decode, from the data stream, for indicating the reconstruction rule (270) of a predetermined neural network portion,

a quantization step size parameter (262) indicating a quantization step size (263), and

a parameter set (264) defining a quantization-index-to-reconstruction-level mapping (265),

wherein the reconstruction rule (270) of the predetermined neural network portion is defined by

the quantization step size (263) for quantization indices (32”) within a predetermined index interval (268), and

the quantization-index-to-reconstruction-level mapping (265) for quantization indices (32") outside the predetermined index interval (268).

218. Apparatus for decoding neural network parameters (32), which represent a neural network, from a data stream (45), wherein the neural network parameters (32) are encoded into the data stream (45) in a manner quantized (260) onto quantization indices (32"), wherein the apparatus is configured to derive from the data stream (45) a reconstruction rule (270) for dequantizing (280) the neural network parameters (32) by decoding from the data stream (45)

a quantization step size parameter (262) indicating a quantization step size (263), and

a parameter set (264) defining a quantization-index-to-reconstruction-level mapping (265),

wherein the reconstruction rule (270) of the predetermined neural network portion is defined by

the quantization step size (263) for quantization indices (32”) within a predetermined index interval (268), and

the quantization-index-to-reconstruction-level mapping (265) for quantization indices (32”) outside the predetermined index interval (268).

219. Apparatus of claim 217 or claim 218, wherein the predetermined index interval (268) includes zero.

220. Apparatus of claim 219, wherein the predetermined index interval (268) extends up to a predetermined magnitude threshold value and quantization indices (32”) exceeding the predetermined magnitude threshold value represent escape codes which signal that the quantization-index-to-reconstruction-level mapping (265) is to be used for dequantization (280).

221. Apparatus of any of previous claims 217 to 220, wherein the parameter set (264) defines the quantization-index-to-reconstruction-level mapping (265) by way of a list of reconstruction levels associated with quantization indices (32”) outside the predetermined index interval (268).
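Claims 217 to 221 describe a hybrid reconstruction rule: uniform dequantization via a step size inside a predetermined index interval, and an explicit index-to-level lookup (reached via escape codes) outside it. The sketch below is a non-normative illustration with hypothetical names and values:

```python
def dequantize(index: int, step: float, threshold: int,
               escape_levels: dict) -> float:
    """Hybrid reconstruction rule of claims 217-221 (sketch).

    Indices with magnitude within the predetermined interval
    [-threshold, threshold] are reconstructed uniformly via the
    quantization step size; indices exceeding the threshold act as
    escape codes resolved through the quantization-index-to-
    reconstruction-level mapping."""
    if abs(index) <= threshold:
        return index * step          # uniform dequantization
    return escape_levels[index]      # escape code: explicitly listed level

levels = {3: 1.9, -3: -1.9}          # list of levels outside the interval
print(dequantize(2, 0.5, 2, levels))   # -> 1.0  (inside interval)
print(dequantize(3, 0.5, 2, levels))   # -> 1.9  (escape code)
```

The interval including zero (claim 219) keeps the frequent near-zero weights on the cheap uniform path, while rare outliers get individually signalled levels.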

222. Apparatus of any of previous claims 199 to 221, wherein the neural network portions comprise one or more sub-portions of a neural network layer (210, 30) of the neural network and/or one or more neural network layers of the neural network.

223. Apparatus of any of previous claims 199 to 222, wherein the data stream (45) is structured into individually accessible portions (200), and the apparatus is configured to decode from each individually accessible portion the neural network parameters (32) for a corresponding neural network portion.

224. Apparatus of claim 223, wherein the apparatus is configured to decode, from the data stream (45), the individually accessible portions (200) using context-adaptive arithmetic decoding and using context initialization at a start of each individually accessible portion.

225. Apparatus of claim 223 or claim 224, wherein the apparatus is configured to read, from the data stream (45), for each individually accessible portion

a start code (242) at which the respective individually accessible portion begins, and/or

a pointer (220, 244) pointing to a beginning of the respective individually accessible portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective individually accessible portion for skipping the respective individually accessible portion in parsing the data stream (45).

226. Apparatus of any previous claim 223 to 225, wherein the apparatus is configured to read, from the data stream (45), for each of the neural network portions, an indication of the reconstruction rule (270) for dequantizing (280) neural network parameters (32) relating to the respective neural network portion in

a main header portion (47) of the data stream (45) relating to the neural network as a whole,

a neural network layer (210, 30) related header portion (110) of the data stream (45) relating to the neural network layer the respective neural network portion is part of, or

a neural network portion specific header portion of the data stream (45) relating to the respective neural network portion.

227. Apparatus of any previous claim 169 to 226, wherein the apparatus is configured to decode a representation of a neural network (10) from a data stream (45), wherein the data stream (45) is structured into individually accessible portions (200), each portion representing a corresponding neural network portion of the neural network, wherein the apparatus is configured to decode from the data stream (45), for each of one or more predetermined individually accessible portions, an identification parameter (310) for identifying the respective predetermined individually accessible portion.

228. Apparatus for decoding a representation of a neural network (10) from a data stream (45), wherein the data stream (45) is structured into individually accessible portions (200), each portion representing a corresponding neural network portion of the neural network, wherein the apparatus is configured to decode from the data stream (45), for each of one or more predetermined individually accessible portions, an identification parameter (310) for identifying the respective predetermined individually accessible portion.

229. Apparatus of claim 227 or claim 228, wherein the identification parameter (310) is related to the respective predetermined individually accessible portion via a hash function or error detection code or error correction code.

230. Apparatus of any of previous claims 227 to 229, wherein the apparatus is configured to decode, from the data stream (45), a higher-level identification parameter (310) for identifying a collection of more than one predetermined individually accessible portion.

231. Apparatus of claim 230, wherein the higher-level identification parameter (310) is related to the identification parameters (310) of the more than one predetermined individually accessible portion via a hash function or error detection code or error correction code.
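The two-level identification scheme of claims 229 to 231 can be sketched as hashing each portion and then hashing the concatenation of the per-portion identifiers. This is a non-normative illustration; SHA-256 and all names are assumptions, as the claims only require some hash function or error detection/correction code.

```python
import hashlib

def portion_id(payload: bytes) -> str:
    """Identification parameter of an individually accessible portion,
    related to the portion via a hash function (claim 229)."""
    return hashlib.sha256(payload).hexdigest()

def collection_id(portion_ids: list[str]) -> str:
    """Higher-level identification parameter over the identification
    parameters of several portions (claims 230/231): a hash of the
    concatenated per-portion identifiers."""
    return hashlib.sha256("".join(portion_ids).encode()).hexdigest()

ids = [portion_id(b"layer-0"), portion_id(b"layer-1")]
print(collection_id(ids))
```

A decoder can thereby verify a single layer in isolation, or an entire collection of layers, against tampering or transmission errors.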

232. Apparatus of any of previous claims 227 to 231, wherein the apparatus is configured to decode, from the data stream (45), the individually accessible portions (200) using context-adaptive arithmetic decoding and using context initialization at a start of each individually accessible portion.

233. Apparatus of any of previous claims 227 to 232, wherein the apparatus is configured to read, from the data stream, for each individually accessible portion

a start code (242) at which the respective individually accessible portion begins, and/or

a pointer (220, 244) pointing to a beginning of the respective individually accessible portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective individually accessible portion for skipping the respective individually accessible portion in parsing the data stream.

234. Apparatus of any of previous claims 227 to 233, wherein the neural network portions comprise one or more sub-portions of a neural network layer (210, 30) of the neural network and/or one or more neural network layers of the neural network.

235. Apparatus of any previous claim 169 to 234, wherein the apparatus is configured to decode a representation of a neural network (10) from a data stream (45), into which same is encoded in a layered manner so that different versions (330) of the neural network are encoded into the data stream (45), and so that the data stream (45) is structured into one or more individually accessible portions (200), each portion relating to a corresponding version of the neural network, wherein the apparatus is configured to decode a first version (3302) of the neural network from a first portion

by using delta-decoding relative to a second version (3301) of the neural network encoded into a second portion, and/or

by decoding from the data stream (45) one or more compensating neural network portions (332) each of which is to be, for performing an inference based on the first version (3302) of the neural network,

executed in addition to an execution of a corresponding neural network portion (334) of a second version (3301) of the neural network encoded into a second portion, and

wherein outputs of the respective compensating neural network portion (332) and corresponding neural network portion (334) are to be summed up.

236. Apparatus for decoding a representation of a neural network (10) from a data stream (45), into which same is encoded in a layered manner so that different versions (330) of the neural network are encoded into the data stream (45), and so that the data stream (45) is structured into one or more individually accessible portions (200), each portion relating to a corresponding version of the neural network, wherein the apparatus is configured to decode a first version (3302) of the neural network from a first portion

by using delta-decoding relative to a second version (3301) of the neural network encoded into a second portion, and/or

by decoding from the data stream (45) one or more compensating neural network portions (332) each of which is to be, for performing an inference based on the first version (3302) of the neural network,

executed in addition to an execution of a corresponding neural network portion (334) of a second version (3301) of the neural network encoded into a second portion, and

wherein outputs of the respective compensating neural network portion (332) and corresponding neural network portion (334) are to be summed up.

237. Apparatus of claim 235 or claim 236,

wherein the apparatus is configured to decode, from a second portion of the data stream (45), the second version (3301) of the neural network; and

wherein the apparatus is configured to decode, from a first portion of the data stream (45), the first version (3302) of the neural network delta-decoding relative to the second version (3301) of the neural network encoded into the second portion in terms of

weight and/or bias differences, and/or

additional neurons (14, 18, 20) or neuron interconnections (22, 24).
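The compensating-portion mechanism of claims 235 to 237 runs an extra decoded portion alongside the corresponding base-version portion and sums their outputs. A minimal sketch under the assumption that each portion is a callable mapping inputs to output activations (all names illustrative):

```python
def run_layered(x, base_portion, compensating_portion):
    """Layered inference per claims 235-237 (sketch): the compensating
    neural network portion decoded for the first (enhanced) version is
    executed in addition to the corresponding portion of the second
    (base) version, and the two outputs are summed element-wise."""
    base_out = base_portion(x)
    comp_out = compensating_portion(x)
    return [b + c for b, c in zip(base_out, comp_out)]

base = lambda x: [2.0 * v for v in x]   # base-version layer (illustrative)
comp = lambda x: [0.5 * v for v in x]   # compensating refinement layer
print(run_layered([1.0, 2.0], base, comp))  # -> [2.5, 5.0]
```

Because the base portion's output is reused, an already deployed base model only needs the small compensating portion to be transmitted and executed for the enhanced version.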

238. Apparatus of any previous claim 235 to 237, wherein the apparatus is configured to decode, from the data stream (45), the individually accessible portions (200) using context-adaptive arithmetic decoding (600) and using context initialization at a start of each individually accessible portion.

239. Apparatus of any previous claim 235 to 238, wherein the apparatus is configured to decode, from the data stream (45), for each individually accessible portion

a start code (242) at which the respective individually accessible portion begins, and/or

a pointer (220, 244) pointing to a beginning of the respective individually accessible portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective individually accessible portion for skipping the respective individually accessible portion in parsing the data stream.

240. Apparatus of any previous claim 235 to 239, wherein the apparatus is configured to decode, from the data stream, for each of one or more predetermined individually accessible portions (200) an identification parameter (310) for identifying the respective predetermined individually accessible portion.

241. Apparatus of any previous claim 169 to 240, wherein the apparatus is configured to decode a representation of a neural network (10) from a data stream (45), wherein the data stream (45) is structured into individually accessible portions (200), each portion representing a corresponding neural network portion of the neural network, wherein the apparatus is configured to decode from the data stream (45), for each of one or more predetermined individually accessible portions, supplemental data (350) for supplementing the representation of the neural network.

242. Apparatus for decoding a representation of a neural network (10) from a data stream (45), wherein the data stream (45) is structured into individually accessible portions (200), each portion representing a corresponding neural network portion of the neural network, wherein the apparatus is configured to decode from the data stream (45), for each of one or more predetermined individually accessible portions (200), supplemental data (350) for supplementing the representation of the neural network.

243. Apparatus of claim 241 or claim 242, wherein the data stream (45) indicates the supplemental data (350) as being dispensable for inference based on the neural network.

244. Apparatus of any previous claim 241 to 243, wherein the apparatus is configured to decode the supplemental data (350) for supplementing the representation of the neural network for the one or more predetermined individually accessible portions (200) from further individually accessible portions, wherein the data stream (45) comprises for each of the one or more predetermined individually accessible portions a corresponding further predetermined individually accessible portion relating to the neural network portion to which the respective predetermined individually accessible portion corresponds.

245. Apparatus of any previous claim 241 to 244, wherein the neural network portions comprise neural network layers (210, 30) of the neural network and/or layer portions into which a predetermined neural network layer of the neural network is subdivided.

246. Apparatus of any previous claim 241 to 245, wherein the apparatus is configured to decode the individually accessible portions (200) using context-adaptive arithmetic decoding and using context initialization at a start of each individually accessible portion.

247. Apparatus of any previous claim 241 to 246, wherein the apparatus is configured to read, from the data stream, for each individually accessible portion

a start code (242) at which the respective individually accessible portion begins, and/or

a pointer (220, 244) pointing to a beginning of the respective individually accessible portion, and/or

a data stream length parameter indicating a data stream length (246) of the respective individually accessible portion for skipping the respective individually accessible portion in parsing the data stream.
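As an illustrative, non-normative sketch of the random-access mechanism of claims 246 and 247 (pointers and length parameters that let a parser skip individually accessible portions without decoding them): the byte layout below — a 4-byte big-endian length field followed by the portion payload — is purely hypothetical and not taken from the claimed data stream format.

```python
import struct

def index_portions(stream: bytes) -> dict:
    """Scan a toy serialized stream and record, for each portion, a pointer
    (byte offset) to its payload and its data stream length, so that any
    portion can later be skipped or accessed individually.

    Assumed (hypothetical) layout per portion: 4-byte big-endian length
    field, then the payload. Real NNR bitstreams differ."""
    offsets = {}
    pos = 0
    idx = 0
    while pos + 4 <= len(stream):
        (length,) = struct.unpack_from(">I", stream, pos)
        offsets[idx] = (pos + 4, length)  # pointer to payload start, length
        pos += 4 + length                 # skip the payload without parsing it
        idx += 1
    return offsets

def read_portion(stream: bytes, offsets: dict, idx: int) -> bytes:
    """Jump straight to one portion via its recorded pointer and length."""
    start, length = offsets[idx]
    return stream[start:start + length]
```

The length field is what makes skipping possible: a parser interested only in, say, the third layer never touches the payload bytes of the first two portions.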

248. Apparatus of any previous claim 241 to 247, wherein the supplemental data (350) relates to

relevance scores of neural network parameters (32), and/or

perturbation robustness of neural network parameters (32).

249. Apparatus of any previous claim 169 to 248, for decoding a representation of a neural network (10) from a data stream (45), wherein the apparatus is configured to decode from the data stream (45) hierarchical control data (400) structured into a sequence (410) of control data portions (420), wherein the control data portions provide information on the neural network at increasing details along the sequence of control data portions.

250. Apparatus for decoding a representation of a neural network (10) from a data stream (45), wherein the apparatus is configured to decode from the data stream (45) hierarchical control data (400) structured into a sequence (410) of control data portions (420), wherein the control data portions provide information on the neural network at increasing details along the sequence of control data portions.

251. Apparatus of claim 249 or claim 250, wherein at least some of the control data portions (420) provide information on the neural network which is partially redundant.

252. Apparatus of any previous claim 249 to 251, wherein a first control data portion provides the information on the neural network by way of indicating a default neural network type implying default settings and a second control data portion comprises a parameter to indicate each of the default settings.
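A minimal, hypothetical sketch of the hierarchical control data of claims 249 to 252: a sequence of control data portions that describe the network at increasing detail, where later portions spell out (and may partially repeat) what an earlier portion implied via a default network type. The field names below are invented for illustration only.

```python
# Hypothetical sequence (410) of control data portions (420), coarse to fine.
# Portion 1 names a default network type implying default settings; portion 2
# states those settings explicitly (partially redundant); portion 3 adds detail.
control_data = [
    {"network_type": "default_cnn"},
    {"num_layers": 8, "activation": "relu"},
    {"layer_shapes": [(64, 3, 3)] * 8},
]

def resolve_settings(portions: list) -> dict:
    """Merge control data portions in sequence order; each later, more
    detailed portion refines or confirms the earlier ones."""
    settings = {}
    for portion in portions:
        settings.update(portion)
    return settings
```

A decoder that only needs coarse information can stop after the first portion; one that needs full detail merges the whole sequence.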

253. Apparatus for performing an inference using a neural network, comprising

an apparatus for decoding a data stream (45) according to any of claims 169 to 252, so as to derive from the data stream (45) the neural network, and

a processor configured to perform the inference based on the neural network.

254. Method for encoding a representation of a neural network into a data stream (45), comprising providing the data stream with a serialization parameter indicating a coding order at which neural network parameters, which define neuron interconnections of the neural network, are encoded into the data stream.

255. Method for encoding a representation of a neural network into a data stream, providing the data stream with a numerical computation representation parameter indicating a numerical representation and bit size at which neural network parameters of the neural network, which are encoded into the data stream, are to be represented when using the neural network for inference.

256. Method for encoding a representation of a neural network into a data stream, so that the data stream is structured into one or more individually accessible portions, each individually accessible portion representing a corresponding neural network layer of the neural network, wherein the method comprises providing the data stream with, for a predetermined neural network layer, a neural network layer type parameter indicating a neural network layer type of the predetermined neural network layer of the neural network.

257. Method for encoding a representation of a neural network into a data stream, so that the data stream is structured into one or more individually accessible portions, each portion representing a corresponding neural network layer of the neural network, wherein the method comprises providing the data stream with, for each of one or more predetermined individually accessible portions, a pointer pointing to a beginning of the respective predetermined individually accessible portion.

258. Method for encoding a representation of a neural network into a data stream, so that the data stream is structured into one or more individually accessible portions, each individually accessible portion representing a corresponding neural network layer of the neural network, and so that the data stream is, within a predetermined portion, further structured into individually accessible sub-portions, each sub-portion representing a corresponding neural network portion of the respective neural network layer of the neural network, wherein the method comprises providing the data stream with, for each of one or more predetermined individually accessible sub-portions

a start code at which the respective predetermined individually accessible sub-portion begins, and/or

a pointer pointing to a beginning of the respective predetermined individually accessible sub-portion, and/or

a data stream length parameter indicating a data stream length of the respective predetermined individually accessible sub-portion for skipping the respective predetermined individually accessible sub-portion in parsing the data stream.

259. Method for encoding a representation of a neural network into a data stream, so that the data stream is structured into individually accessible portions, each individually accessible portion representing a corresponding neural network portion of the neural network, wherein the method comprises providing the data stream with, for each of one or more predetermined individually accessible portions, a processing option parameter indicating one or more processing options which have to be used or which may optionally be used when using the neural network for inference.

260. Method for encoding neural network parameters, which represent a neural network, into a data stream, so that the neural network parameters are encoded into the data stream in a manner quantized onto quantization indices, and the neural network parameters are encoded into the data stream so that neural network parameters in different neural network portions of the neural network are quantized differently, wherein the method comprises providing the data stream indicating, for each of the neural network portions, a reconstruction rule for dequantizing neural network parameters relating to the respective neural network portion.

261. Method for encoding neural network parameters, which represent a neural network, into a data stream, so that the neural network parameters are encoded into the data stream in a manner quantized onto quantization indices, wherein the method comprises providing the data stream with, for indicating a reconstruction rule for dequantizing the neural network parameters,

a quantization step size parameter indicating a quantization step size, and

a parameter set defining a quantization-index-to-reconstruction-level mapping,

wherein the reconstruction rule of the predetermined neural network portion is defined by

the quantization step size for quantization indices within a predetermined index interval, and

the quantization-index-to-reconstruction-level mapping for quantization indices outside the predetermined index interval.
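The two-part reconstruction rule of claims 261 and 273 can be sketched as follows — a non-normative illustration in which quantization indices inside a predetermined interval are reconstructed uniformly via the signalled step size, while indices outside it use an explicitly signalled index-to-level mapping. The interval bounds and mapping values here are assumptions, not taken from the specification.

```python
def dequantize(index: int,
               step_size: float,
               mapping: dict,
               interval: tuple = (-8, 8)) -> float:
    """Reconstruct a neural network parameter from its quantization index.

    Inside the predetermined index interval the reconstruction level is
    uniform: index * step_size. Outside it, the signalled
    quantization-index-to-reconstruction-level mapping is used instead,
    allowing a non-uniform codebook for large-magnitude outliers."""
    lo, hi = interval
    if lo <= index <= hi:
        return index * step_size
    return mapping[index]
```

Splitting the rule this way keeps signalling cheap for the many small indices (one step size) while still representing rare large parameters accurately (explicit levels).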

262. Method for encoding a representation of a neural network into a data stream, so that the data stream is structured into individually accessible portions, each portion representing a corresponding neural network portion of the neural network, wherein the method comprises providing the data stream with, for each of one or more predetermined individually accessible portions, an identification parameter for identifying the respective predetermined individually accessible portion.

263. Method for encoding a representation of a neural network into a data stream in a layered manner so that different versions of the neural network are encoded into the data stream, and so that the data stream is structured into one or more individually accessible portions, each portion relating to a corresponding version of the neural network, wherein the method comprises encoding a first version of the neural network into a first portion

delta-coded relative to a second version of the neural network encoded into a second portion, and/or

in form of one or more compensating neural network portions each of which is to be, for performing an inference based on the first version of the neural network,

executed in addition to an execution of a corresponding neural network portion of a second version of the neural network encoded into a second portion, and

wherein outputs of the respective compensating neural network portion and corresponding neural network portion are to be summed up.

264. Method for encoding a representation of a neural network into a data stream, so that the data stream is structured into individually accessible portions, each portion representing a corresponding neural network portion of the neural network, wherein the method comprises providing the data stream with, for each of one or more predetermined individually accessible portions, supplemental data for supplementing the representation of the neural network.

265. Method for encoding a representation of a neural network into a data stream, wherein the method comprises providing the data stream with hierarchical control data structured into a sequence of control data portions, wherein the control data portions provide information on the neural network at increasing details along the sequence of control data portions.

266. Method for decoding a representation of a neural network from a data stream, comprising decoding from the data stream a serialization parameter indicating a coding order at which neural network parameters, which define neuron interconnections of the neural network, are encoded into the data stream.

267. Method for decoding a representation of a neural network from a data stream, wherein the method comprises decoding from the data stream a numerical computation representation parameter indicating a numerical representation and bit size at which neural network parameters of the neural network, which are encoded into the data stream, are to be represented when using the neural network for inference, and to use the numerical representation and bit size for representing the neural network parameters decoded from the data stream.

268. Method for decoding a representation of a neural network from a data stream, wherein the data stream is structured into one or more individually accessible portions, each portion representing a corresponding neural network layer of the neural network, wherein the method comprises decoding from the data stream, for a predetermined neural network layer, a neural network layer type parameter indicating a neural network layer type of the predetermined neural network layer of the neural network.

269. Method for decoding a representation of a neural network from a data stream, wherein the data stream is structured into one or more individually accessible portions, each portion representing a corresponding neural network layer of the neural network, wherein the method comprises decoding from the data stream, for each of one or more predetermined individually accessible portions, a pointer pointing to a beginning of the respective predetermined individually accessible portion.

270. Method for decoding a representation of a neural network from a data stream, wherein the data stream is structured into one or more individually accessible portions, each individually accessible portion representing a corresponding neural network layer of the neural network, and wherein the data stream is, within a predetermined portion, further structured into individually accessible sub-portions, each sub-portion representing a corresponding neural network portion of the respective neural network layer of the neural network, wherein the method comprises decoding from the data stream, for each of one or more predetermined individually accessible sub-portions

a start code at which the respective predetermined individually accessible sub-portion begins, and/or

a pointer pointing to a beginning of the respective predetermined individually accessible sub-portion, and/or

a data stream length parameter indicating a data stream length of the respective predetermined individually accessible sub-portion for skipping the respective predetermined individually accessible sub-portion in parsing the data stream.

271. Method for decoding a representation of a neural network from a data stream, wherein the data stream is structured into individually accessible portions, each individually accessible portion representing a corresponding neural network portion of the neural network, wherein the method comprises decoding from the data stream, for each of one or more predetermined individually accessible portions, a processing option parameter indicating one or more processing options which have to be used or which may optionally be used when using the neural network for inference.

272. Method for decoding neural network parameters, which represent a neural network, from a data stream, wherein the neural network parameters are encoded into the data stream in a manner quantized onto quantization indices, and the neural network parameters are encoded into the data stream so that neural network parameters in different neural network portions of the neural network are quantized differently, wherein the method comprises decoding from the data stream, for each of the neural network portions, a reconstruction rule for dequantizing neural network parameters relating to the respective neural network portion.

273. Method for decoding neural network parameters, which represent a neural network, from a data stream, wherein the neural network parameters are encoded into the data stream in a manner quantized onto quantization indices, wherein the method comprises deriving from the data stream a reconstruction rule for dequantizing the neural network parameters by decoding from the data stream

a quantization step size parameter indicating a quantization step size, and

a parameter set defining a quantization-index-to-reconstruction-level mapping,

wherein the reconstruction rule of the predetermined neural network portion is defined by

the quantization step size for quantization indices within a predetermined index interval, and

the quantization-index-to-reconstruction-level mapping for quantization indices outside the predetermined index interval.

274. Method for decoding a representation of a neural network from a data stream, wherein the data stream is structured into individually accessible portions, each portion representing a corresponding neural network portion of the neural network, wherein the method comprises decoding from the data stream, for each of one or more predetermined individually accessible portions, an identification parameter for identifying the respective predetermined individually accessible portion.

275. Method for decoding a representation of a neural network from a data stream, into which same is encoded in a layered manner so that different versions of the neural network are encoded into the data stream, and so that the data stream is structured into one or more individually accessible portions, each portion relating to a corresponding version of the neural network, wherein the method comprises decoding a first version of the neural network from a first portion

by using delta-decoding relative to a second version of the neural network encoded into a second portion, and/or

by decoding from the data stream one or more compensating neural network portions each of which is to be, for performing an inference based on the first version of the neural network,

executed in addition to an execution of a corresponding neural network portion of a second version of the neural network encoded into a second portion, and

wherein outputs of the respective compensating neural network portion and corresponding neural network portion are to be summed up.

276. Method for decoding a representation of a neural network from a data stream, wherein the data stream is structured into individually accessible portions, each portion representing a corresponding neural network portion of the neural network, wherein the method comprises decoding from the data stream, for each of one or more predetermined individually accessible portions, supplemental data for supplementing the representation of the neural network.

277. Method for decoding a representation of a neural network from a data stream, wherein the method comprises decoding from the data stream hierarchical control data structured into a sequence of control data portions, wherein the control data portions provide information on the neural network at increasing details along the sequence of control data portions.

278. Computer program for, when executed by a computer, causing the computer to perform any method of claims 254 to 277.

Documents

Application Documents

# Name Date
1 202217019596.pdf 2022-03-31
2 202217019596-STATEMENT OF UNDERTAKING (FORM 3) [31-03-2022(online)].pdf 2022-03-31
3 202217019596-REQUEST FOR EXAMINATION (FORM-18) [31-03-2022(online)].pdf 2022-03-31
4 202217019596-NOTIFICATION OF INT. APPLN. NO. & FILING DATE (PCT-RO-105-PCT Pamphlet) [31-03-2022(online)].pdf 2022-03-31
5 202217019596-FORM 18 [31-03-2022(online)].pdf 2022-03-31
6 202217019596-FORM 1 [31-03-2022(online)].pdf 2022-03-31
7 202217019596-DRAWINGS [31-03-2022(online)].pdf 2022-03-31
8 202217019596-DECLARATION OF INVENTORSHIP (FORM 5) [31-03-2022(online)].pdf 2022-03-31
9 202217019596-COMPLETE SPECIFICATION [31-03-2022(online)].pdf 2022-03-31
10 202217019596-FORM-26 [09-06-2022(online)].pdf 2022-06-09
11 202217019596-Proof of Right [10-06-2022(online)].pdf 2022-06-10
12 202217019596-FORM 3 [23-08-2022(online)].pdf 2022-08-23
13 202217019596-FER.pdf 2022-08-24
14 202217019596-Information under section 8(2) [21-10-2022(online)].pdf 2022-10-21
15 202217019596-FORM 3 [21-10-2022(online)].pdf 2022-10-21
16 202217019596-OTHERS [24-02-2023(online)].pdf 2023-02-24
17 202217019596-FER_SER_REPLY [24-02-2023(online)].pdf 2023-02-24
18 202217019596-DRAWING [24-02-2023(online)].pdf 2023-02-24
19 202217019596-CLAIMS [24-02-2023(online)].pdf 2023-02-24
20 202217019596-Information under section 8(2) [18-08-2023(online)].pdf 2023-08-18
21 202217019596-FORM 3 [11-10-2023(online)].pdf 2023-10-11
22 202217019596-US(14)-HearingNotice-(HearingDate-23-01-2025).pdf 2024-12-13
23 202217019596-Correspondence to notify the Controller [16-12-2024(online)].pdf 2024-12-16
24 202217019596-FORM 3 [20-12-2024(online)].pdf 2024-12-20
25 202217019596-FORM-26 [21-01-2025(online)].pdf 2025-01-21
26 202217019596-Form-4 u-r 138 [04-02-2025(online)].pdf 2025-02-04
27 202217019596-Form-4 u-r 138 [06-03-2025(online)].pdf 2025-03-06
28 202217019596-Form-4 u-r 138 [01-04-2025(online)].pdf 2025-04-01
29 202217019596-Form-4 u-r 138 [05-06-2025(online)].pdf 2025-06-05
30 202217019596-Form-4 u-r 138 [07-07-2025(online)].pdf 2025-07-07
31 202217019596-Written submissions and relevant documents [07-08-2025(online)].pdf 2025-08-07

Search Strategy

1 search_strategy_24E_24-08-2022.pdf