Abstract: Data stream (45) having a representation of a neural network (10) encoded thereinto, the data stream (45) comprising a serialization parameter (102) indicating a coding order (104) at which neural network parameters (32), which define neuron interconnections (22, 24) of the neural network (10), are encoded into the data stream (45). To be published with Figure 1.
Description: AS ATTACHED

Claims:

I/We Claim:
1. Data stream (45) having a representation of a neural network (10) encoded
thereinto, the data stream (45) comprising a serialization parameter (102)
indicating a coding order (104) at which neural network parameters (32),
which define neuron interconnections (22, 24) of the neural network (10),
are encoded into the data stream (45), wherein the neural network
parameters (32) are coded into the data stream (45) using context-adaptive
arithmetic coding (600).
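Outside the claim language, the mechanism of claim 1 can be sketched in a few lines: the serialization parameter signals a coding order, the encoder emits the parameters in that order, and the decoder re-assigns them to their interconnections. This is a minimal illustration with hypothetical names; the context-adaptive arithmetic coding stage (600) is deliberately omitted.

```python
def encode_parameters(weights, coding_order):
    """Serialize the weights into the stream in the signaled coding order."""
    return [weights[i] for i in coding_order]

def decode_parameters(stream_values, coding_order):
    """Re-assign serially decoded values to their original positions."""
    weights = [None] * len(coding_order)
    for value, i in zip(stream_values, coding_order):
        weights[i] = value
    return weights

weights = [0.5, -1.2, 3.0, 0.1]
coding_order = [2, 0, 3, 1]          # what the serialization parameter signals
serialized = encode_parameters(weights, coding_order)
restored = decode_parameters(serialized, coding_order)
assert restored == weights
```

Because both sides share the signaled coding order, the decoder recovers each parameter's position without any per-parameter index in the stream.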
2. Apparatus for encoding a representation of a neural network (10) into a data
stream (45), wherein the apparatus is configured to provide the data stream
(45) with a serialization parameter (102) indicating a coding order (104) at
which neural network parameters (32), which define neuron
interconnections (22, 24) of the neural network, are encoded into the data
stream (45), wherein the apparatus is configured to encode, into the data
stream (45), the neural network parameters (32) using context-adaptive
arithmetic encoding.
3. Apparatus for decoding a representation of a neural network (10) from a
data stream (45), wherein the apparatus is configured to decode from the
data stream (45) a serialization parameter (102) indicating a coding order
(104) at which neural network parameters (32), which define neuron
interconnections (22, 24) of the neural network, are encoded into the data
stream (45), wherein the apparatus is configured to decode, from the data
stream (45), the neural network parameters (32) using context-adaptive
arithmetic decoding.
4. Apparatus of claim 3, wherein the data stream is structured into one or more
individually accessible portions (200), each individually accessible portion
representing a corresponding neural network layer (210, 30) of the neural
network, and
wherein the apparatus is configured to decode serially, from the data stream
(45), neural network parameters, which define neuron interconnections (22,
24) of the neural network within a predetermined neural network layer, and
use the coding order (104) to assign neural network parameters serially
decoded from the data stream (45) to the neuron interconnections (22, 24).
5. Apparatus of claim 3 or claim 4, wherein the serialization parameter (102)
is indicative of a permutation using which the coding order (104) permutes
neurons (14, 18, 20) of a neural network layer (210, 30) relative to a default
order.
6. Apparatus of claim 5, wherein the permutation orders the neurons (14, 18,
20) of the neural network layer (210, 30) in a manner so that the neural
network parameters (32) monotonically increase along the coding order
(104) or monotonically decrease along the coding order (104).
7. Apparatus of claim 5, wherein the permutation orders the neurons (14, 18,
20) of the neural network layer (210, 30) in a manner so that, among
predetermined coding orders signalable by the serialization parameter (102),
a bitrate for coding the neural network parameters (32) into the data stream
(45) is lowest for the permutation indicated by the serialization parameter
(102).
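Claims 6 and 7 can be illustrated together: a permutation that orders a layer's parameters monotonically tends to shrink the deltas between successive values, and the encoder may simply pick, among the coding orders signalable by the serialization parameter, the one with the lowest estimated rate. The sketch below uses a hypothetical delta-based rate proxy (`estimated_bits`); a real encoder would measure the rate of the context-adaptive arithmetic coder instead.

```python
import itertools
import math

def estimated_bits(values):
    """Order-dependent rate proxy: delta-code each value against its
    predecessor; smaller jumps cost fewer bits."""
    bits, prev = 0, 0
    for v in values:
        bits += 1 + math.ceil(math.log2(abs(v - prev) + 1))
        prev = v
    return bits

def best_permutation(weights, candidate_orders):
    """Among the signalable coding orders, pick the one with the
    lowest estimated bitrate (cf. claim 7)."""
    return min(candidate_orders,
               key=lambda order: estimated_bits([weights[i] for i in order]))

weights = [3, 1, 4, 2]
candidates = list(itertools.permutations(range(len(weights))))
best = best_permutation(weights, candidates)
# here the winner orders the values monotonically (cf. claim 6)
```

The winning permutation is then signaled to the decoder as the serialization parameter, so no exhaustive search is needed on the decoding side.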
8. Apparatus of any of claims 3 to 7, wherein the neural network
parameters (32) comprise weights and biases.
9. Apparatus of any of claims 3 to 8, wherein the apparatus is configured
to
decode, from the data stream, individually accessible sub-portions (43, 44,
240), into which individually accessible portions (200) the data stream is
structured, each sub-portion (43, 44, 240) representing a corresponding
neural network portion of the neural network, so that each sub-portion (43,
44, 240) is completely traversed by the coding order (104) before a
subsequent sub-portion is traversed by the coding order (104).
10. Apparatus of any of claims 4 to 9, wherein the neural network parameters
(32) are decoded from the data stream using context-adaptive arithmetic
decoding and using context initialization at a start of any individually
accessible portion (200) or sub-portion (43, 44, 240).
11. Apparatus of any of claims 4 to 10, wherein the apparatus is configured to
decode, from the data stream, start codes (242) at which each individually
accessible portion (200) or sub-portion (43, 44, 240) begins, and/or pointers
(220, 244) pointing to beginnings of each individually accessible portion or
sub-portion, and/or data stream lengths (246) of each individually
accessible portion or sub-portion for skipping the respective individually
accessible portion or sub-portion in parsing the data stream.
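The start codes (242), pointers (244), and data stream lengths (246) of claim 11 all serve random access: a parser can jump over an individually accessible portion without decoding its payload. A minimal sketch, assuming a hypothetical layout in which each sub-portion is prefixed by a 4-byte big-endian length field:

```python
import struct

def pack_portions(portions):
    """Prefix each sub-portion with a 4-byte length field (cf. 246)."""
    return b"".join(struct.pack(">I", len(p)) + p for p in portions)

def read_portion(stream: bytes, target_index: int) -> bytes:
    """Skip earlier sub-portions using their length fields alone,
    without parsing their payloads."""
    offset = 0
    for _ in range(target_index):
        (length,) = struct.unpack_from(">I", stream, offset)
        offset += 4 + length          # jump straight past the payload
    (length,) = struct.unpack_from(">I", stream, offset)
    return stream[offset + 4 : offset + 4 + length]
```

Pointers (244) would make the jump direct rather than cumulative; start codes (242) would instead let a parser resynchronize by scanning for a unique byte pattern.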
12. Apparatus of any of claims 3 to 11, wherein the apparatus is
configured to decode, from the data stream, a numerical computation
representation parameter (120) indicating a numerical representation and bit
size at which the neural network parameters (32) are to be represented when
using the neural network (10) for inference.
13. Apparatus of any of claims 3 to 12, wherein the data stream
(45) is structured into individually accessible sub-portions (43, 44, 240),
each individually accessible sub-portion representing a corresponding
neural network portion of the neural network, so that each individually
accessible sub-portion is completely traversed by the coding order (104)
before a subsequent individually accessible sub-portion is traversed by the
coding order (104), wherein the apparatus is configured to decode, from the
data stream (45), for a predetermined individually accessible sub-portion the
neural network parameter and a type parameter indicating a parameter type
of the neural network parameter decoded from the predetermined
individually accessible sub-portion.
14. Apparatus of claim 13, wherein the type parameter discriminates, at least,
between neural network weights and neural network biases.
15. Apparatus of any of claims 3 to 14, wherein the data stream
(45) is structured into one or more individually accessible portions (200),
each individually accessible portion representing a
corresponding neural network layer (210, 30) of the neural network, and
wherein the apparatus is configured to decode, from the data stream (45),
for a predetermined neural network layer, a neural network layer type
parameter (130) indicating a neural network layer type of the predetermined
neural network layer of the neural network.
16. Apparatus of claim 15, wherein the neural network layer type parameter
(130) discriminates, at least, between a fully-connected and a convolutional
layer type.
17. Apparatus of any of claims 3 to 16, wherein the apparatus is configured to
decode a representation of a neural network (10) from the data stream (45),
wherein the data stream (45) is structured into one or more individually
accessible portions (200), each individually accessible portion representing
a corresponding neural network layer (210, 30) of the neural network, and
wherein the data stream (45) is, within a predetermined portion, further
structured into individually accessible sub-portions (43, 44, 240), each
sub-portion (43, 44, 240) representing a corresponding neural network portion
of the respective neural network layer (210, 30) of the neural network,
wherein the apparatus is configured to decode from the data stream (45), for
each of one or more predetermined individually accessible sub-portions (43,
44, 240)
a start code (242) at which the respective predetermined individually
accessible sub-portion begins, and/or
a pointer (244) pointing to a beginning of the respective predetermined
individually accessible sub-portion, and/or
a data stream length parameter indicating a data stream length (246) of the
respective predetermined individually accessible sub-portion for skipping
the respective predetermined individually accessible sub-portion in parsing
the data stream (45).
18. Apparatus of claim 17, wherein the apparatus is configured to decode, from
the data stream (45), the representation of the neural network using
context-adaptive arithmetic decoding and using context initialization at a
start of each individually accessible portion and each individually
accessible sub-portion.
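The context initialization of claims 10 and 18 is what keeps the portions individually accessible under context-adaptive coding: the probability model is reset to a known state at each portion start, so no portion depends on the adaptation history of an earlier one. A toy stand-in with hypothetical names (the actual scheme is context-adaptive arithmetic coding (600)):

```python
class AdaptiveBitModel:
    """Toy stand-in for a context-adaptive probability model."""
    def __init__(self):
        self.counts = [1, 1]                 # context initialization state

    def prob_one(self):
        return self.counts[1] / sum(self.counts)

    def update(self, bit):
        self.counts[bit] += 1                # adapt to observed bits

def model_probabilities(portions):
    """Re-initialize the context at the start of each individually
    accessible portion, so each portion decodes independently."""
    probs = []
    for bits in portions:
        model = AdaptiveBitModel()           # fresh context per portion
        for bit in bits:
            probs.append(model.prob_one())
            model.update(bit)
    return probs
```

The first probability emitted for every portion is the initial one, regardless of what preceded it in the stream, which is exactly what lets a decoder start mid-stream at a portion boundary.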
19. Apparatus of any of claims 3 to 18, wherein the apparatus is configured
to decode a representation of a neural network (10) from a data stream (45),
wherein the data stream (45) is structured into individually accessible
portions (200), each portion representing a corresponding neural network
portion of the neural network, wherein the apparatus is configured to decode
from the data stream (45), for each of one or more predetermined
individually accessible portions, an identification parameter (310) for
identifying the respective predetermined individually accessible portion.
20. Apparatus of claim 19, wherein the identification parameter (310) is related
to the respective predetermined individually accessible portion via a hash
function or error detection code or error correction code.
21. Apparatus of claim 19 or claim 20, wherein the apparatus is configured to
decode, from the data stream (45), a higher-level identification parameter
(310) for identifying a collection of more than one predetermined
individually accessible portion.
22. Apparatus of claim 21, wherein the higher-level identification parameter
(310) is related to the identification parameters (310) of the more than one
predetermined individually accessible portion via a hash function or error
detection code or error correction code.
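Claims 19 to 22 relate identification parameters (310) to portions via a hash function (or an error detection/correction code). A sketch assuming SHA-256 as the hash, with hypothetical names; the claims do not fix a particular function:

```python
import hashlib

def portion_id(portion: bytes) -> bytes:
    """Identification parameter (310) derived from a portion via a hash."""
    return hashlib.sha256(portion).digest()

def collection_id(ids) -> bytes:
    """Higher-level identification parameter over the per-portion ids
    (cf. claims 21 and 22)."""
    h = hashlib.sha256()
    for i in ids:
        h.update(i)
    return h.digest()

portions = [b"layer-0-weights", b"layer-1-weights"]
ids = [portion_id(p) for p in portions]
top = collection_id(ids)
```

Any change to a portion changes its identification parameter, and through the hierarchy, the higher-level one, so a decoder can detect which part of the model has been altered.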
23. Apparatus of any of claims 3 to 22, wherein the apparatus is configured
to decode a representation of a neural network (10) from a data stream (45),
wherein the data stream (45) is structured into individually accessible
portions (200), each portion representing a corresponding neural network
portion of the neural network, wherein the apparatus is configured to decode
from the data stream (45), for each of one or more predetermined
individually accessible portions, supplemental data (350) for
supplementing the representation of the neural network.
24. Apparatus of claim 23, wherein the data stream (45) indicates the
supplemental data (350) as being dispensable for inference based on the
neural network.
25. Apparatus of claim 23 or claim 24, wherein the apparatus is configured to
decode the supplemental data (350) for supplementing the representation of
the neural network for the one or more predetermined individually
accessible portions (200) from further individually accessible portions,
wherein the data stream (45) comprises for each of the one or more
predetermined individually accessible portions a corresponding further
predetermined individually accessible portion relating to the neural network
portion to which the respective predetermined individually accessible
portion corresponds.
26. Apparatus of any of claims 23 to 25, wherein the supplemental data
(350) relates to
relevance scores of neural network parameters (32), and/or
perturbation robustness of neural network parameters (32).
27. Apparatus of any of claims 3 to 26, for decoding a representation of a
neural network (10) from a data stream (45), wherein the apparatus is
configured to decode from the data stream (45) hierarchical control data
(400) structured into a sequence (410) of control data portions (420),
wherein the control data portions provide information on the neural network
at increasing levels of detail along the sequence of control data portions.
28. Apparatus of claim 27, wherein at least some of the control data portions
(420) provide information on the neural network which is partially
redundant.
29. Apparatus of claim 27 or claim 28, wherein a first control data portion
provides the information on the neural network by way of indicating a
default neural network type implying default settings and a second control
data portion comprises a parameter to indicate each of the default settings.
30. Apparatus for performing an inference using a neural network, comprising
an apparatus for decoding a data stream (45) according to any of claims 3 to
29, so as to derive from the data stream (45) the neural network, and
a processor configured to perform the inference based on the neural network.
31. Method for encoding a representation of a neural network into a data stream
(45), comprising providing the data stream with a serialization parameter
indicating a coding order at which neural network parameters, which define
neuron interconnections of the neural network, are encoded into the data
stream, wherein the method comprises encoding, into the data stream, the
neural network parameters using context-adaptive arithmetic encoding.
32. Method for decoding a representation of a neural network from a data
stream, comprising decoding from the data stream a serialization parameter
indicating a coding order at which neural network parameters, which define
neuron interconnections of the neural network, are encoded into the data
stream, wherein the method comprises decoding, from the data stream, the
neural network parameters using context-adaptive arithmetic decoding.
33. Computer program for, when executed by a computer, causing the computer
to perform the method of claim 31 or claim 32.
| # | Name | Date |
|---|---|---|
| 1 | 202518071107-STATEMENT OF UNDERTAKING (FORM 3) [25-07-2025(online)].pdf | 2025-07-25 |
| 2 | 202518071107-REQUEST FOR EXAMINATION (FORM-18) [25-07-2025(online)].pdf | 2025-07-25 |
| 3 | 202518071107-POWER OF AUTHORITY [25-07-2025(online)].pdf | 2025-07-25 |
| 4 | 202518071107-FORM 18 [25-07-2025(online)].pdf | 2025-07-25 |
| 5 | 202518071107-FORM 1 [25-07-2025(online)].pdf | 2025-07-25 |
| 6 | 202518071107-DRAWINGS [25-07-2025(online)].pdf | 2025-07-25 |
| 7 | 202518071107-DECLARATION OF INVENTORSHIP (FORM 5) [25-07-2025(online)].pdf | 2025-07-25 |
| 8 | 202518071107-COMPLETE SPECIFICATION [25-07-2025(online)].pdf | 2025-07-25 |
| 9 | 202518071107-Proof of Right [13-08-2025(online)].pdf | 2025-08-13 |