Abstract: SHA3 ACCELERATOR FOR HASH-BASED SIGNATURE SCHEMES ABSTRACT Some embodiments are directed to hash-based signature schemes, in particular SHA2 based. Embodiments include private key generation, public key generation, signature generation, and signature verification. For example, a hash function may be applied to form a pseudo-random function (PRF), taking as input various string, wherein the string starts with an opcode, and e.g., a seed. In an embodiment, at least the start of some of the strings have a common start which has at least the length of one input block and/or at least the length of one word of the hash function. Figure 1.
Description:Complete specification: The following specification particularly describes the invention and the manner in which it is to be performed.
Field of the invention:
[0001] The presently disclosed subject matter relates to a method to compute a public key corresponding to a private key of a SHA2-based signature scheme, a method to compute a private key of a SHA2-based signature scheme, a method to compute a signature of a SHA2-based signature scheme, a method to verify a signature of a SHA2-based signature scheme, a system for computing a private key and/or a public key and/or a signature and/or verifying a signature, a computer readable medium.
Background of the invention:
[0002] As quantum computing progresses, there is a strong interest in providing signature schemes that are resistant against attack using quantum computers. Hash-based signature schemes are a candidate for such signature schemes. These include the Winternitz One-Time Signature Plus (WOTS+), and the eXtended Merkle Signature Scheme (XMSS), both described in RFC 8391.
[0003] These signature schemes provide cryptographic digital signatures without relying on the conjectured hardness of mathematical problems. Instead, it is proven that they only rely on the properties of cryptographic hash functions.
[0004] Unfortunately, such signature schemes require the evaluations of large number of hash functions, in case of RFC 8391, of SHA2 hash functions. This makes hash-based signature schemes very compute intensive.
Summary of the invention:
[0005] It would be advantageous to have an improved signature scheme. The hash-based signature schemes of RFC 8391 are based on SHA2. The inventors realized that this choice limits the efficiency of implementation. For example, SHA2 is hard to parallelize. Interestingly, however, the inventors also discovered that some of the hash operations in RFC 8391 for SHA2 are partially redundant. Although the full messages on which SHA2 is executed will typically differ, it will often happen that a partial block or one or two initial blocks are used repeatedly in various algorithms of a signature scheme. In particular, this phenomenon may happen both during private key computation, public key computation, signature generation, and signature verification. Furthermore, the inventors realized that even if a block is only partially redundant, then optimizations can still take place.
[0006] Some embodiments are directed to hash-based signature schemes, in particular SHA2 based ones. Embodiments include private key generation, public key generation, signature generation, and signature verification. For example, a hash function may be applied to form a pseudo-random function (PRF), taking as input various strings, wherein the string starts with an opcode, and e.g., a seed. In an embodiment, at least the start of some of the strings have a common start which has preferably at least the length of one input block or at least the length of one input word of the hash function.
[0007] In an aspect of the invention, a system is configured for one or more of private key computation, public key computation, signature generation, and signature verification, and which is configured to precompute repeating initial partial, and/or full blocks. Such a system is an electronic device, e.g., a computer.
[0008] In an aspect of the invention, methods are provided for one or more of private key computation, public key computation, signature generation, and signature verification, and which precompute repeating initial partial, and/or full blocks. Such a method may be implemented on an electronic device, e.g., on a computer.
[0009] An embodiment of the methods may be implemented on a computer as a computer implemented method, or in dedicated hardware, or in a combination of both. Executable code for an embodiment of the method may be stored on a computer program product. Examples of computer program products include memory devices, optical storage devices, integrated circuits, servers, online software, etc. Preferably, the computer program product comprises non-transitory program code stored on a computer readable medium for performing an embodiment of the method when said program product is executed on a computer.
[0010] In an embodiment, the computer program comprises computer program code adapted to perform all or part of the steps of an embodiment of the method when the computer program is run on a computer. Preferably, the computer program is embodied on a computer readable medium
Brief description of the accompanying drawings:
[0011] Further details, aspects, and embodiments will be described, by way of example only, with reference to the drawings. Elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. In the figures, elements which correspond to elements already described may have the same reference numerals. In the drawings,
[0012] Figure 1a schematically shows an example of an embodiment of a signature system;
[0013] Figure 1b schematically shows an example of an embodiment of a signature system;
[0014] Figure 2a schematically shows an example of an embodiment of a SHA2 computation;
[0015] Figure 2b schematically shows an example of an embodiment of a public key computation;
[0016] Figure 3 schematically shows an example of an embodiment of a multi-use public key computation;
[0017] Figure 4a schematically shows an example of an embodiment of a method to compute a public key corresponding to a private key of a SHA2-based signature scheme;
[0018] Figure 4b schematically shows an example of an embodiment of a method to compute a public key corresponding to a private key in a SHA2-based Merkle Signature Scheme,
[0019] Figure 5a schematically shows a computer readable medium having a writable part comprising a computer program according to an embodiment; and
[0020] Figure 5b schematically shows a representation of a processor system according to an embodiment.
Reference Signs List
[0021] The following list of references and abbreviations corresponds to figures 1a-3, 5a, 5b, and is provided for facilitating the interpretation of the drawings and shall not be construed as limiting the claims.
100, 101 a signature system
110 a key generation device
120 a signing device
130 a verification device
111 a processor system
112 storage
113 a communication interface
121 a processor system
122 a storage
123 a communication interface
131 a processor system
132 a storage
133 a communication interface
172 a computer network
200 a public key computation
210 private key elements
220 chains
230 public key elements
300 a multi-use public key computation
301 multiple number of use-once public keys
302 L-tree
303 Hash tree
304 a Merkle tree root
1000, 1001 a computer readable medium
1010 a writable part
1020 a computer program
1110 integrated circuit(s)
1120 a processing unit
1122 a memory
1124 a dedicated integrated circuit
1126 a communication element
1130 an interconnect
1140 a processor system
Detailed description of the embodiments:
[0022] While the presently disclosed subject matter is susceptible of embodiment in many different forms, there are shown in the drawings and will herein be described in detail one or more specific embodiments, with the understanding that the present disclosure is to be considered as exemplary of the principles of the presently disclosed subject matter and not intended to limit it to the specific embodiments shown and described.
[0023] In the following, for the sake of understanding, elements of embodiments are described in operation. However, it will be apparent that the respective elements are arranged to perform the functions being described as performed by them.
[0024] Further, the subject matter that is presently disclosed is not limited to the embodiments only, but also includes every other combination of features described herein or recited in mutually different dependent claims.
[0025] Figure 1a schematically shows an example of an embodiment of a signature system 100. Signature system 100 comprises a key generation device 110, signing device 120, and verification device 130.
[0026] Key generation device 110 is configured to generate a private key and a corresponding public key for a hash-based signature scheme. Unlike most other signature systems, hash-based signatures can so far withstand known attacks using quantum computers. An unfortunate disadvantage of hash-based signatures is that they are very compute intensive, in particular, requiring a large amount of hash computations.
[0027] Signing device 120 is configured to generate a signature for a message using the private key generated by key generation device 110. Verification device 130 is configured to verify a signature for a message, using the public key generated by key generation device 110, the signature being generated by signing device 120 using the corresponding private key.
[0028] The signature system may be generated for any of the signature schemes defined in RFC 8391; we refer to the May 2018 version of RFC 8391, which is included herein by reference. We refer, in particular, to the passages cited herein. Other versions of RFC 8391 may be used instead though. RFC 8391 may be combined with various hash functions. In embodiments, a hash function of the SHA2 family is used, in particular, the functions SHA-512 and SHA-384, e.g., as described in the “Secure Hash Standard (SHS)”, FIPS PUB 180-4.
[0029] In particular, the signature system may be configured to support the one-time signature scheme Winternitz One-Time Signature Plus (WOTS+). The signature system may be configured to support the eXtended Merkle Signature Scheme (XMSS), and XMSS^MT, a multi-tree variant of XMSS. Both XMSS and XMSS^MT use WOTS+ as a main building block. RFC 8391 distinguishes two security levels. Parameters with n = 32 provide a classical security level of 256 bits. Parameters with n = 64 provide a classical security level of 512 bits, see RFC 8391 section 4.2.7. In embodiments, SHA-384 is used for the n = 48 case, to provide a classical security level of 384 bits, while SHA-512 is used for the n=64 case. Note that RFC 8391 refers only to n=32 or n=64. However, SHA-384 is for n = 48 bytes = 384 bits. Although, this variant is not explicitly mentioned in the RFC document, it is a part of the reference implementation (included herein by reference).
[0030] In signature system 100 as shown, key generation device 110 generates a private key and a corresponding public key. This could be a one-time signature, or a multi-use signature.
[0031] In an embodiment, key generation device 110 and signing device 120 are integrated into a single device. This integration simplifies the architecture of the signature system 100 and reduces the need for secure communication channels between different devices for key transfers. Here, the private key, once generated, remains within the device, and is used by the signing device to create a signature.
[0032] The verification device 130, in this setup, receives the public key from the key generation device 110 through a trusted channel. The channel could be a secure digital pathway, such as an encrypted communication link, or a physical transfer of the public key through secure means. The public key can also be obtained from a trusted public key repository.
[0033] In an embodiment, key generation device 110 and signing device 120 are distinct devices. After the key generation device 110 creates the private and public keys, the private key is transferred to the signing device 120, which will use it to sign messages. This separation provides computational advantages, particularly in terms of resource allocation and efficiency. For instance, the key generation device 110 could be a more powerful system, such as a desktop computer, equipped to handle the resource-intensive process of generating keys. Signing device 120 could be a device with limited resources, like a smartphone.
[0034] Signing device 120 obtains the private key from key generation device 110. This may be done, for example, by a direct physical link. The public key may also be transferred to the signing device so that it may be further distributed from there.
[0035] Verification device 130 can obtain the public key either from the key generation device 110 through a trusted channel or from the signing device 120. The latter option can be advantageous in systems where the public key is bound to a signature or a certificate, making the verification process more streamlined.
[0036] One significant advantage of this setup is the reduced computational load on the signing device 120. Since key generation, especially in hash-based systems like XMSS or WOTS+, is resource-intensive due to the heavy computational requirements of hash computations, separating these processes allows the signing device 120 to be optimized for less resource-intensive tasks. This separation can be particularly beneficial in environments where the signing process needs to be quick and efficient, such as in high-traffic communication systems or devices with limited processing capabilities.
[0037] In both embodiments, the public key may be uploaded to a public key repository.
[0038] For example, the system 100 may be used to securely authenticate digital transactions in online banking, where the key generation device 110 creates a unique key for a user, the signing device 120 ensures the integrity of transactional data, and the verification device 130 validates the transaction's authenticity. Additionally, the system can be implemented in secure email communications, where emails are signed and verified for authenticity, ensuring that the content has not been tampered with. In the healthcare sector, the system could be used to protect sensitive patient data, with each piece of data being signed and verified to ensure confidentiality and compliance with regulations. Furthermore, the system could be applied in blockchain technologies to provide enhanced security for cryptocurrency transactions. The system can also be used in software distribution, ensuring that software updates are authentic and have not been altered. For example, locally installed software may have a public key according to an embodiment and use it to verify signed software updates.
[0039] Key generation device 110 may comprise a processor system 111, a storage 112, and a communication interface 113. Signing device 120 may comprise a processor system 121, a storage 122, and a communication interface 123. Verification device 130 may comprise a processor system 131, a storage 132, and a communication interface 133.
[0040] In the various embodiments of communication interfaces 113, 123, and/or 133, the communication interfaces may be selected from various alternatives. For example, the interface may be a network interface to a local or wide area network, e.g., the Internet, a storage interface to an internal or external data storage, an application interface (API), etc.
[0041] Storage 112, 122, and 132 may be, e.g., electronic storage, magnetic storage, etc. The storage may comprise local storage, e.g., a local hard drive or electronic memory. Storage 112, 122, and 132 may comprise non-local storage, e.g., cloud storage. In the latter case, storage 112, 122, and 132 may comprise a storage interface to the non-local storage. Storage may comprise multiple discrete sub-storages together making up storage 112, 122, and 132.
[0042] Storage 112, 122, and 132 may be non-transitory storage. For example, storage 112, 122, and 132 may store data in the presence of power such as a volatile memory device, e.g., a Random Access Memory (RAM). For example, storage 112, 122, and 132 may store data in the presence of power as well as outside the presence of power such as a non-volatile memory device, e.g., a Flash memory. Storage may comprise a volatile writable part, say a RAM, and a non-volatile writable part, e.g., Flash. Storage may comprise a non-volatile non-writable part, e.g., ROM.
[0043] The devices 110, 120, and 130 may communicate internally, with each other, with other devices, external storage, input devices, output devices, and/or one or more sensors over a computer network. The computer network may be an internet, an intranet, a LAN, a WLAN, a WAN, etc. The computer network may be the Internet. The devices 110, 120, and 130 may comprise a connection interface which is arranged to communicate within system 100 or outside of system 100 as needed. For example, the connection interface may comprise a connector, e.g., a wired connector, e.g., an Ethernet connector, an optical connector, etc., or a wireless connector, e.g., an antenna, e.g., a Wi-Fi, 4G or 5G antenna.
[0044] The communication interface 113 may be used to send or receive digital data, e.g., to send a public key to device 130, or to send a private key to device 120. The communication interface 123 may be used to send or receive digital data, e.g., to receive message and private key for signing, or to send a signature to device 130. The communication interface 133 may be used to send or receive digital data, e.g., to receive a public key from device 110 or 120, a signed message from device 120, etc.
[0045] Key generation device 110, signing device 120, and verification device 130 may have a user interface, which may include well-known elements such as one or more buttons, a keyboard, display, touch screen, etc. The user interface may be arranged for accommodating user interaction for performing key generation, message signing, message verification.
[0046] The execution of devices 110, 120, and 130 may be implemented in a processor system. The devices 110, 120, and 130 may comprise functional units to implement aspects of embodiments. The functional units may be part of the processor system. For example, functional units shown herein may be wholly or partially implemented in computer instructions that are stored in a storage of the device and executable by the processor system.
[0047] The processor system may comprise one or more processor circuits, e.g., microprocessors, CPUs, GPUs, etc. Devices 110, 120, and 130 may comprise multiple processors. A processor circuit may be implemented in a distributed fashion, e.g., as multiple sub-processor circuits. For example, devices 110, 120, and 130 may use cloud computing.
[0048] Typically, the key generation device 110, signing device 120, and verification device 130 each comprise a microprocessor which executes appropriate software stored at the device; for example, that software may have been downloaded and/or stored in a corresponding memory, e.g., a volatile memory such as RAM and/or a non-volatile memory such as Flash.
[0049] Instead of using software to implement a function, the devices 110, 120, and/or 130 may, in whole or in part, be implemented in programmable logic, e.g., as field-programmable gate array (FPGA). The devices may be implemented, in whole or in part, as a so-called application-specific integrated circuit (ASIC), e.g., an integrated circuit (IC) customized for their particular use. For example, the circuits may be implemented in CMOS, e.g., using a hardware description language such as Verilog, VHDL, etc. In particular, key generation device 110, signing device 120, and verification device 130 may comprise circuits, e.g., for cryptographic processing, and/or arithmetic processing.
[0050] In hybrid embodiments, functional units are implemented partially in hardware, e.g., as coprocessors, e.g., cryptographic coprocessors, and partially in software stored and executed on the device. In particular, the cryptographic coprocessors may be configured for SHA2 hash computations.
[0051] Figure 1b schematically shows an example of an embodiment of a signature system 101. System 101 may comprise multiple key generation devices; shown is key generation device 110. System 101 may comprise multiple signing devices; shown is signing device 120. System 101 may comprise multiple verification devices; shown is verification device 130. The devices are connected through a computer network 172, e.g., the Internet. The key generation device 110, signing device 120, and verification device 130 may be according to an embodiment.
[0052] SHA-2: Figure 2a schematically shows an example of an embodiment of a SHA2 computation, in particular the functions SHA-512 and SHA-384. We refer to the standard “Secure Hash Standard (SHS)”, FIPS PUB 180-4, for full details. We also refer to the page “SHA-2” at Wikipedia
[0053] SHA2, a family of cryptographic hash functions, comprises multiple variants, including SHA-384 and SHA-512. Each algorithm can be described in two stages: preprocessing and hash computation. Preprocessing involves padding a message, parsing the padded message into m-bit blocks, and setting initialization values to be used in the hash computation. In case of SHA-384 and SHA-512 the block length m is 1024 bits or 128 bytes.
[0054] The hash computation generates a message schedule from a message block and uses that schedule to iteratively generate a series of intermediate hash values. The final intermediate hash value generated by the hash computation is used to determine the message digest. The intermediate hash in SHA-384 and SHA-512 is 8 words of each 64 bits or 8 bytes, the message schedule has 80 words of 64 bits or 8 bytes each. The initial 16 words of the message schedule are the same as the 16 words (of 64 bits or 8 bytes each) of the message block (of 1024 bits or 128 bytes). The intermediate hash values are generated iteratively by updating an initial intermediate hash value for each word in the message schedule.
[0055] Computing a hash function works as follows.
501: Pad the input message, yielding a padded bit string P, with a length divisible by m=1024 "bits"=128 bytes. Initialize the intermediate hash h_0,...,h_7 to predetermined values.
502: Loop over consecutive m-bit message blocks P0,P1,… of P.
503: Set working variables to the intermediate hash value.
504: For each message block, loop over 80 rounds, i=0,?,79
505: Extend the message schedule w_i; for rounds i=0,?,15, w_i is a next word in the message block, for rounds i=16,?,79, a message schedule extension function is applied to obtain w_i.
506: Update the working variables by applying a compression function to w_i and the working variables.
507: At the end of the loop over the 80 rounds, the working variables are added to the intermediate hash value.
508: The final hash value is obtained from the final intermediate hash value.
[0056] Below is the SHA2 function for SHA-384 and SHA-512 in pseudo code.
Pad message and break message into 1024-bit blocks; initialize the intermediate hash values h0, h1, …, h7.
[0057] For each message block:
Create an 80-entry message schedule array w[0, …, 79] of 64-bit words.
Copy message block into first 16 words w[0, …, 15] of a message schedule array.
Extend the first 16 words into the remaining 64 words w[16, …, 79] of the message schedule array:
For i from 16 to 79:
Apply message schedule extension function to previous 15 words, w[i-16], …, w[i-2] to obtain next word w[i].
Initialize working variables to current intermediate hash value:
a := h0
b := h1
c := h2
d := h3
e := h4
f := h5
g := h6
h := h7
[0058] Compression function main loop:
For i from 0 to 79:
Apply compression function to w[i] and working variables, and update working variables.
Add the working variables to the current intermediate hash value:
h0 := h0 + a
h1 := h1 + b
h2 := h2 + c
h3 := h3 + d
h4 := h4 + e
h5 := h5 + f
h6 := h6 + g
h7 := h7 + h
[0059] Produce the final hash value:
For SHA-512:
digest := hash := h0 append h1 append h2 append h3 append h4 append h5 append h6 append h7
For SHA-384:
digest := hash := h0 append h1 append h2 append h3 append h4 append h5
[0060] For the SHA2 variants SHA-384 and SHA-512, in padding a 1 bit is appended, followed by a variable number of 0 bits, and finally, the length of the message as a 128-bit number. The length of the message after padding is a multiple of 128 bytes, or 1024 bits.
[0061] A (cryptographic) digital signature scheme provides asymmetric message authentication. The key generation algorithm produces a key pair consisting of a private and a public key. A message is signed using a private key to produce a signature. A message/signature pair can be verified using a public key. A One-Time Signature (OTS) scheme allows using a key pair to sign exactly one message securely. A Many-Time Signature (MTS) system can be used to sign multiple messages.
[0062] We will first discuss private key generation for a one-time signature.
[0063] Private key generation: For private key generation we refer to RFC 8391, in particular WOTS+ Private key generation. According to RFC 8391, the private key in WOTS+, denoted by sk (s for secret), is a length len array of n-byte strings. Herein, the parameter n refers to the message length that the signature scheme supports, and len is derived from the length n of the message, the size of message elements msg[i] as defined by the Winternitz parameter w, and the size of a checksum that may be needed. Accordingly, sk is an array of strings.
[0064] If one is willing to depart from the RFC 8391, then various variations are possible. For example, the size of individual elements could be more or less than n-bytes, depending on the security. The length of len could be more or less depending on whether the RFC 8391 checksum is used or another checksum algorithm.
[0065] Section 3.1.3 of RFC 8391 provides two ways to generate a private key. The first option is described in Algorithm 3, and initializes the elements sk[i] with uniformly random strings. As no hash computation is done, there is no scope to improve on hash computations. Also, it incurs the additional cost of storing the private key sk, until used. Nevertheless, the computation of other values, in particular, the public key may benefit from an embodiment.
[0066] The second option according to the RFC may follow section 3.1.7 of RFC 8391. Here the n-byte value of the private key is computed as sk[i] = PRF(S, toByte(i, 32)), wherein, S is a secret seed. Expanding the PRF according to section 5.1 introduces an OPCODE_3.
[0067] For n=48, and SHA-384, we have the SHA2 application to toByte(3, 4)?KEY?M, which in this case evaluates to toByte(3, 4)?S?toByte(i, 32). The toByte(i, 32) functions as the address. Here S is a secret seed used to generate the private key. Assuming a 48-byte secret seed S, this means that the first 52 bytes are constant. Furthermore, the first 30 bytes of the address, that is of the toByte(i, 32) part, is also constant (assuming the largest value of i to be 216-1=65535), giving a total of 82 bytes. This means that the first 10 rounds of SHA-384 can be precomputed for the first 80-byte constant string.
0068] For n=64, and SHA-512, we have the SHA2 application to toByte(3, 64)?KEY?M, which in this case evaluates to toByte(3, 64)?S?toByte(i, 32). The toByte(i, 32) functions as the address. Here S is a secret seed used to generate the private key. Assuming a 64-byte secret seed S, this means that the first 128 bytes are constant. Furthermore, the first 30 bytes of the address, that is of the toByte(i, 32) part, is also constant (assuming the largest value of i to be 216-1=65535), giving a total of 158 bytes, that is the first 83 rounds (80 rounds of one full SHA-512 block of 128 bytes, and 3 rounds of the next block of 24 bytes) of SHA-512 can be precomputed for the first 152-byte constant string
[0069] We refer to a constant value used at the beginning of a SHA2 computation as an opcode. In this case the opcode is toByte(3, 4) for SHA-384, and toByte(3, 64) for SHA-512. For example, toByte(3, 4) or toByte(3, 64) is referred to as OPCODE_3 or OPCODE for short. Variant schemes, that do not adhere to RFC 8391 may take any random value for the OPCODE. The secret seed (e.g., S) is also constant for each element in the private key, but it is only constant for this particular private key.
[0070] Another option, which is used in practice, and which is also used in the reference implementation of the RFC, to generate a private key is to generate the sequence of random values by repeatedly applying a PRF_KGen function. The PRF_KGen function comprises applying a SHA2 hash to a string starting with a concatenation of: an opcode (e.g., OPCODE_4), and a secret seed (e.g., SK_SEED), the string further comprising the public seed (e.g., SEED), and an address (e.g., ADRS) in the private key (sk). For example, opcode (e.g., OPCODE_4), secret seed (e.g., SK_SEED), and public seed (e.g., SEED) are longer than an input block size of the SHA2 hash function (e.g., SHA-512). Furthermore, the first 16 bytes of the address (e.g., ADRS) are also constant for the OTS Hash Addresses for private key computation.
[0071] For example, in this option to generate the private key is to apply SHA2 to the concatenation of OPCODE_4?SK_SEED?SEED?ADRS. Other elements may be added, e.g., padding.
[0072] Here SK_SEED is the same secret seed S as above, but SEED is a public seed. The public seed is used to generate the public key, and including it in the generation of the private key links the two keys to each other. ADRS is an index into the private key. For example, it increases for each i in sk[i].
[0073] For n=64, and SHA-512, the size of each of OPCODE_4, SK_SEED, and SEED is 64 bytes each, bringing the number of constant bytes up to 64+64+64+16 = 208. The 208 constant bytes comprise 128 bytes for a full first input block, that is 80 rounds, and 80 further bytes of a second block, that is 10 further rounds that can be precomputed.
[0074] In particular, in an embodiment for SHA-512:
For the PRF_KGen() Function of gen_sk() Function, we precompute:
H_sk = SHA-512H0(OPCODE_4?SK_SEED?SEED?ADRS[1..16]). – Computed Only Once.
The cost of precomputation is 90 rounds; 80 rounds of one full first block, and 10 rounds of the second block.
Then, for each private key sk[i], we compute:
sk[i] = SHA-512H_sk(ADRS[17..32]?pad).
The cost of computation is 70 rounds, since out of a total of 160 rounds, 90 rounds have been consumed in the pre-computation part. So, the remaining 160-90=70 rounds are a part of the final computation.
[0075] For n=48, and SHA-384, the size of OPCODE_4 is 4 bytes, and the size of each of SK_SEED and SEED is 48 bytes. Furthermore, the first 12 bytes of ADRS are constant. This brings the number of constant bytes up to 4+48+48+12 = 112. The 112 constant bytes are shorter than the 128 bytes input block size of SHA-384. Nevertheless, since 112=8*14, the first 14 rounds of SHA-384 can be precomputed. Furthermore, the next 8 bytes of ADRS are constant for a WOTS+ key. Since these 8 bytes form a constant 8-byte word, the next 1 round of SHA-384 can be precomputed for a WOTS+ key.
[0076] In particular, in an embodiment for SHA-384:
For the PRF_KGen() Function of gen_sk() Function, we precompute:
H_prf-kgen = SHA-384H0(OPCODE_4?SK_SEED?SEED?ADRS[1..12]). – Computed Only Once.
The cost of precomputation is 14 rounds.
For each WOTS+ key, we precompute:
H_sk = SHA-384H_prf-kgen(ADRS[13..20]). – Computed for Each WOTS+ key Only Once.
The cost of precomputation is 1 round.
Then, for each private key sk[i], we compute:
sk[i] = SHA-384H_sk(ADRS[21..32]?pad).
The cost of computation is 145 rounds, since out of a total of 160 rounds, 14+1=15 rounds have been consumed in the pre-computation part. So, the remaining 160-15=145 rounds are a part of the final computation.
[0077] In the above, we denote with a subscript the precomputed state of the hash function, e.g., H0 for the predetermined initial intermediate hash values as set in the standard.
[0078] In an embodiment, at least a first block and/or at least a partial block of the string is constant during the computing of the private key. That is, for each element of the private key, the string on which SHA2 is applied starts with the same block and/or with the same partial block. A block has the size of an input block of the used SHA2 algorithm, also referred to as m, e.g., block size is 128 bytes for SHA-384 and SHA-512. A partial block has the size of at least a word of the used SHA2 algorithm, e.g., word size is 8 bytes for SHA-384 and SHA-512. For example, the following algorithm may be used:
Apply the round function of SHA2 on the constant at least first block(s) and/or partial block(s). Store the state H’ of the SHA2 function.
For each element of the private key,
Initialize the SHA2 function from stored state H’.
Apply the round function on the words formed by the remaining non-constant block(s) and/or word(s).
Produce the final hash value to obtain the private key element.
[0079] Instead of computing a single private key, the algorithm may be adapted to generate a string of private keys, also by repeatedly running the PRF, but in this case repeating it for longer. This is useful for generating a multi-use private key.
[0080] In an embodiment, OPCODE_4, SK_SEED, SEED and part of the address are together at least a partial block, e.g., SHA-384, or at least a full block and a partial block of the SHA2 hash function, e.g., SHA-512.
[0081] Public key generation: Figure 2b schematically shows an example of an embodiment of a public key computation 200.
[0082] Given a private key, e.g., generated as above, a corresponding public key is generated. The private key comprises a sequence of random values, e.g., sk[0], sk[1], sk[2], …, sk[len-1]. A pseudorandom value is considered a random value herein.
[0083] Shown in figure 2b at 210 are the private key elements generated by the private key generation. Each value, e.g., of length n bytes, is shown as a small circle. For each value of sk[i], a chain is computed. For each element of the private key, e.g., for each sk[i], a hash chain is generated. The chains are shown at 220. To generate a chain, a chain function is repeatedly applied, each time to a previous value in the chain, starting with the private key element. For example, if x=sk[i], then the chain may be computed as chain(x), chain(chain(x)), chain(chain(chain(x))), etc., up to the chain length. The chain length may be determined by the Winternitz constant. The resulting public key is indicated at 230.
[0084] In an embodiment, the chain function may be computed as follows by:
- applying a first pseudo-random function (PRF) to generate a key (e.g., KEY),
- applying a second PRF to generate a bitmask (e.g., BM),
- XOR-ing the bitmask (e.g., BM) with the previous value (e.g., tmp) in the chain, and
- applying a SHA2 hash function to a concatenation of an opcode (e.g., OPCODE_0), the key (e.g., KEY), and the XOR result (e.g., tmp XOR BM).
[0085] The string to which the first and second PRF are applied may comprise in order a concatenation of: an opcode (e.g., OPCODE_3), a public seed (e.g., SEED), an address (e.g., ADRS), and padding. The results of the two PRFs are combined to compute the output of the chain function.
[0086] The public key is then derived from the final value for each of the chains. Interestingly, in an embodiment, at least a first block and/or at least a partial block of the string is constant during the computing of the chains, e.g., during application of the first and second PRF. This can be used to precompute the round function on the constant at least a first block and/or at least a partial block.
[0087] For example, in an embodiment generating a public key (e.g., pk) corresponding to a private key (e.g., sk), may use algorithm 4 of RFC 8391. According to this algorithm, a chain function is applied to each element of the private key. The chain function may be implemented as in Algorithm 2 of RFC 8391. The latter contains the following three function calls:
KEY = PRF(SEED, ADRS);
BM = PRF(SEED, ADRS);
tmp = F(KEY, tmp XOR BM);
[0088] For n=64, and SHA-512, the first and second PRF computations may be computed following section 5.1 by applying SHA-512 on: toByte(3, 64)?SEED?ADRS. We will refer to toByte(3, 64) as OPCODE_3. The two calls are distinguished because the ADRS field which indicates the chain and location in the chain, also has a separate field for distinguishing between the calls for key and bitmask, and is computed differently in the two cases. First of all, OPCODE_3?SEED is constant, that is the first 128 bytes are constant throughout the public key computation. Accordingly, the first block can be precomputed. For the ADRS field, the first 8 bytes (e.g., ADRS[1..8]) are constant in any case throughout the public key computation. The second 8 bytes (e.g., ADRS[9..16]) are constant throughout the computation of one-time public keys. The third 8 bytes (e.g., ADRS[17..24]) are constant for a given chain of the WOTS+ key. Accordingly, the first two rounds can be precomputed for second block in a public key computation. While the third block can be precomputed for each chain in the public key.
[0089] For example, the address ADRS may be the 32-byte OTS hash address as specified in Section 2.5 of RFC 8391. The OTS hash address may have the fields: layer address (32 bits), tree address (64 bits), type = 0 (32 bits), OTS address (32 bits), chain address (32 bits), hash address (32 bits), and keyAndMask (32 bits). In an embodiment, the fields layer address, tree address, and type, are constant for the entire one-time public key computation.
[0090] Choosing the SHA2 function SHA-512 for this public key computation yields one full constant block throughout the computation, of 128 bytes, and a partial block, of 24 bytes. Precomputing this full block, e.g., the 80 rounds, and the partial block, e.g., the 3 rounds, and storing the SHA2 state H_chain, saves 83 SHA2 rounds for each PRF computation in the chains.
[0091] In particular, in an embodiment for SHA-512:
For the PRF() Function of chain() Function, we precompute:
H_prf = SHA-512H0(OPCODE_3?SEED?ADRS[1..8]). – Computed Only Once.
The cost of precomputation is 81 rounds.
The value of H_prf is common for the chain() function and rand_hash() function, described later.
The OPCODE_3 is Common for these two functions.
The SEED is Constant throughout the scheme.
The First 8 Bytes of ADRS are Zero for the XMSS scheme.
For the PRF() Function of chain() Function, we precompute:
H_ots = SHA-512H_prf(ADRS[9..16]). – Computed for OTS Only Once.
The cost of precomputation is 1 round.
For each chain, we precompute:
H_chain = SHA-512H_ots(ADRS[17..24]). – Computed for each WOTS+ chain Only Once.
The cost of precomputation is 1 round.
Then, for each step, we compute:
val = SHA-512H_chain(ADRS[25..32]?pad).
The cost of computation is 77 rounds, since out of a total of 160 rounds, 81+1+1=83 rounds have been consumed in the pre-computation part. So, the remaining 160-83=77 rounds are a part of the final computation.
[0092] For n=64, and SHA-512, the function F is defined in Algorithm 2 of RFC 8391 and may be computed following section 5.1 by applying SHA-512 on: toByte(0, 64)?KEY?(tmp XOR BM). We will refer to toByte(0, 64) as OPCODE_0. First of all OPCODE_0 is constant, that is the first 64 bytes are constant throughout the public key computation. Accordingly, the first eight rounds can be precomputed for the first partial block in a public key computation. Precomputing this partial block, e.g., the 8 rounds, and storing the SHA2 state H_f, saves 8 SHA2 rounds for each F computation in the chains.
[0093] In particular, in an embodiment for SHA-512:
For the F() Function of chain() Function, we precompute:
H_f = SHA-512H0(OPCODE_0). – Computed Only Once.
The cost of precomputation is 8 rounds.
Then, for each step, we compute:
val = SHA-512H_f(KEY?(tmp XOR BM)?pad).
The cost of computation is 152 rounds, since out of a total of 160 rounds, 8 rounds have been consumed in the pre-computation part. So, the remaining 160-8=152 rounds are a part of the final computation.
[0094] For n=48, and SHA-384, we have in an embodiment:
For the PRF() Function of chain() Function, we precompute:
H_prf = SHA-384H0(OPCODE_3?SEED?ADRS[1..12]). – Computed Only Once.
Herein, OPCODE_3 is a 4-byte value and SEED is a 48-byte value.
Note the first 12 Bytes of ADRS are Zero for the XMSS scheme.
The cost of precomputation is 8 rounds.
The value of H_prf is common for the chain() function and rand_hash() function, described later.
For each WOTS+ key, we precompute:
H_wots = SHA-384H_prf(ADRS[13..20]). – Computed for each WOTS+ key Only Once.
The cost of precomputation is 1 round.
For each invocation of chain() Function, we precompute:
H_step = SHA-384H_wots(ADRS[21..28]). – Computed for each chain() Function Only Once.
The cost of precomputation is 1 round.
Then, for each step, we compute:
val = SHA-384H_step(ADRS[29..32]?pad).
The cost of computation is 70 rounds, since out of a total of 80 rounds, 8+1+1=10 rounds have been consumed in the pre-computation part. So, the remaining 80-10=70 rounds are a part of the final computation.
[0096] Signature generation: A private key (e.g., sk) corresponding to a public key (e.g., pk) may be used to generate a signature (e.g., sig). To compute a signature for a message comprising a sequence of values, one may compute as follows for each value in the sequence of values in the message.
[0097] Repeatedly the chain function is applied to the corresponding value in the private key (e.g., sk[i]) as many times as indicated by the corresponding value in the message (e.g., msg[i]). The signature comprising the final computed value in the chain for each private key value. The same chain function is used for signing a message as for generating the public key (e.g., pk). Accordingly, the same advantages due to constant initial block(s) and/or partial block(s) apply to the signing of a message as well. In an embodiment, at least a first block and/or at least a partial block of the string hashed in the chain function is constant during the computing of the chains. This constant block(s) and/or partial block(s) is precomputed for the round function of SHA2 function.
[0098] In an embodiment, a checksum (e.g., csum) may be included in the message to ensure that the signature cannot be tampered with. A simple checksum is shown in RFC 8391, which is appended to the message. Other types of checksums could be used.
[0099] For example, generating a signature from a private key may use algorithm 5 of RFC 8391. The core part of this function is:
for ( i = 0; i < len; i++ ) {
ADRS.setChainAddress(i);
sig[i] = chain(sk[i], 0, msg[i], SEED, ADRS);
}
Note that the third parameter msg[i] indicates the number of iterations, which for a public key computation would be w-1 (w being the Winternitz parameter, e.g., the maximum value of msg[i]+1).
[0100] A different number of iterations is possible. For example, a permutation function p may be applied to msg[i], e.g., p(msg[i]), which may then be used as the number of iterations in both the signing and verifying algorithm. This would provide a valid signature scheme—though not conforming to RFC 8391.
[0101] Signature verification: The signature (e.g., sig) may be verified using the public key (e.g., pk). For each value in the sequence of values in the signature (e.g., sig[i]), a temporary public key value (e.g., tmp_pk[i]) is computed by repeatedly applying the chain function to each value in the signature (e.g., sig[i]) as many times as indicated by the difference between a predetermined maximum value (e.g., w-1) and the corresponding value in the message (e.g., msg[i]). The value derived from the temporary public key values is compared to the public key to determine the validity of the signature. If the temporary public key is equal to the public key, the signature is considered valid.
[0102] For example, this may use algorithm 6 in RFC 8391. The core part of this algorithm is:
for ( i = 0; i < len; i++ ) {
ADRS.setChainAddress(i);
tmp_pk[i] = chain(sig[i], msg[i], w-1–msg[i], SEED, ADRS);
}
Note that the signature verification should include the same checksum (e.g., csum) included during signature generation.
[0103] Note that the third parameter w-1-msg[i] indicates the number of iterations, which for a public key computation would be w-1. Note that the sum of the number of iterations done during signature generation for a message value (e.g., msg[i]) plus the number of iterations done during signature verification (e.g., w-1-msg[i]) equals the number of iterations done during public key generation (e.g., w-1). This principle may be extended to create an arbitrary number of different signature/verification schemes.
[0104] Both signature generation and verification use the chain function, which for the PRF() function includes a constant first block (e.g., 128 bytes) and a partial block (e.g., 24 bytes) for SHA-512, and includes a constant partial block (e.g., 80 bytes) for SHA-384, and which for the F() function includes a partial block (e.g., 64 bytes) for SHA-512, and so benefits from precomputation.
[0105] The algorithms have been described in the context of use-once signature schemes. This is not necessary though. An XMSS signature scheme combines many private keys into one public key, thus saving space, though at the expense of additional computation.
[0106] Figure 3 schematically shows an example of an embodiment of a multi-use public key computation 300. Figure 3 shows a tree. At 301 there are a number of use-once public keys according to an embodiment. At the right of figure 3, the bottom of layer 301, are the elements of the private keys (e.g., sk) 210. Above the private key elements are their corresponding chains 220, and the top of layer 301, are the elements of the public keys (e.g., pk) 230. At 302 and 303 the public key values are combined according to a Merkle tree, culminating in a Merkle tree root 304. The part 302 of the trees that correspond to exactly one use-once public key is called the L-tree. The part 303 is called the Hash tree. The L-trees and Hash tree form a Merkle tree. Accordingly, the signature scheme may be called a Merkle Signature Scheme.
[0107] To compute the tree for multiple private keys (e.g., sk0, sk1,…, sk2^h-1 also referred to as the WOTS+ private keys), first the corresponding public keys (e.g., pk0, pk1,…, pk2^h-1 also referred to as the WOTS+ public keys) are computed. This may use an embodiment as described herein, e.g., with respect to figure 2b.
[0108] Next, a Merkle tree of hash values is computed, starting from the values in the public keys, that is the final values in the chains computed from the private key values sk[i]. A Merkle tree is computed by repeatedly computing a hash value on a next level of the tree by applying a tree function to two hash values on a lower level of the tree.
[0109] In an embodiment, the tree function may be computed as follows by:
- applying a fourth pseudo-random function (PRF) to generate a key (e.g., KEY),
- applying a fifth PRF to generate a first bitmask (e.g., BM_0),
- applying a sixth PRF to generate a second bitmask (e.g., BM_1),
- XOR-ing the first bitmask (e.g., BM_0) with a first value (e.g., LEFT) on the lower level of the tree,
- XOR-ing the second bitmask (e.g., BM_1) with a second value (e.g., RIGHT) on the lower level of the tree,
- applying a SHA2 hash function to a concatenation of an opcode (e.g., OPCODE_1), the key (e.g., KEY), the first XOR result (e.g., LEFT XOR BM_0), and the second XOR result (e.g., RIGHT XOR BM_1).
[0110] The fourth, fifth and sixth PRF comprise applying a SHA2 hash to a string starting with an opcode (e.g., OPCODE_3), a public seed (e.g., SEED), and an address (e.g., ADRS). For example, the string may comprise in order a concatenation of: an opcode (e.g., OPCODE_3), a public seed (e.g., SEED), an address (e.g., ADRS), and padding.
[0111] As the opcode and public seed for n=64 and SHA-512, are each 64 bytes, the first 128 bytes of the PRFs are constant throughout the tree computation. Of the address, the first 8 bytes are constant as well. This means that in total 136 bytes are constant. This means that for SHA-512, which has an input block size m=128 bytes, the first block is constant.
[0112] The key (e.g., KEY), first bitmask (e.g., BM_0), and second bitmask (e.g., BM_1) are then used to combine two values (e.g., LEFT and RIGHT) on a lower level of the tree to obtain a value on the next higher level of the tree. For example, as indicated above in the final three bullet points.
[0113] The multi-use public key (e.g., PK) may be derived from the root value of the Merkle tree. The rest of the tree as shown in figure 3, need not be part of the public key. Also, in the computations for the Merkle tree, in particular, when applying the fourth PRF, fifth PRF and sixth PRF may start with at least a first constant block and/or at least a partial constant block, which may be precomputed.
[0114] An example of an XMSS key generation, which is an example of such a tree as above, is given in RFC 8391, algorithm 10, calling algorithm 9. Algorithm 9, the TREEHASH algorithm, computing the root node 304 of the Hash Tree 303 of given height h, in turn calls algorithm 8 for the computation of root values of the L-Trees 302, and calls algorithm 7, the RAND_HASH algorithm, for the computation of internal nodes of the Hash Tree 303. Algorithm 8, the LTREE algorithm, computing the root node of the L-Tree 302, in turn calls the algorithm 7, the RAND_HASH algorithm, for the computation of internal nodes of the L-Tree 302. This RAND_HASH algorithm comprises the following lines:
KEY = PRF(SEED, ADRS);
BM_0 = PRF(SEED, ADRS);
BM_1 = PRF(SEED, ADRS);
node = H(KEY, (LEFT XOR BM_0) ? (RIGHT XOR BM_1));
Using for the PRF and H functions the definitions in section 5.1, adapted to SHA-512:
H: SHA-512(toByte(1, 64) ? KEY ? M),
PRF: SHA-512(toByte(3, 64) ? KEY ? M)
Here, toByte(1, 64) and toByte(3, 64) are used as the opcodes OPCODE_1 and OPCODE_3, respectively.
[0115] The three calls to PRF each expand to a SHA2 call on a string of the form OPCODE_3?SEED?ADRS. The first two of which are constant. Of the address ADRS, the first 8 bytes (e.g., ADRS[1..8]) are constant in any case throughout the public key computation as well. The second 8 bytes (e.g., ADRS[9..16]) are constant throughout the computation of the L-Trees and the computation of the Hash Tree, but the former is different from the latter, so, two different values (e.g., H_l-tree and H_h-tree) need to be computed. For the L-Tree computation, the third 8 bytes (e.g., ADRS[17..24]) are constant for all the L-Tree nodes at the same height for a specific WOTS+ key. For the Hash Tree computation, the third 8 bytes (e.g., ADRS[17..24]) are constant for all the Hash Tree nodes at the same height. That is the first 152 bytes are constant, e.g., 128+8*3 bytes. This means for SHA-512, that the first block is constant (e.g., 128 bytes). Accordingly, the first block can be precomputed.
[0116] The first word (e.g., 8 bytes) of the second block, is constant throughout the public key computation, the second word (e.g., 8 bytes) of the second block for L-Tree computation is constant for all the L-Trees, the second word (e.g., 8 bytes) of the second block for Hash Tree computation is constant for the Hash Tree, the third word (e.g., 8 bytes) of the second block for L-Tree computation is constant for all the L-Tree nodes at the same height for a specific WOTS+ key, and the third word (e.g., 8 bytes) of the second block for Hash Tree computation is constant for all the Hash Tree nodes at the same height. Accordingly, the first round can be precomputed for second block in a public key computation. Then, the second round can be precomputed for the L-Tree and for the Hash Tree separately. After that, for L-Trees, the third block can be precomputed for each WOTS+ key for all the L-Tree nodes at the same height. Also, for Hash Tree, the third block can be precomputed for all the Hash Tree nodes at the same height.
[0117] A multi-use public key computation according to RFC 8391 may use three different addresses, according to section 2.5. The address included in the public key computations for the individual use-once private keys may use an OTS hash address, as already described above. To combine the public key for one individual use-once private key into a single value—the L-tree—a different address scheme (e.g., L-Tree Address) may be used. The L-tree address comprises, in order: layer address (32 bits), tree address (64 bits), type = 1 (32 bits), L-tree address (32 bits), tree height (32 bits), tree index (32 bits), and keyAndMask (32 bits). Also, here the first 16 bytes are constant throughout the L-Tree, and the next 8 bytes are constant for all the L-Tree nodes at the same height for a specific WOTS+ key. Finally, the combination of the L-tree roots, sometimes referred to as the Hash tree, uses Hash Tree Address having the fields: layer address (32 bits), tree address (64 bits), type = 2 (32 bits), Padding = 0 (32 bits), tree height (32 bits), tree index (32 bits), and keyAndMask (32 bits). Also, here the first 16 bytes are constant throughout the Hash Tree, and the next 8 bytes are constant for all the Hash Tree nodes at the same height.
[0118] Choosing the SHA2 function SHA-512 for this public key computation yields one full constant block throughout the computation, of 128 bytes, and a partial block, of 24 bytes. Precomputing this full block, e.g., the 80 rounds, and the partial block, e.g., the 3 rounds, and storing the SHA2 state (e.g., H_level for L-Tree, and H_height for Hash Tree), saves 83 SHA2 rounds for each element in the Merkle Tree.
[0119] In an embodiment, with n=64, and SHA-512, we have:
For the PRF() Function of chain() and rand_hash() Functions, we precompute:
H_prf = SHA-512H0(OPCODE_3?SEED?ADRS[1..8]). – Computed Only Once.
The cost of precomputation is 81 rounds.
The value of H_prf is common for the chain() and rand_hash() functions.
The OPCODE_3 is Common for these two functions.
The SEED is Constant throughout the scheme.
The First 8 Bytes of ADRS are Zero for the XMSS scheme.
[0120] In an embodiment, for L-Tree, we have:
For the PRF() Function of rand_hash() Function in L-Tree, we precompute:
H_l-tree = SHA-512H_prf(ADRS[9..16]). – Computed for L-Tree Only Once.
The cost of precomputation is 1 round.
For each level of the L-Tree in the specific WOTS+ key, we precompute:
H_level = SHA-512H_l-tree(ADRS[17..24]). – Computed for L-Tree Nodes at same Level Only Once for each WOTS+ key.
The cost of precomputation is 1 round.
Then, for each node in L-Tree, we compute:
val = SHA-512H_level(ADRS[25..32]?pad).
The cost of computation is 77 rounds, since out of a total of 160 rounds, 81+1+1=83 rounds have been consumed in the pre-computation part. So, the remaining 160-83=77 rounds are a part of the final computation.
[0121] In an embodiment, for Hash Tree, we have:
For the PRF() Function of rand_hash() Function in Hash Tree, we precompute:
H_h-tree = SHA-512H_prf(ADRS[9..16]). – Computed for Hash Tree Only Once.
The cost of precomputation is 1 round.
For each level in the Hash Tree, we precompute:
H_height = SHA-512H_h-tree(ADRS[17..24]). – Computed for Hash Tree Nodes at same Level Only Once.
The cost of precomputation is 1 round.
Then, for each node in Hash Tree, we compute:
val = SHA-512H_height(ADRS[25..32]?pad).
The cost of computation is 77 rounds, since out of a total of 160 rounds, 81+1+1=83 rounds have been consumed in the pre-computation part. So, the remaining 160-83=77 rounds are a part of the final computation.
[0122] For n=64, and SHA-512, the function H is defined in Algorithm 7 of RFC 8391 and may be computed following section 5.1 by applying SHA-512 on: toByte(1, 64)?KEY?(LEFT XOR BM_0) ?(RIGHT XOR BM_1). We will refer to toByte(1, 64) as OPCODE_1. First of all OPCODE_1 is constant, that is the first 64 bytes are constant throughout the public key computation. Accordingly, the first eight rounds can be precomputed for the first partial block in a public key computation. Precomputing this partial block, e.g., the 8 rounds, and storing the SHA2 state H_h, saves 8 SHA2 rounds for each H computation in the Merkle Tree.
[0123] In particular, in an embodiment for SHA-512:
For the H() Function of rand_hash() Function, we precompute:
H_h = SHA-512H0(OPCODE_1). – Computed Only Once.
The cost of precomputation is 8 rounds.
Then, for each node in the L-Tree and Hash Tree, we compute:
val = SHA-512H_h(KEY?(LEFT XOR BM_0) ?(RIGHT XOR BM_1)?pad).
The cost of computation is 232 rounds, since out of a total of 240 rounds, 8 rounds have been consumed in the pre-computation part. So, the remaining 240-8=232 rounds are a part of the final computation.
[0124] Using for the PRF and H functions the definitions in section 5.1, adapted to SHA-384:
H: SHA-384(toByte(1, 4) ? KEY ? M),
PRF: SHA-384(toByte(3, 4) ? KEY ? M)
Here, toByte(1, 4) and toByte(3, 4) are used as the opcodes OPCODE_1 and OPCODE_3, respectively.
[0125] The three calls to PRF each expand to a SHA2 call on a string of the form OPCODE_3?SEED?ADRS. The first two of which are constant. Of the address ADRS, the first 12 bytes (e.g., ADRS[1..12]) are constant in any case throughout the public key computation as well. The next 8 bytes (e.g., ADRS[13..20]) are constant throughout the computation of the L-Trees for a specific WOTS+ key and the computation of the Hash Tree, but the former is different from the latter, so, two different values (e.g., H_l-tree and H_h-tree) need to be computed. For the L-Tree computation, the next 8 bytes (e.g., ADRS[21..28]) are constant for each invocation of the rand_hash() Function. For the Hash Tree computation, the next 8 bytes (e.g., ADRS[21..28]) are constant for each invocation of the rand_hash() Function. That is the first 80 bytes are constant, e.g., 4+48+28 bytes. This means for SHA-384, that the first 10 rounds of the first block can be precomputed.
[0126] A multi-use public key computation according to RFC 8391 may use three different addresses, according to section 2.5, as explained above. Here, in the L-Tree Address, the first 12 bytes are constant throughout the L-Tree, the next 8 bytes are constant throughout the computation of the L-Trees for a specific WOTS+ key, and the next 8 bytes are constant for each invocation of the rand_hash() Function. Also, here, in the Hash Tree Address, the first 20 bytes are constant throughout the Hash Tree, and the next 8 bytes are constant for each invocation of the rand_hash() Function.
[0127] Choosing the SHA2 function SHA-384 for this public key computation yields a partial block, of 80 bytes. Precomputing this partial block, e.g., the 10 rounds, and storing the SHA2 state (e.g., H_l-node for L-Tree, and H_h-node for Hash Tree), saves 10 SHA2 rounds for each element in the Merkle Tree.
[0128] In an embodiment, with n=48, and SHA-384, we have:
For the PRF() Function of chain() and rand_hash() Functions, we precompute:
H_prf = SHA-384H0(OPCODE_3?SEED?ADRS[1..12]). – Computed Only Once.
Herein, OPCODE_3 is a 4-byte value, and SEED is a 48-byte value.
Note the first 12 Bytes of ADRS are Zero for the XMSS scheme.
The cost of precomputation is 8 rounds.
The value of H_prf is common for the chain() and rand_hash() functions.
[0129] In an embodiment, for L-Tree, we have:
For the PRF() Function of rand_hash() Function in L-Tree, we precompute:
H_l-tree = SHA-384H_prf(ADRS[13..20]). – Computed for each WOTS+ key Only Once.
The cost of precomputation is 1 round.
For each invocation of rand_hash() Function in L-Tree, we precompute:
H_l-node = SHA-384H_l-tree(ADRS[21..28]). – Computed for each invocation of rand_hash() Function Only Once.
The cost of precomputation is 1 round.
Then, for each node in L-Tree, we compute:
val = SHA-384H_l-node(ADRS[29..32]?pad).
The cost of computation is 70 rounds, since out of a total of 80 rounds, 8+1+1=10 rounds have been consumed in the pre-computation part. So, the remaining 80-10=70 rounds are a part of the final computation.
[0130] In an embodiment, for Hash Tree, we have:
For the PRF() Function of rand_hash() Function in Hash Tree, we precompute:
H_h-tree = SHA-384H_prf(ADRS[13..20]). – Computed for Hash Tree Only Once.
The cost of precomputation is 1 round.
For each invocation of rand_hash() Function in Hash Tree, we precompute:
H_h-node = SHA-384H_h-tree(ADRS[21..28]). – Computed for each invocation of rand_hash() Function Only Once.
The cost of precomputation is 1 round.
Then, for each node in Hash Tree, we compute:
val = SHA-384H_h-node(ADRS[29..32]?pad).
The cost of computation is 70 rounds, since out of a total of 80 rounds, 8+1+1=10 rounds have been consumed in the pre-computation part. So, the remaining 80-10=70 rounds are a part of the final computation.
[0131] Signing a message for verification with a multi-use public key comprises the same signature computation as for a use-once key which is part of the multi-use tree. This provides the same advantages as for use-once signature computation.
[0132] To verify a signature using a multi-use public key may comprise computing a temporary public key as is done for verifying with a use-once public key. However, instead of directly comparing public key with the temporary public key, a value is derived from both—the Merkle tree root. A multi-use public key comprises the Merkle tree root. A temporary Merkle tree root is computed from the temporary public key using additional values which allow this computation, the so-called authentication path (e.g., AUTH), which is provided with the signature. The computing of authentication path (e.g., AUTH) requires computing the siblings of the intermediate nodes on the path from the used leaf node to the root 304 of the Hash tree 303. The computing of authentication path requires generating the entire Merkle tree, but due to the optimization by BDS08 Algorithm, the execution cost is substantially reduced at the price of storage memory. See, “Merkle Tree Traversal Revisited”, by Johannes Buchmann, Erik Dahmen & Michael Schneider, part of the Lecture Notes in Computer Science book series (LNCS, volume 5299). Furthermore, the above mentioned optimizations are applicable for generating WOTS+ private keys (e.g., sk), computing WOTS+ public keys (e.g., pk), computing the L-trees and then combining several L-trees to form the intermediate nodes of the Hash tree. By comparing the Merkle tree root to the temporary Merkle tree root the validity of the signature is determined.
[0133] Figure 4a schematically shows an example of an embodiment of a method 400 to compute a public key (e.g., pk) corresponding to a private key (e.g., sk) of a SHA2-based signature scheme, wherein the hash is SHA-384 or SHA-512. Method 400 comprises:
- obtaining (405) the private key, the private key comprising a sequence of random values (e.g., sk[0], sk[1], sk[2], …, sk[len-1]),
- computing (410) a set of chains by, for each value in the sequence of random values in the private key, computing a chain of hash values starting from the private key value by repeatedly applying a chain function to a previous value in the chain, the chain function comprising:
- applying (411) a first pseudo-random function (PRF) to generate a key, e.g., KEY = PRF(SEED, ADRS);
- applying (412) a second PRF to generate a bitmask, e.g., BM = PRF(SEED, ADRS); wherein the first and second PRF comprise applying the SHA2 hash to a string starting with a concatenation of: an opcode (e.g., OPCODE_3), a public seed (e.g., SEED), and an address (e.g., ADRS). For example, opcode, public seed, and a few starting bytes of the address may together be at least as long as one or more rounds of the SHA2 hash function, that is at least 8 bytes, and preferably may be at least as long as a full input block, e.g., they may be at least 128 bytes.
- XOR-ing (413) the bitmask (e.g., BM) with the previous value in the chain, e.g., tmp;
- applying (414) the SHA2 hash function to a concatenation of an opcode (e.g., OPCODE_0), the key (e.g., KEY), and the XOR result (e.g., tmp XOR BM), e.g., tmp = F(KEY, tmp XOR BM), and
- deriving (420) the public key from the final value for each of the chains,
- wherein the SHA2 hash function is configured to iteratively receive one or more input blocks of the string, and is configured to iteratively update an intermediate hash value, at least a first block and/or at least a partial block of the string being constant during the computing of the chains, the method further comprising precomputing the intermediate hash value on the constant at least a first block and/or at least a partial block.
[0134] Figure 4b schematically shows an example of an embodiment of a method 430 to compute a public key (e.g., PK) corresponding to a private key of a SHA2-based Merkle Signature Scheme, wherein the hash is SHA-384 or SHA-512. Method 430 comprises:
- obtaining (440) multiple private keys (e.g., sk0, sk1,…, sk2^h-1), each private key comprising a sequence of random values (e.g., ski[0], ski[1], ski[2], …, ski[len-1]), and generating therefor multiple sets of chains, which may use an embodiment as described herein, e.g., with respect to figure 4a,
- computing (450) a Merkle tree of hash values, starting from the final values of each of the chains in each of the multiple sets, and repeatedly computing a hash value on a next level of the tree by applying a tree function to two hash values (e.g., LEFT and RIGHT) on a lower level of the tree, the tree function comprises:
- applying (451) a fourth pseudo-random function (PRF) to generate a key, e.g., KEY = PRF(SEED, ADRS);
- applying (452) a fifth PRF to generate a first bitmask, e.g., BM_0 = PRF(SEED, ADRS);
- applying (453) a sixth PRF to generate a second bitmask, e.g., BM_1 = PRF(SEED, ADRS); wherein the fourth, fifth, and sixth PRF comprise applying a SHA2 hash to a string starting with an opcode (e.g., OPCODE_3), a public seed (e.g., SEED), and an address (e.g., ADRS). For example, opcode, public seed, and a few starting bytes of the address may together be at least as long as one or more rounds of the SHA2 hash function, that is at least 8 bytes, and preferably may be at least as long as a full input block, e.g., they may be at least 128 bytes.
- XOR-ing (454) the first bitmask (e.g., BM_0) with a first value on the lower level of the tree, e.g., LEFT,
- XOR-ing (455) the second bitmask (e.g., BM_1) with a second value on the lower level of the tree, e.g., RIGHT,
- applying (456) a SHA2 hash function to a concatenation of an opcode (e.g., OPCODE_1), the key (e.g., KEY), the first XOR result (e.g., LEFT XOR BM_0), and the second XOR result (e.g., RIGHT XOR BM_1), e.g., node = H(KEY, (LEFT XOR BM_0) ? (RIGHT XOR BM_1)); and
- deriving (460) the public key from the root value of the Merkle tree,
- wherein, at least a first block and/or at least a partial block of the string being constant during the computing of the tree, the method further comprising precomputing the intermediate hash value on the constant at least a first block and/or at least a partial block.
[0135] Many different ways of executing methods according to an embodiment are possible, including those of figure 4a and 4b, as will be apparent to a person skilled in the art. For example, the order of the steps can be performed in the shown order, but the order of the steps can be varied or some steps may be executed in parallel. Moreover, in between steps other method steps may be inserted. The inserted steps may represent refinements of the method such as described herein, or may be unrelated to the method. For example, some steps may be executed, at least partially, in parallel. Moreover, a given step may not have finished completely before a next step is started.
[0136] Embodiments of the method may be executed using software, which comprises instructions for causing a processor system to perform a method according to an embodiment, e.g., method 400 and/or 430. Software may only include those steps taken by a particular sub-entity of the system. The software may be stored in a suitable storage medium, such as a hard disk, a floppy, a memory, an optical disc, etc. The software may be sent as a signal along a wire, or wireless, or using a data network, e.g., the Internet. The software may be made available for download and/or for remote usage on a server. Embodiments of the method may be executed using a bitstream arranged to configure programmable logic, e.g., a field-programmable gate array (FPGA), to perform the method.
[0137] It will be appreciated that the presently disclosed subject matter also extends to computer programs, particularly computer programs on or in a carrier, adapted for putting the presently disclosed subject matter into practice. The program may be in the form of source code, object code, a code intermediate source, and object code such as partially compiled form, or in any other form suitable for use in the implementation of an embodiment of the method. An embodiment relating to a computer program product comprises computer executable instructions corresponding to each of the processing steps of at least one of the methods set forth. These instructions may be subdivided into subroutines and/or be stored in one or more files that may be linked statically or dynamically. Another embodiment relating to a computer program product comprises computer executable instructions corresponding to each of the devices, units and/or parts of at least one of the systems and/or products set forth.
[0138] Figure 5a shows a computer readable medium 1000 having a writable part 1010, and a computer readable medium 1001 also having a writable part. Computer readable medium 1000 is shown in the form of an optically readable medium. Computer readable medium 1001 is shown in the form of an electronic memory, in this case a memory card. Computer readable medium 1000 and 1001 may store data 1020 wherein the data may indicate instructions, which when executed by a processor system, cause a processor system to perform an embodiment of a method of private key generation, public key generation, signature generation, signature verification according to an embodiment. The computer program 1020 may be embodied on the computer readable medium 1000 as physical marks or by magnetization of the computer readable medium 1000. However, any other suitable embodiment is conceivable as well. Furthermore, it will be appreciated that, although the computer readable medium 1000 is shown here as an optical disc, the computer readable medium 1000 may be any suitable computer readable medium, such as a hard disk, solid state memory, flash memory, etc., and may be non-recordable or recordable. The computer program 1020 comprises instructions for causing a processor system to perform said method of private key generation, public key generation, signature generation, signature verification.
[0139] Figure 5b shows in a schematic representation of a processor system 1140 according to an embodiment of system configured for one or more of private key generation, public key generation, signature generation, signature verification. The processor system comprises one or more integrated circuits 1110. The architecture of the one or more integrated circuits 1110 is schematically shown in Figure 5b. Circuit 1110 comprises a processing unit 1120, e.g., a CPU, for running computer program components to execute a method according to an embodiment and/or implement its modules or units. Circuit 1110 comprises a memory 1122 for storing programming code, data, etc. Part of memory 1122 may be read-only. Circuit 1110 may comprise a communication element 1126, e.g., an antenna, connectors or both, and the like. Circuit 1110 may comprise a dedicated integrated circuit 1124 for performing part or all of the processing defined in the method. Processor 1120, memory 1122, dedicated IC 1124 and communication element 1126 may be connected to each other via an interconnect 1130, say a bus. The processor system 1110 may be arranged for contact and/or contact-less communication, using connectors and/or an antenna, respectively.
[0140] For example, in an embodiment, processor system 1140, e.g., the system configured for one or more of method of private key generation, public key generation, signature generation, signature verification may comprise a processor circuit and a memory circuit, the processor being arranged to execute software stored in the memory circuit. For example, the processor circuit may be an Intel Core i7 processor, ARM Cortex-R8, etc. The memory circuit may be a ROM circuit, or a non-volatile memory, e.g., a flash memory. The memory circuit may be a volatile memory, e.g., an SRAM memory. In the latter case, the device may comprise a non-volatile software interface, e.g., a hard drive, a network interface, etc., arranged for providing the software.
[0141] While device 1140 is shown as including one of each described component, the various components may be duplicated in various embodiments. For example, the processor 1120 may include multiple microprocessors that are configured to independently execute the methods described herein or are configured to perform steps or subroutines of the methods described herein such that the multiple processors cooperate to achieve the functionality described herein. Further, where the device 1140 is implemented in a cloud computing system, the various hardware components may belong to separate physical systems. For example, the processor 1120 may include a first processor in a first server and a second processor in a second server.
[0142] It should be noted that the above-mentioned embodiments illustrate rather than limit the presently disclosed subject matter, and that those skilled in the art will be able to design many alternative embodiments.
[0143] In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. Use of the verb ‘comprise’ and its conjugations does not exclude the presence of elements or steps other than those stated in a claim. The article ‘a’ or ‘an’ preceding an element does not exclude the presence of a plurality of such elements. Expressions such as “at least one of” when preceding a list of elements represent a selection of all or of any subset of elements from the list. For example, the expression, “at least one of A, B, and C” should be understood as including only A, only B, only C, both A and B, both A and C, both B and C, or all of A, B, and C. The presently disclosed subject matter may be implemented by hardware comprising several distinct elements, and by a suitably programmed computer. In the device claim enumerating several parts, several of these parts may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
[0144] In the claims, references in parentheses refer to reference signs in drawings of exemplifying embodiments or to formulas of embodiments, thus increasing the intelligibility of the claim. These references shall not be construed as limiting the claim.
, Claims:We Claim:
1. A method (400) to compute a public key (e.g., pk) corresponding to a private key (e.g., sk) of a SHA2 hash-based signature scheme, wherein the hash is SHA-384 or SHA-512, the method comprising:
- obtaining (405) the private key, the private key comprising a sequence of random values (e.g., sk[0], sk[1], sk[2], …, sk[len-1]),
- computing (410) a set of chains by, for each value in the sequence of random values in the private key, computing a chain of hash values starting from the private key value by repeatedly applying a chain function to a previous value in the chain, the chain function comprising:
- applying (411) a first pseudo-random function (PRF) to generate a key,
- applying (412) a second PRF to generate a bitmask, wherein the first and second PRF comprise applying the SHA2 hash to a string starting with a concatenation of: an opcode (e.g., OPCODE_3), a public seed (e.g., SEED), and an address (e.g., ADRS),
- XOR-ing (413) the bitmask with the previous value in the chain;
- applying (414) the SHA2 hash function to a concatenation of an opcode (e.g., OPCODE_0), the key, and the XOR result, and
- deriving (420) the public key from the final value for each of the chains,
- wherein the SHA2 hash function is configured to iteratively receive one or more input blocks of the string, and is configured to iteratively update an intermediate hash value, at least one or more words (e.g., 10 words, SHA-384), and preferably at least a first block (e.g., SHA-512), of the string being constant during the computing of the chains, the method further comprising precomputing the intermediate hash value on the constant at least a first block or the constant at least a partial first block.
2. A method as in Claim 1, wherein a first block of the string (e.g., 128 bytes) and an adjacent partial block of the string (e.g., 24 bytes), is constant during the computing of the chains, the hash (e.g., SHA-512) being precomputed in full for the first block (e.g., the 80 rounds), and in part for the partial block (e.g., the first 3 rounds).
3. A method as in Claim 1, wherein an adjacent partial block of the string (e.g., 80 bytes), is constant during the computing of the chains, the hash (e.g., SHA-384) being precomputed in part for the partial block (e.g., the first 10 rounds).
4. The method of Claim 2 and 3, wherein the string starts with a concatenation of: an opcode (e.g. OPCODE_3), a public seed (e.g., SEED), and an address (e.g., ADRS), the address in the concatenated string for a pseudo-random function (PRF) identifying a specific position in the sequence of random values of the public or private key, wherein the opcode, public seed, and an initial part of the address is constant.
5. A method as in any one of the preceding claims, wherein the applying (414) the hash function to a concatenation of an opcode (e.g., OPCODE_0), the key, and the XOR result, is precomputed for the opcode, (e.g., 8 rounds on 64-byte opcode, using SHA-512).
6. The method of any one of the preceding claims, wherein obtaining the private key (e.g., sk) comprises:
- generating the sequence of random values (e.g., sk[0], sk[1], sk[2], …, sk[len-1]) by repeatedly applying a third PRF, wherein the third PRF comprises applying the SHA2 hash to a string starting with a concatenation of: an opcode (e.g., OPCODE_4), and a secret seed (e.g., SK_SEED), wherein at least a first block of the string (128 bytes) is constant during the computing of the private key (e.g., SHA-512), the method further comprising precomputing the intermediate hash value on the constant at least a first block.
7. The method of Claim 6, wherein the string further starts with a concatenation of a public seed (e.g., SEED), and a constant part of an address (e.g., ADRS[1..16] or ADRS[1..20]), together forming a partial block of hash (e.g., 80 bytes or 120 bytes), the hash (e.g., SHA-512 or SHA-384, respectively) being precomputed in full for the first block, and in part for the partial block (e.g., 10 rounds or 15 rounds, respectively).
8. The method of Claim 6 or 7, wherein the string start with a concatenation of: an opcode (e.g., OPCODE_4), and a secret seed (e.g., SK_SEED), optionally the public seed (e.g., SEED), and an address, the address in the concatenated string for a pseudo-random function (PRF) identifying a specific position in the sequence of random values of the private key and chains, wherein the opcode, secret seed, public seed, and an initial part of the address is constant.
9. The method (430) of any of the preceding claims, wherein the signature scheme is a Merkle Signature Scheme, the method computing a public key for multiple uses, the method comprising:
- obtaining (440) multiple private keys (e.g., sk0, sk1,…, sk2^h-1), each private key comprising a sequence of random values (e.g., ski[0], ski[1], ski[2], …, ski[len-1]), and generating therefor multiple sets of chains,
- computing (450) a Merkle tree of hash values, starting from the final values of each of the chains in each of the multiple sets, and repeatedly computing a hash value on a next level of the tree by applying a tree function to two hash values on a lower level of the tree, the tree function comprises:
- applying (451) a fourth pseudo-random function (PRF) to generate a key,
- applying (452) a fifth PRF to generate a first bitmask,
- applying (453) a sixth PRF to generate a second bitmask, wherein the fourth, fifth, and sixth PRF comprise applying the SHA2 hash function to a string starting with an opcode (e.g., OPCODE_3), a public seed (e.g., SEED), and an address (e.g., ADRS),
- XOR-ing (454) the first bitmask with a first value on the lower level of the tree,
- XOR-ing (455) the second bitmask with a second value on the lower level of the tree,
- applying (456) the SHA2 hash function to a concatenation of an opcode (e.g., OPCODE_1), the key, the first XOR result, and the second XOR result, and
- deriving (460) the public key (e.g., PK) from the root value of the Merkle tree,
- wherein, at least a first block of the string being constant during the computing of the tree, the method further comprising precomputing the intermediate hash value (e.g., SHA-512) on the constant at least a first block (e.g., 128 bytes) in the applying of the fourth PRF, fifth PRF, and sixth PRF.
10. The method (430) of Claim 9, wherein a first block (e.g., 128 bytes for SHA-512) and an adjacent partial block of the input to the fourth PRF, fifth PRF, and sixth PRF (e.g., 24 bytes or 80 bytes), is constant during the computing of the tree, the hash (e.g., SHA-512 or SHA-384, respectively) being precomputed in full for the first block, and in part for the partial block.
11. The method of Claim 9 or 10, wherein applying (456) the hash function to a concatenation of an opcode (e.g., OPCODE_1), the key, the first XOR result, and the second XOR result, comprises precomputing the hash function on a partial block comprising the opcode, (e.g., 8 rounds on 64-byte opcode, using SHA-512).
12. The method of any of the preceding claims, further comprising generating a signature for a message using the private key, the method comprising:
- obtaining the message, the message comprising a sequence of values,
- for each value in the sequence of values in the message, repeatedly computing the chain function for each value in the private key as many times as indicated by the corresponding value in the message, the signature comprising the final computed value in the chain for each private key value, wherein at least a first block and/or at least a partial block of the string hashed in the chain function being constant during the computing of the chains, the method further comprising precomputing the intermediate hash value on the constant at least a first block and/or at least a partial block.
13. The method of any of the preceding claims, further comprising verifying a signature for a message computed using the obtained private key, the method comprising:
- receiving the signature and the message, wherein the signature comprises a sequence of values corresponding to the message,
- for each value in the sequence of values in the signature, computing a temporary public key value by repeatedly applying the chain function to each value in the signature as many times as indicated by the difference between a predetermined maximum value and the corresponding value in the message, wherein at least a first block and/or at least a partial block of the string hashed in the chain function being constant during the computing of the chains, the method further comprising precomputing the intermediate hash value on the constant at least a first block and/or at least a partial block,
- comparing a value derived from the temporary public key values to the public key to determine the validity of the signature.
14. A system comprising: one or more processors; and one or more storage devices storing instructions that, when executed by the one or more processors, cause the one or more processors to perform operations for any of claims 1-13.
15. A transitory or non-transitory computer readable medium (1000) comprising data (1020) representing instructions, which when executed by a processor system, cause the processor system to perform the method according to any of claims 1-13.
| # | Name | Date |
|---|---|---|
| 1 | 202441035192-POWER OF AUTHORITY [03-05-2024(online)].pdf | 2024-05-03 |
| 2 | 202441035192-FORM 1 [03-05-2024(online)].pdf | 2024-05-03 |
| 3 | 202441035192-DRAWINGS [03-05-2024(online)].pdf | 2024-05-03 |
| 4 | 202441035192-DECLARATION OF INVENTORSHIP (FORM 5) [03-05-2024(online)].pdf | 2024-05-03 |
| 5 | 202441035192-COMPLETE SPECIFICATION [03-05-2024(online)].pdf | 2024-05-03 |
| 6 | 202441035192-RELEVANT DOCUMENTS [22-05-2024(online)].pdf | 2024-05-22 |
| 7 | 202441035192-POA [22-05-2024(online)].pdf | 2024-05-22 |
| 8 | 202441035192-FORM 13 [22-05-2024(online)].pdf | 2024-05-22 |
| 9 | 202441035192-AMMENDED DOCUMENTS [22-05-2024(online)].pdf | 2024-05-22 |