1 Introduction

The threat of quantum computing has motivated cryptographers to develop digital signatures based on new, supposedly quantum-resistant, hardness assumptions. In order to standardize these new signature schemes, NIST started its first post-quantum (PQ) signature standardization process in 2017, in which SPHINCS+ [6, 16], Dilithium [30] and FALCON [43] were standardized. With two of the three standardized schemes relying on hard lattice problems for their security, NIST deemed it necessary to seek additional candidates for standardization whose security is based on a more diverse set of hardness assumptions.

Signatures from Zero-Knowledge Proofs. A well-known technique to build a digital signature scheme is to compile a (public-coin, honest-verifier) zero-knowledge (ZK) proof of knowledge, used in an identification protocol, with the Fiat-Shamir transformation (FS). In particular, a zero-knowledge proof of knowledge (ZKPoK) for an NP relation \(\mathcal {R}\) is an interactive protocol that allows the prover to prove knowledge of a witness w for a statement x such that \((x,w) \in \mathcal {R}\), without revealing any further information. In the context of signature (and identification) schemes, this is a proof of knowledge of a secret key k such that \(y = F_k(x)\), for a given one-way function (OWF) \(F_k(\cdot )\).

A powerful and efficient technique to build such ZK proofs for arbitrary NP relations is the MPC-in-the-Head (MPCitH) framework due to Ishai et al. [36]. However, a significant limitation of many MPCitH-based proofs lies in their proof size, which scales linearly with the size of the circuit representation of the statement being proven. Nevertheless, MPCitH is particularly effective with small to medium-sized circuits and leads to efficient post-quantum signature schemes. These schemes are either based solely on symmetric primitives, such as AES [13, 29, 37, 45, 46] and other MPC-friendly one-way functions (OWFs) like LowMC [4], Rain [29], and AIM [39], or on well-studied computational hardness assumptions, including syndrome decoding [3, 5, 34], the multivariate quadratic problem (MQ) [15, 42], the permuted kernel problem [1], and the Legendre PRF [19]. The second approach typically results in a more communication-efficient scheme.

VOLE-ZK and FAEST. In 2018, Boyle et al. [22] proposed a new class of prover-efficient (linear complexity) and scalable ZK proofs, which use commit-and-prove protocols instantiated using vector oblivious linear evaluation (VOLE) correlations. Follow-up works [8, 12, 22, 23, 27, 49–51, 53] reduced the constants of the linear proof size, surpassing MPCitH schemes in terms of efficiency, in particular when dealing with very large circuits. Compared to MPCitH schemes, however, the above VOLE-ZK protocols are limited to the designated-verifier setting. Recent work by Baum et al. [7] reconciles the advantages of both worlds, resulting in VOLE-ZK proofs that are publicly verifiable. To achieve this, they introduce a technique called VOLE-in-the-Head (VOLEitH), which bears a surprising resemblance to MPCitH-based protocols. Based on VOLEitH, they proposed the FAEST [9] post-quantum signature scheme.

Similarly to MPCitH signature schemes like Banquet [13], BBQ [45], and Helium [37], FAEST relies on AES [2] as its OWF. However, FAEST outperforms MPCitH-based signatures, with signatures at least a factor of two smaller and similar or better signing and verification times. This makes the VOLEitH-based FAEST as performant as the most optimized MPCitH-based schemes [37], while relying on a very conservative OWF. At the same time, VOLEitH is a relatively new concept, and it remained unexplored to what extent VOLEitH-based signatures can benefit from selecting different OWFs, such as Rain or random multivariate quadratic maps.

1.1 Our Contributions

In this work, we present improvements to the core building blocks used in VOLE-in-the-head proof systems, as well as alternative one-way function instantiations that optimize prior approaches and lead to more efficient post-quantum signature candidates.

Improved Batch Vector Commitments. VOLE-in-the-head signatures, like those based on MPC-in-the-head, use multiple GGM-based [35] all-but-one vector commitment schemes to generate correlated randomness for the ZK proofs. These vector commitments are then opened at random challenge points as part of the proof, incurring a decommitment size of \(\log (N)\cdot \lambda \) bits per vector commitment that must be sent during the opening phase (where N is the length of the vector and \(\lambda \) is the security parameter). These openings are a substantial part of the setup cost of the ZK proof. We provide a new abstraction, called batch all-but-one vector commitment (BAVC) schemes, which captures how multiple vector commitments are used in VOLEitH and MPCitH. We observe that, to instantiate the BAVC abstraction more efficiently, one can interleave multiple vector commitments, which drastically reduces the opening size. This batching requires the signer to perform rejection sampling when selecting the points to open, reducing the entropy of the challenge space somewhat. While it might seem that this makes the scheme less secure, one can prove that security is actually preserved: since each rejection sampling step requires the prover to perform a hash function call, we can consider rejection sampling as a proof of work done during each signing operation. Any attacker must also perform this proof of work to generate a valid signature. We believe that this technique is of independent interest.

FAESTER. This rejection sampling/proof of work idea can be pushed further, using a technique known as “grinding” [18, 48]. Proof systems naturally have a tradeoff between signature size, computation, and security, and reducing the security can lead to significant improvements in both signature size and computational efficiency. We do this by further reducing the entropy of the challenge space so that some part of the opening process does not even need to be considered. This makes the VOLEitH proof itself slightly less secure, but the overall signature scheme retains the same security level due to the additional proof of work caused by increased rejection probability. It might seem that this trade-off will naturally lead to longer signing times, but the opposite can actually be the case: reducing the challenge entropy significantly reduces the other signing costs, so the scheme is optimized by finding a balance between the costs of the proof of work and those of the rest of the scheme. We applied BAVC and grinding to the FAEST signature scheme, leading to a new digital signature with a signature size of 4KB (an improvement over all signature schemes using AES OWF) while maintaining or improving upon the signing and verification time of FAEST. We name this new improved signature scheme FAESTER.
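The grinding/proof-of-work idea can be sketched as follows in Python. The helper names and the counter encoding are illustrative only (the real schemes derive challenges via dedicated random-oracle calls such as \(\textsf{H}_2^3\)): the prover re-hashes the transcript with an incrementing counter until the challenge satisfies a structural constraint, here that its lowest `zero_bits` bits are zero.

```python
import hashlib

def grind_challenge(transcript: bytes, zero_bits: int) -> tuple[int, bytes]:
    """Re-hash the transcript with an incrementing counter until the
    challenge has its lowest `zero_bits` bits equal to zero (proof of work).
    Returns the accepted counter and challenge."""
    ctr = 0
    while True:
        chal = hashlib.sha256(transcript + ctr.to_bytes(4, "little")).digest()
        if int.from_bytes(chal, "little") & ((1 << zero_bits) - 1) == 0:
            return ctr, chal
        ctr += 1

def check_challenge(transcript: bytes, ctr: int, chal: bytes, zero_bits: int) -> bool:
    """Verifier side: recompute one hash and check the zero bits."""
    expected = hashlib.sha256(transcript + ctr.to_bytes(4, "little")).digest()
    return expected == chal and \
        int.from_bytes(chal, "little") & ((1 << zero_bits) - 1) == 0
```

The cost asymmetry is the point: the prover (and hence any forger) performs about \(2^{\texttt{zero\_bits}}\) hash calls on average, while the verifier recomputes a single hash, which is why the reduced challenge entropy can be compensated by the proof of work.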

MandaRain & KuMQuat. AES-based OWFs benefit from decades of public scrutiny. However, AES was not designed for use-cases such as VOLEitH which leaves open the possibility that other OWFs may result in faster signing and verification times, and smaller signature sizes. We survey suitable candidate PRFs, ranging from various recent specialized designs in symmetric cryptography [4, 29, 31, 33, 39, 44] to various instances of the MQ problem [15]. We select the Rain [29] and MQ [15] PRFs, from which we construct the new MandaRain and KuMQuat signature schemes using our new commitment optimization. These signature schemes have a signature size as small as 2.6KB, lowest among all VOLEitH and MPCitH-based signature schemes. An overview of our results can be seen in Fig. 1.

Fig. 1. Signature size and runtime trade-off comparison between the proposed signature schemes and FAEST/FAEST-EM. Slow and fast versions are denoted by s and f, respectively. The fast versions offer shorter signing and verification times at the cost of a larger signature size; the slow versions have smaller signatures but longer signing and verification times.

Allowing Uniform AES Keys in FAEST(ER). In cases where the conservative choice of AES is preferred to alternative OWFs, we show how to tweak the AES proving algorithm so that FAEST and FAESTER can support secret keys that cover the entirety of the AES keyspace. This avoids sampling signing keys via rejection sampling, as done in previous works, so we obtain a simplified key generation algorithm and improve concrete security by 1–2 bits. This improvement comes with no cost in signature size or runtime.

FAEST-d7: Higher-Degree Constraints for AES. We also present a new method of proving AES in VOLE-ZK proof systems, using degree-7 constraints over \(\mathbb {F}_2\). Compared with the degree-2 constraints over \(\mathbb {F}_{2^8}\) used in the original FAEST (and above), we halve the witness size in the ZK proof. Although proving higher-degree constraints does come with some extra costs, we show that signature sizes can be up to 5% smaller in FAEST-d7. We have not yet implemented this variant, but expect signing and verification times to be similar to FAEST. As a contribution of independent interest, we optimize the method for proving high-degree constraints in the QuickSilver proof system [52], greatly improving the efficiency of the prover.

VOLEitH Parameter Exploration. With our implementation, we enable a systematic investigation of the parameter space of the VOLEitH paradigm for constructing a signature scheme, providing insights into the effects of different parameters, including those introduced in this work. These insights contribute to further improvements and trade-offs. Table 5 summarizes our signature performance at the L1 security level, and in the full version of the paper we include holistic results for all security levels. In Fig. 8, we compare our results with other signature schemes, including the NIST PQ Signature Round 4 candidates.

2 Preliminaries

2.1 One-Way Functions

MPCitH and VOLEitH signatures are based on proving knowledge of the preimage of an OWF. In many recent signature schemes, like Picnic and FAEST, OWFs are built from a block cipher according to the following construction.

Construction 1

A one-way function \(\textsf{F}(k, x)\) can be constructed from a block cipher \(E_k(x)\) by setting \(\textsf{F}(k, x) := (x, E_k(x))\), where \(E_k(x)\) denotes the encryption of x under the key k. The OWF relation is defined as \(((x,y),k) \in R \Leftrightarrow E_k(x) = y\).
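As a concrete toy instance of Construction 1, the following Python sketch uses a small Feistel network built from SHA-256 as a stand-in for a real block cipher such as AES; all function names here are illustrative, not part of any scheme specification.

```python
import hashlib

def E(k: bytes, x: bytes) -> bytes:
    """Toy 64-bit, 4-round Feistel cipher (a stand-in for AES)."""
    l, r = x[:4], x[4:]
    for rnd in range(4):
        # Round function: keyed hash of the right half.
        f = hashlib.sha256(k + bytes([rnd]) + r).digest()[:4]
        l, r = r, bytes(a ^ b for a, b in zip(l, f))
    return l + r

def F(k: bytes, x: bytes) -> tuple[bytes, bytes]:
    """OWF of Construction 1: F(k, x) = (x, E_k(x)); k is the secret preimage."""
    return x, E(k, x)

def in_relation(pub: tuple[bytes, bytes], k: bytes) -> bool:
    """((x, y), k) in R  <=>  E_k(x) = y."""
    x, y = pub
    return E(k, x) == y
```

Note that x is part of the public statement, so one-wayness is required only with respect to the key k.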

The \(\texttt{Rain}\) OWF. Dobraunig et al. presented a block cipher called \(\texttt{Rain}\) [29] with a small number of non-linear constraints, designed to optimize signature size and runtime when used as an OWF in MPCitH-based signature schemes. The resulting signature scheme, Rainier [29], was the first MPCitH signature scheme with signatures smaller than 5 KB.

Below we describe the \(\texttt{Rain}\) round function and we refer to Fig. 2 for a graphical overview of \(\texttt{Rain}\) with 3 rounds.

The \(\texttt{Rain}\) keyed permutation \(f_k : \mathbb {F}_2^\lambda \rightarrow \mathbb {F}_2^\lambda \) is defined as the composition of a small number r of round functions \(R_i\), \(i \in [r]\), i.e. \(f_{k}(x) = R_{r} \circ \dots \circ R_2 \circ R_1(x)\). Each \(R_i\), \(i \in [r]\), is in turn defined as

$$ R_i(x) = {\left\{ \begin{array}{ll} {\textbf {M}}_i \cdot S(x + k + {\textbf {c}}_i) & i \in [1..r)\\ k + S(x + k + {\textbf {c}}_{r}) & i = r. \end{array}\right. } $$

Here, \(S : \mathbb {F}_{2^\lambda } \rightarrow \mathbb {F}_{2^\lambda }\) is the field inversion function over \(\mathbb {F}_{2^\lambda }\) (mapping 0 to 0), \({\textbf {c}}_i \in \mathbb {F}_2^{\lambda }\) is a round constant, \(k \in \mathbb {F}_2^{\lambda }\) the secret key and \({\textbf {M}}_i \in \mathbb {F}_2^{\lambda \times \lambda }\) an invertible matrix.
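The round structure can be sketched in Python over a toy field \(\mathbb {F}_{2^4}\). As a simplification, the linear layer below is multiplication by the field element x, which is an invertible \(\mathbb {F}_2\)-linear map; real \(\texttt{Rain}\) operates at full security-parameter size and uses unstructured random invertible matrices \({\textbf {M}}_i\) and per-round constants.

```python
MOD = 0b10011  # reduction polynomial x^4 + x + 1 for GF(2^4)

def gf_mul(a: int, b: int) -> int:
    """Carry-less multiplication modulo x^4 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x10:
            a ^= MOD
        b >>= 1
    return r

def gf_inv(a: int) -> int:
    """Field inversion mapping 0 to 0: a^(2^4 - 2) = a^14."""
    r = 1
    for _ in range(14):
        r = gf_mul(r, a)
    return r

def rain(k: int, x: int, consts=(0b0011, 0b0101, 0b1001)) -> int:
    """Toy Rain-style permutation: r = len(consts) rounds of
    key/constant addition, inversion S-box, and a linear layer
    (multiplication by x); the last round adds k instead."""
    r = len(consts)
    for i, c in enumerate(consts, start=1):
        s = gf_inv(x ^ k ^ c)                        # S(x + k + c_i), 0 -> 0
        x = (k ^ s) if i == r else gf_mul(s, 0b0010)  # M_i * s, except round r
    return x
```

Since every step (constant/key addition, inversion with \(0 \mapsto 0\), and the linear layer) is a bijection, the composition is a permutation of the field.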

Fig. 2. The \(\texttt{Rain}\) encryption function with r = 3 rounds. \({\textbf {M}}_i\) denotes the multiplication with an unstructured invertible matrix over \(\mathbb {F}_2\) in the i-th round.

In the VOLEitH setting, as in MPCitH schemes, the linear layer has a much smaller impact on performance than the non-linear layer. Thus, to improve diffusion, the authors of \(\texttt{Rain}\) decided to use different round constants \({\textbf {c}}_i\) and linear matrices \({\textbf {M}}_i\) for each round. Rain comes in two settings, namely \(\texttt{Rain}\)-3 with 3 rounds and \(\texttt{Rain}\)-4 with (more conservative) 4 rounds. Beyond the detailed cryptanalysis carried out by the designers, the best known attacks [41, 54] extend only to two rounds.

Multivariate Quadratic (MQ) OWF. One can also build an OWF from the well-known Multivariate Quadratic problem.

Definition 1

(Multivariate Quadratic Problem). Let \(\mathbb {F}_q\) be a finite field and \( \textsf{MQ}_{n,m,q} \) be the set of multivariate maps over \(\mathbb {F}_q\) with n variables and m components of the form \( \{{\textbf{x}}^{\textbf {T}} \cdot \textbf{A}_i \cdot {\textbf{x}}+ {\textbf{b}}_i^{\textbf {T}} \cdot {\textbf{x}}\}_{i \in [m]} \), where \(\textbf{A}_i \in \mathbb {F}_q^{n \times n}\) are randomly sampled upper triangular matrices and \({\textbf{b}}_i \in \mathbb {F}_q^n\) are uniformly sampled vectors. Given \(F \in \textsf{MQ}_{n,m,q}\) and \({\textbf{y}}= (y_1, \dots , y_m ) \in \mathbb {F}_q^m\), the MQ problem asks to find \({\textbf{x}}\) such that \(F({\textbf{x}}) = {\textbf{y}}\), i.e. \(\left( y_i := {\textbf{x}}^{\textbf {T}} \cdot \textbf{A}_i \cdot {\textbf{x}}+ {\textbf{b}}_i^{\textbf {T}} \cdot {\textbf{x}}\right) _{i \in [m]}\).

The MQ problem has been extensively studied in cryptography and used to build both trapdoor [17, 40] and one-way signature schemes [15, 47]. We construct the one-way function \(E_{{\textbf{x}}}(\texttt {seed}) = {\textbf{y}}\) from the MQ problem, where seed is the input to a pseudorandom generator G such that \(\textbf{A}_1, \ldots , \textbf{A}_m, {\textbf{b}}_1, \ldots , {\textbf{b}}_m \leftarrow G(\texttt {seed})\). Therefore, when constructing a one-way signature scheme from the MQ problem, \(({\textbf{x}},\texttt {seed})\) becomes the \(\textsf{sk}\) and (\({\textbf{y}}\),seed) becomes the \(\textsf{pk}\) (similar to MQOM [15]).
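A seeded MQ evaluation over \(\mathbb {F}_2\) can be sketched as follows, with SHAKE-128 as an illustrative choice of the PRG G (the concrete PRG, sampling order, and field in MQOM differ; helper names are ours).

```python
import hashlib

def expand_system(seed: bytes, n: int, m: int):
    """Derive m upper-triangular matrices A_i and vectors b_i over F_2
    from `seed`, using SHAKE-128 as the PRG G."""
    need = m * (n * (n + 1) // 2 + n)  # bits: upper triangle + b_i per equation
    stream = hashlib.shake_128(seed).digest((need + 7) // 8)
    bits = [(byte >> j) & 1 for byte in stream for j in range(8)]
    it = iter(bits)
    A, b = [], []
    for _ in range(m):
        # Entries below the diagonal are fixed to 0 (upper triangular).
        A.append([[next(it) if c >= r else 0 for c in range(n)]
                  for r in range(n)])
        b.append([next(it) for _ in range(n)])
    return A, b

def mq_eval(seed: bytes, x, n: int, m: int):
    """y_i = x^T A_i x + b_i^T x over F_2, for i in [m]."""
    A, b = expand_system(seed, n, m)
    y = []
    for i in range(m):
        acc = 0
        for r in range(n):
            for c in range(n):
                acc ^= A[i][r][c] & x[r] & x[c]
            acc ^= b[i][r] & x[r]
        y.append(acc)
    return y
```

In the signature setting, the key pair would then be \(\textsf{sk}= ({\textbf{x}},\texttt {seed})\) and \(\textsf{pk}= ({\textbf{y}},\texttt {seed})\) with \({\textbf{y}}= \texttt{mq\_eval}(\texttt{seed}, {\textbf{x}})\).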

2.2 VOLEitH Signatures

We now give an overview of the VOLEitH framework, which is the ZK proof system underlying FAEST. A more detailed introduction to the VOLE-in-the-Head approach is available in the full version.

A vector oblivious linear evaluation (VOLE) correlation of length m is a two-party correlation between a prover \(\mathcal {P}\) and a verifier \(\mathcal {V}\) defined by a random global key \(\varDelta \in \mathbb {F}_{2^k}\), random bits \(u_i \in \mathbb {F}_2\), random VOLE tags \(v_i \in \mathbb {F}_{2^k}\) and VOLE keys \(q_i \in \mathbb {F}_{2^k}\) such that \(q_i= u_i \cdot \varDelta - v_i\), for \(i = 0, \dots , m-1\). \(\mathcal {P}\) obtains \(u_i,v_i\) while \(\mathcal {V}\) obtains \(\varDelta ,q_i\). The correlations commit \(\mathcal {P}\) to the \(u_i\)’s as linearly homomorphic commitments, allowing efficient proof systems (see [11] for an overview). One of the main drawbacks of such VOLE-based ZK schemes is that they are inherently designated-verifier: the verifier \(\mathcal {V}\) needs to know its part of the VOLE correlation to verify the proof, and this part must remain secret from the prover for the proof to be sound.
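The correlation and its linear homomorphism can be sketched as follows, representing elements of \(\mathbb {F}_{2^k}\) as integers so that addition and subtraction both become XOR, and \(u_i \cdot \varDelta \) is either 0 or \(\varDelta \) since \(u_i\) is a bit (helper names are ours, not part of any protocol).

```python
import secrets

K = 128  # bit length of the field F_{2^K}

def gen_vole(m: int):
    """Sample a length-m VOLE correlation q_i = u_i*Delta - v_i.
    Returns the prover's part (u, v) and the verifier's part (Delta, q)."""
    delta = secrets.randbits(K)
    u = [secrets.randbits(1) for _ in range(m)]
    v = [secrets.randbits(K) for _ in range(m)]
    q = [(delta if u_i else 0) ^ v_i for u_i, v_i in zip(u, v)]
    return (u, v), (delta, q)

def open_linear(u, v, coeffs):
    """Prover opens the F_2-linear combination w = sum_i c_i * u_i
    together with the matching tag sum_i c_i * v_i."""
    w, tag = 0, 0
    for c, u_i, v_i in zip(coeffs, u, v):
        if c:
            w ^= u_i
            tag ^= v_i
    return w, tag

def check_linear(delta, q, coeffs, w, tag):
    """Verifier checks sum_i c_i * q_i = w*Delta - tag."""
    key = 0
    for c, q_i in zip(coeffs, q):
        if c:
            key ^= q_i
    return key == ((delta if w else 0) ^ tag)
```

The check works because XOR-summing the relations \(q_i = u_i \varDelta + v_i\) over the selected indices collapses the \(\varDelta \)-terms to \(w \cdot \varDelta \), which is what makes the commitments linearly homomorphic.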

Using VOLEitH, Baum et al. realized a delayed VOLE functionality that allows the prover to generate the values \(u_i, v_i\) of the VOLE correlations independently of \(\varDelta , q_i\), which are generated later. This delayed VOLE functionality can in turn be realized from vector commitments (VCs). The main steps of the interactive ZK proof can be computed as before, and only after these, in the last stage of the protocol, the verifier chooses and sends to the prover the random value \(\varDelta \) of the correlation. At this point, \(\mathcal {P}\) opens the homomorphic commitments and sends \(\mathcal {V}\) information that allows it to reconstruct the \(q_i\)’s of the VOLE correlations, check the openings, and thus verify the proof. This guarantees public verifiability, as the final VOLE correlation is defined by the random value \(\varDelta \) chosen as the last step of the proof by the verifier, after all other proof messages have been fixed. Concretely, to obtain the desired soundness, it is necessary to run \(\tau \) instances of VOLEitH such that \(\tau \cdot k = \lambda \). The main steps of the resulting ZK proof using the VOLEitH technique are depicted in Fig. 3.

Fig. 3. Main steps of the VOLEitH-based Zero-Knowledge proof in FAEST

3 Improving Batch Vector Commitments

In this section, we present our results on batch vector commitments (VCs) in the random oracle (RO) model. We start by providing a formal definition of a batch all-but-one vector commitment scheme (BAVC) with abort in the opening phase. This can be used in FAEST, and more generally in VOLEitH-based protocols, as well as in most of the known MPC-in-the-head schemes. By making the properties of the underlying GGM-based instantiation explicit, we achieve an optimized construction that results in shorter signatures.

Informally, a batch all-but-one vector commitment scheme (BAVC) is a two-phase protocol between two PPT machines, a sender and a receiver. In the first phase, called the commitment phase, the sender commits to multiple vectors of messages while keeping them secret; in the second phase, the decommitment phase, all but one of the entries of each vector are opened. The vectors may have different lengths. We require the binding and hiding properties of regular commitments, and additionally also that the messages at the unopened indices remain hidden, even after opening all other indices of each committed vector. In addition, we do not allow the sender to choose the messages, which instead are just random elements from the message space \(\mathcal {M}\). This definition captures how vector commitments are used in MPC-in-the-head or VOLE-in-the-head constructions.

Let \(\tau \) be the number of vectors, and let the \(\alpha \)-th vector have length \(N_\alpha \) for \(\alpha \in [\tau ]\). We will denote by \(i_{\alpha }\) the index of vector \(\alpha \) that remains unopened and by I the vector \((i_1,\dots ,i_\tau )\) comprising all the indices that remain unopened.

Definition 2

(\(\textsf{BAVC}\) ). Let H be a random oracle. A (non-interactive) batch all-but-one vector commitment scheme \(\textsf{BAVC}\) (with message space \(\mathcal {M}\)) in the RO model is defined by the following PPT algorithms, where all of them have access to a RO, and obtain the security parameter \(1^\lambda \) as well as \(\tau ,N_1,\dots ,N_\tau \) as input:

  • \(\textsf{Commit}() \rightarrow (\textsf{com}, \textsf{decom}, (m_1^{(\alpha )}, \ldots , m_{N_\alpha }^{(\alpha )})_{\alpha \in [\tau ]})\) : output a commitment \(\textsf{com}\) with opening information \(\textsf{decom}\) for messages \((m_1^{(\alpha )}, \ldots , m_{N_\alpha }^{(\alpha )})_{\alpha \in [\tau ]}\) \(\in \) \(\mathcal {M}^{N_1 + \dots + N_\tau }\).

  • \(\textsf{Open}(\textsf{decom}, I) \rightarrow \textsf{decom}_I \vee \bot \) : On input an opening \(\textsf{decom}\) and the index vector \(I \in [N_1] \times \dots \times [N_\tau ]\), output \(\bot \) or an opening \(\textsf{decom}_I\) for I.

  • \(\textsf{Verify}(\textsf{com}, \textsf{decom}_I, I) \!\rightarrow \! ((m_j^{(\alpha )})_{j \in [N_\alpha ] \setminus \{i_{\alpha }\}})_{\alpha \in [\tau ]} \vee \bot \) : Given a commitment \(\textsf{com}\), an opening \(\textsf{decom}_I\) for an index vector I, as well as the index vector I, either output all messages \((m_j^{(\alpha )})_{j \in [N_\alpha ] \setminus \{i_{\alpha }\}}\) (accept the opening) or \(\bot \) (reject the opening).

We now define correctness for the commitment scheme. We allow the sender to potentially abort for certain choices of I during \(\textsf{Open}\). Looking ahead, this does not pose any problem if the abort probability is low, as aborts only happen during signature generation.

Definition 3

(Correctness with aborts). \(\textsf{BAVC}\) is correct with aborts if for all \(I \in [N_1]\times \dots \times [N_\tau ]\), the following outputs True

$$\begin{aligned} (\textsf{com}, \textsf{decom}, M) &\leftarrow \textsf{Commit}() \\ \textsf{decom}_I &\leftarrow \textsf{Open}(\textsf{decom}, I) \\ \text {output } \textsf{decom}_I = \bot &\vee \textsf{Verify}(\textsf{com}, \textsf{decom}_I, I) = M \end{aligned}$$

with all but negligible probability, where \(M = (m_1^{(\alpha )}, \ldots , m_{N_\alpha }^{(\alpha )})_{\alpha \in [\tau ]}\).

Informally, we say that a commitment scheme is extractable-binding if there exists an extractor \( \textsf{Ext}\) such that for any commitment opening, the extracted message is equal to the opened message. More formally, we have the following definition.

Definition 4

(Extractable-Binding). Let \(\textsf{BAVC}\) be defined as above in the RO-model with RO \(H\). Let \(\textsf{Ext}\) be a PPT algorithm such that

  • \(\textsf{Ext}( Q, \textsf{com}) \rightarrow ((m_j^{(\alpha )})_{j \in [N_\alpha ] })_{\alpha \in [\tau ]} \), i.e., given a set Q of query-response pairs of random oracle queries, and a commitment \(\textsf{com}\), \(\textsf{Ext}\) outputs the committed messages. (\(\textsf{Ext}\) may output \(m_j^{(\alpha )} = \bot \), e.g. if the committed value at this index is invalid.)

For any \(\tau ,N_\alpha = poly(\lambda )\), define the straightline extractable-binding game for \(\textsf{BAVC}\) and stateful adversary \(\mathcal {A}^{H}\) with oracle access to the random oracle \(H\) as follows:

  1. \(\textsf{com}\leftarrow \mathcal {A}^{H}(1^\lambda )\)

  2. \(((\overline{m}_1^{(\alpha )}, \ldots , \overline{m}_{N_\alpha }^{(\alpha )})_{\alpha \in [\tau ]}) \leftarrow \textsf{Ext}( Q, \textsf{com})\), where Q is the set \(\{(x_i, H(x_i))\}\) of query-response pairs of queries \(\mathcal {A}\) made to \(H\).

  3. \((((m_j^{(\alpha )})_{j \in [N_\alpha ]\setminus \{i_{\alpha }\}})_{\alpha \in [\tau ]}, \textsf{decom}_I, I) \leftarrow \mathcal {A}^{H}(\textsf{com})\).

  4. Output 1 (success) if: \(\textsf{Verify}(\textsf{com}, \textsf{decom}_I, I) = ((m_j^{(\alpha )})_{j \in [N_\alpha ] \setminus \{i_{\alpha }\}})_{\alpha \in [\tau ]}\), but \(m_j^{(\alpha )} \ne \overline{m}_j^{(\alpha )}\) for some \(\alpha \in [\tau ], j \in [N_\alpha ]\setminus \{i_{\alpha }\}\). Else output 0 (failure).

We say \(\textsf{BAVC}\) is straight-line extractable w.r.t. \( \textsf{Ext}\) if any PPT adversary \(\mathcal {A}\) has negligible probability of winning the extractable-binding game. We denote the advantage, i.e. the probability of winning, by \(\textsf{AdvEB}^{\textsf{BAVC}}_{\mathcal {A}}\).

We define the n-hiding real-or-random game where \(0< n\le \tau \). Here, the attacker has to guess if claimed committed values for the first n commitments at the hidden index are correct or not. We allow for a parameter n to permit hybrids in security proofs.

Definition 5

(Hiding (real-or-random)). Let \(\textsf{BAVC}\) be a vector commitment scheme in the RO-model with random oracle \(H\). The selective hiding experiment for \(\textsf{BAVC}\) with \(\tau ,N_\alpha = poly(\lambda )\), parameter n and stateful \(\mathcal {A}\) is defined as follows.

  1. \(\overline{b} \leftarrow \{0,1\}\)

  2. \((\textsf{com}, \textsf{decom}, (\overline{m}_1^{(\alpha )}, \ldots , \overline{m}_{N_\alpha }^{(\alpha )})_{\alpha \in [\tau ]}) \leftarrow \textsf{Commit}()\)

  3. \(I \leftarrow \mathcal {A}^{H}(1^\lambda , \textsf{com})\), where \(I\in [N_1]\times \dots \times [N_\tau ]\).

  4. \(\textsf{decom}_I \leftarrow \textsf{Open}(\textsf{decom}, I)\)

  5. \(m_j^{(\alpha )} \leftarrow \overline{m}_j^{(\alpha )}\) for \(j \in [N_\alpha ]\setminus \{i_{\alpha }\},\alpha \in [\tau ]\).

  6. Set \(m_{i_\alpha }^{(\alpha )}\leftarrow {\left\{ \begin{array}{ll} \text {random from } \mathcal {M}& \text { if } \overline{b}=0 \wedge \alpha \le n \\ \overline{m}_{i_\alpha }^{(\alpha )} & \text { otherwise } \end{array}\right. }\)

  7. \(b \leftarrow \mathcal {A}((m_j^{(\alpha )})_{j \in [N_\alpha ]},\textsf{decom}_I)\).

  8. Output 1 (success) if \(b = \overline{b}\), else 0 (failure).

The advantage \(\textsf{AdvSelHide}^{\textsf{BAVC}}_{\mathcal {A},i}\) of an adversary \(\mathcal {A}\) is defined as \(|\Pr \left[ \mathcal {A}\text { wins and }\right. \left. n=i\right] - \frac{1}{2}|\) in the hiding experiment. We say \(\textsf{BAVC}\) is selectively hiding if every PPT adversary \(\mathcal {A}\) has negligible advantage \(\textsf{AdvSelHide}^{\textsf{BAVC}}_{\mathcal {A},i}\) for all \(i\in [\tau ]\).

Note that the GGM-based VC scheme of [7] can be defined using our definitions as well. We show this in the full version of the paper.

3.1 Optimized Batch All-but-One Vector Commitments

The GGM-based [35] VC construction has been extensively used to build the commitment scheme both in MPCitH-based signature schemes like Picnic [24], BBQ [45], Banquet [13], and Helium [37], and in the VOLEitH-based FAEST. It expands a random seed into a tree of pseudorandom values by recursively applying a length-doubling pseudorandom generator (PRG) to each seed. To obtain a VC, the prover commits to the tree leaves, which represent one vector commitment towards the verifier. Then, at a later stage, it can reveal parts of the leaves by opening intermediate seeds (i.e. inner nodes of the tree), allowing the verifier to check the opening against the VC. MPCitH-based signatures usually generate a forest of \(\tau \) such trees in parallel, whose roots are generated from a single seed. This approach (which we recap in the full version) allows expressing \(\tau \) VCs as one BAVC.

One Big Tree Instead of \(\tau \) Small Ones. We now describe an optimization of this construction, where instead of generating a forest of \(\tau \) trees with \(N_1, \dots , N_\tau \) leaves each, we generate a single GGM tree with \(L = \sum _{i = 1}^\tau N_i\) leaves. Opening all but \(\tau \) leaves of the big tree is more efficient than opening all but one leaf in each of the \(\tau \) smaller trees, because with high probability some of the active paths in the tree will merge relatively close to the leaves, which reduces the number of internal nodes that need to be revealed. Importantly, we map entries of the individual vector commitments to the leaves of the tree in an interleaved fashion. The first \(\tau \) leaves of the tree correspond to the first entry of the \(\tau \) vector commitments, the next leaves correspond to the second entries, and so on. The other way around would force the \(\tau \) unopened leaves to be spaced far apart, which increases the number of nodes that need to be revealed. The number of internal nodes that need to be revealed depends on I, which would cause some variability in the size of the signature. To prevent this, we fix a threshold \(\mathsf {T_{open}}\) for the number of internal nodes in an opening, and we let the \(\textsf{Open}\) algorithm abort if the number of nodes exceeds \(\mathsf {T_{open}}\). The value of \(\mathsf {T_{open}}\) controls a trade-off between the opening size of \(\textsf{BAVC}\) and the success probability of \(\textsf{BAVC}. \textsf{Open}\).

Towards formalizing our optimized \(\textsf{BAVC}\) scheme \(\textsf{BAVC}_{\textsf{opt}}\), let \(\textsf{PRG}:\{0,1\}^{\lambda } \rightarrow \{0,1\}^{2\lambda }\) be a PRG, \(\textsf{H}:\{0,1\}^* \rightarrow \{0,1\}^{2\lambda }\) be a collision-resistant hash function (CRHF) and \(\textsf{G}:\{0,1\}^{\lambda } \rightarrow \{0,1\}^{\lambda } \times \{0,1\}^{2\lambda }\) be a PRG and CRHF. We define the scheme \(\textsf{BAVC}_{\textsf{opt}}\), which is parameterized by the number of vectors \(\tau \), the lengths of the vectors \(N_1,\dots ,N_\tau \), and the opening size threshold \(\mathsf {T_{open}}\). Let \(\pi : [L-1, 2L-2] \rightarrow \{(\alpha ,i) : \alpha \in [\tau ], 1 \le i \le N_\alpha \}\) be a bijective mapping from leaf indices of the GGM tree to positions in the vector commitments.

\(\textsf{Commit}()\):

  1. Sample \(k \leftarrow \{0,1\}^\lambda \) and let \(k_0 \leftarrow k\).

  2. For \(i \in [0,L-2]\), compute \((k_{2i+1}, k_{2i+2}) \leftarrow \textsf{PRG}(k_i)\) to create a tree with L leaves \(k_{L-1}, \dots , k_{2L-2}\).

  3. Deinterleave the leaves: \(\{\textsf{sd}_1^{(\alpha )}, \dots , \textsf{sd}_{N_{\alpha }}^{(\alpha )}\}_{\alpha \in [\tau ]\!} {\mathop {\leftarrow }\limits ^{\pi }} \{ k_{L-1}, \cdots , k_{2L - 2} \}\).

  4. Compute \((m_i^{(\alpha )}, {\textsf{com}}_i^{(\alpha )}) \leftarrow \textsf{G}(\textsf{sd}_i^{(\alpha )})\), for \(\alpha \in [\tau ]\) and \(i \in [N_\alpha ]\).

  5. Compute \(h^{(\alpha )} \leftarrow \textsf{H}({\textsf{com}}_1^{(\alpha )}, \dots , {\textsf{com}}_{N_\alpha }^{(\alpha )})\) for \(\alpha \in [\tau ]\) and \(h\leftarrow \textsf{H}(h^{(1)},\dots ,h^{(\tau )})\).

  6. Output the commitment \(\textsf{com}= h\), the opening \(\textsf{decom}= k\) and the messages \((m_1^{(\alpha )},\dots ,m_{N_\alpha }^{(\alpha )})_{\alpha \in [\tau ]}\).
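The commitment phase can be sketched in Python for equal vector lengths \(N_1 = \dots = N_\tau = N\), with SHAKE-128 and SHA-256 standing in for \(\textsf{PRG}\), \(\textsf{G}\) and \(\textsf{H}\) (concrete instantiations and domain separation in FAEST differ; this is only a structural sketch).

```python
import hashlib, secrets

LAM = 16  # seed length in bytes (toy security parameter)

def prg(k: bytes) -> tuple[bytes, bytes]:
    """Length-doubling PRG from SHAKE-128 (illustrative choice)."""
    out = hashlib.shake_128(b"prg" + k).digest(2 * LAM)
    return out[:LAM], out[LAM:]

def G(sd: bytes) -> tuple[bytes, bytes]:
    """Expand a leaf seed into a message m and a seed commitment com."""
    out = hashlib.shake_128(b"G" + sd).digest(3 * LAM)
    return out[:LAM], out[LAM:]

def H(*parts: bytes) -> bytes:
    h = hashlib.sha256()
    for p in parts:
        h.update(p)
    return h.digest()

def commit(tau: int, N: int):
    """BAVC commit with one tree of L = tau*N leaves; leaf
    L-1 + i*tau + alpha (0-based) holds entry i of vector alpha,
    i.e. the interleaved mapping pi."""
    L = tau * N
    k = [b""] * (2 * L - 1)
    k[0] = secrets.token_bytes(LAM)            # root seed = decom
    for i in range(L - 1):
        k[2 * i + 1], k[2 * i + 2] = prg(k[i])
    msgs, coms = [], []
    for alpha in range(tau):
        m_a, c_a = [], []
        for i in range(N):
            m, c = G(k[L - 1 + i * tau + alpha])  # deinterleave via pi
            m_a.append(m)
            c_a.append(c)
        msgs.append(m_a)
        coms.append(c_a)
    h = H(*[H(*c_a) for c_a in coms])          # com = h
    return h, k[0], msgs, coms
```

Note that the entire \(\textsf{decom}\) is just the single root seed k, since every node of the tree can be recomputed from it.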

\(\textsf{Open}(\textsf{decom}, I)\):

  1. Recompute \(k_j\) for \(j \in [0, 2L-2]\) from k as in \(\textsf{Commit}\).

  2. Let \(S=\{ k_{L-1},\dots ,k_{2L-2} \}\).

  3. For each \(\alpha \in [\tau ]\), remove \(k_{\pi ^{-1}(\alpha ,i^{(\alpha )})}\) from S.

  4. For i from \(L-2\) down to 0: if \(k_{2i+1} \in S\) and \(k_{2i+2} \in S\), then replace both with \(k_i\).

  5. If \(|S| \le \mathsf {T_{open}}\), output the opening information \(\textsf{decom}_I = (({\textsf{com}}_{i^{(\alpha )}}^{(\alpha )} )_{\alpha \in [\tau ]} ,S)\); otherwise output \(\bot \).
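The sibling-merging step of \(\textsf{Open}\) operates purely on node indices and can be sketched as follows; the second helper recovers which leaves remain derivable from the revealed set (helper names are ours).

```python
def open_nodes(L: int, tau: int, hidden: list[int], t_open: int):
    """Compute the set S of tree nodes revealed when all leaves except
    the tau `hidden` ones are opened; return None (abort) if |S| > t_open.
    Heap indexing: root 0, children of i are 2i+1 and 2i+2,
    leaves are the nodes L-1 .. 2L-2."""
    S = set(range(L - 1, 2 * L - 1)) - set(hidden)
    for i in range(L - 2, -1, -1):
        # If both children are revealed, reveal the parent instead.
        if 2 * i + 1 in S and 2 * i + 2 in S:
            S -= {2 * i + 1, 2 * i + 2}
            S.add(i)
    return S if len(S) <= t_open else None

def leaves_below(node: int, L: int) -> set[int]:
    """All leaf nodes derivable from `node` by PRG expansion."""
    frontier = {node}
    while any(n < L - 1 for n in frontier):
        frontier = {c for n in frontier
                    for c in ((2 * n + 1, 2 * n + 2) if n < L - 1 else (n,))}
    return frontier
```

As a worked example with L = 16 and \(\tau = 4\): when the hidden leaves are the adjacent nodes 15–18 (as the interleaved mapping produces when the same entry index is unopened in every vector), only the two internal nodes {2, 4} must be revealed, whereas spreading the hidden leaves four apart (15, 19, 23, 27) forces eight revealed nodes.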

\(\textsf{Verify}(\textsf{com}, \textsf{decom}_I, I)\):

  1. Recompute \(\textsf{sd}_i^{(\alpha )} \) from \(\textsf{decom}_I\), for each \(\alpha \in [\tau ]\) and \(i \ne i^{(\alpha )} \), using the available keys in S, and compute \(({m}_i^{(\alpha )} , {\textsf{com}}_i^{(\alpha )} ) \leftarrow \textsf{G}(\textsf{sd}_i^{(\alpha )} )\).

  2. Compute \(h^{(\alpha )}=\textsf{H}({\textsf{com}}_1^{(\alpha )}, \dots , {\textsf{com}}_{N_\alpha }^{(\alpha )})\) for each \(\alpha \in [\tau ]\).

  3. If \(h \ne \textsf{H}(h^{(1)},\dots ,h^{(\tau )})\), output \(\bot \).

  4. Output \((({m}_i^{(\alpha )})_{i \in [N_\alpha ]\setminus \{i^{(\alpha )}\}})_{\alpha \in [\tau ]}\).

Lemma 1

(Extractable Binding). Decompose \(\textsf{G}:\{0,1\}^{\lambda } \rightarrow \{0,1\}^{\lambda } \times \{0,1\}^{2\lambda }\) into \(\textsf{G}(x):=(\textsf{G}_1(x),\textsf{G}_2(x))\) and suppose \(\textsf{G}_2,\textsf{H}\) are straight-line extractable. Then \(\textsf{BAVC}_{\textsf{opt}}\) is straight-line extractable-binding according to Definition 4: given any adversary \(\mathcal {A}\) breaking the extractable binding of \(\textsf{BAVC}_{\textsf{opt}}\) with advantage \(\textsf{AdvEB}\), we can construct a PPT adversary breaking extractability of \(\textsf{G}_2,\textsf{H}\) with advantage

$$ \textsf{AdvEB}\le L \cdot \textsf{Adv}_{\textsf{G}_2}+(\tau +1)\cdot \textsf{Adv}_{\textsf{H}} \text {.}$$

Proof

The proof is similar to [7, Lemma 1]. The extractor \(\textsf{Ext}\), after obtaining \(\textsf{com}=h\), uses the straight-line extractability of \(\textsf{G}_2,\textsf{H}\): it first finds \(h^{(1)},\dots ,h^{(\tau )}\) which hash to h, and then \(\textsf{com}_{i}^{(\alpha )}\) for each \(i\in [N_\alpha ],\alpha \in [\tau ]\), in both cases using the extractability of \(\textsf{H}\). Then, it extracts \(\textsf{sd}_i^{(\alpha )}\) from \(\textsf{com}_i^{(\alpha )}\) using the extractability of \(\textsf{G}_2\), and computes \(m_i^{(\alpha )}\) using \(\textsf{G}_1\).

Assume \(\mathcal {A}\) breaks extractable binding, i.e. provides values during \(\textsf{Open}\) which differ from the extracted \(h^{(\alpha )},\textsf{com}_{i}^{(\alpha )}\), \(\textsf{sd}_i^{(\alpha )}\). Then, our constructed adversary will simply guess in advance at which index \(\mathcal {A}\) will break extractability of \(\textsf{G}_2,\textsf{H}\) and play the extractability game at that index. This guess leads to the loss outlined in the statement.

Lemma 2

(Selectively Hiding). Given any adversary \(\mathcal {A}\) breaking the selective hiding of \(\textsf{BAVC}_{\textsf{GGM}}\) for parameter n with advantage \(\textsf{AdvSelHide}_n\) we can construct a PPT adversary breaking the pseudorandomness of \(\textsf{G},\textsf{PRG}\) with advantage

$$ \textsf{AdvSelHide}_n \le \lceil \log _2(L)\rceil \cdot \textsf{Adv}_{\textsf{PRG}}+ \textsf{Adv}_{\textsf{G}} \text {.}$$

Proof

The proof is similar to [7, Lemma 2]. The GGM construction is a puncturable PRF [20], and the unopened index set I is known in advance for each commitment vector, in particular for vector n. One can therefore iteratively replace with uniformly random values the unopened PRG seeds \(k_i\) on the path from the root to \(\textsf{sd}_{i^{(n)}}^{(n)}\) that do not lie on the paths to \(\textsf{sd}_{i^{(1)}}^{(1)},\dots ,\textsf{sd}_{i^{(n-1)}}^{(n-1)}\), as well as the output of \(\textsf{G}(\textsf{sd}_{i^{(n)}}^{(n)})\). For this to be possible, we first fully randomize the seeds on the paths to \(\textsf{sd}_{i^{(1)}}^{(1)},\dots ,\textsf{sd}_{i^{(n-1)}}^{(n-1)}\), so that the hybrids distinguishing at indices n and \(n+1\) are meaningful. The bound then follows from the maximal number of hybrids.
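The GGM puncturing underlying this hybrid argument can be sketched as follows; SHAKE-256 stands in for the length-doubling PRG and the helper names are illustrative. Revealing the keys on the co-path of a leaf lets anyone recompute every leaf seed except the punctured one, while that leaf's seed remains pseudorandom, which is exactly what the hybrids exploit.

```python
import hashlib

def prg(key: bytes):
    """Length-doubling PRG, instantiated here with SHAKE-256 as a stand-in."""
    out = hashlib.shake_256(key).digest(32)
    return out[:16], out[16:]

def leaves(root: bytes, depth: int):
    """Derive all 2^depth leaf seeds of the GGM tree below `root`."""
    layer = [root]
    for _ in range(depth):
        nxt = []
        for k in layer:
            l, r = prg(k)
            nxt += [l, r]
        layer = nxt
    return layer

def copath(root: bytes, depth: int, hidden: int):
    """Keys of the siblings along the path to leaf `hidden` (punctured opening)."""
    out, key = [], root
    for level in range(depth):
        l, r = prg(key)
        bit = (hidden >> (depth - 1 - level)) & 1
        out.append(r if bit == 0 else l)   # reveal the sibling key
        key = l if bit == 0 else r         # descend towards the hidden leaf
    return out

def reconstruct(cop, depth: int, hidden: int):
    """Recompute every leaf except `hidden` from the co-path keys."""
    out = {}
    for level, key in enumerate(cop):
        sibling = (hidden >> (depth - 1 - level)) ^ 1  # node index at level+1
        width = 1 << (depth - 1 - level)               # leaves under that node
        for j, leaf in enumerate(leaves(key, depth - 1 - level)):
            out[sibling * width + j] = leaf
    return out
```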

4 Using BAVCs in FAEST

When generating a FAEST signature, the signing algorithm initially samples \(\tau \) independent VC instances, in order to set up the underlying VOLE-in-the-head-based ZK proof. Then, in the last round of the ZK proof (i.e., as output of the RO call to \(\textsf{H}_2^3\) which generates the last \(\lambda \)-bit challenge \(\textsf{chal}_3\)), the individual indices that are opened in each VC, i.e. the set \(I \in [N_1] \times \cdots \times [N_\tau ]\), are chosen using the injective decoding function \(\textsf{DecodeChallenge}\). Therefore, modifying the FAEST scheme to work with one BAVC instance instead of \(\tau \) independent VC instances is straightforward: sample the \(\tau \) vector commitments using one instance of \(\textsf{BAVC}\), and open them all simultaneously based on \(\textsf{chal}_3\). However, further modifications are necessary since \(\textsf{BAVC}\) does not necessarily enjoy perfect completeness, whereas the standard GGM-based VC used in FAEST does. Hence, it can happen that the signer cannot open a BAVC commitment for a given challenge \(\textsf{chal}_3\), in which case the signing attempt aborts.

We therefore make the following changes to the FAEST signing algorithm. To handle aborts in the \(\textsf{Open}\) algorithm, we add a counter value \(\textsf{ctr}\) to the input of \(\textsf{H}_2^3\). If the challenge \(\textsf{chal}_3\) decodes to a sequence of indices I for which \(\textsf{Open}\) fails, then the signing algorithm repeatedly increments \(\textsf{ctr}\) and hashes again until it reaches a challenge for which \(\textsf{Open}\) succeeds. The counter \(\textsf{ctr}\) is included in the signature to allow for efficient verification.
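A minimal sketch of this counter loop, with SHAKE-256 standing in for \(\textsf{H}_2^3\), a toy challenge length, and a hypothetical can_open predicate abstracting \(\textsf{DecodeChallenge}\) followed by \(\textsf{BAVC}.\textsf{Open}\):

```python
import hashlib
from itertools import count

def sign_challenge(transcript: bytes, can_open, lam_bytes: int = 16):
    """Grind H_2^3(transcript || ctr) until the challenge is openable.
    `can_open` abstracts DecodeChallenge followed by BAVC.Open."""
    for ctr in count():
        chal3 = hashlib.shake_256(
            transcript + ctr.to_bytes(4, "little")).digest(lam_bytes)
        if can_open(chal3):
            return chal3, ctr  # ctr goes into the signature

def verify_challenge(transcript: bytes, chal3: bytes, ctr: int,
                     lam_bytes: int = 16) -> bool:
    """The verifier recomputes the hash once, using the signed counter."""
    return chal3 == hashlib.shake_256(
        transcript + ctr.to_bytes(4, "little")).digest(lam_bytes)
```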

Security of the Modification. The FAEST scheme has been proven secure both in [7] and [9]. While the proof in [7] is more modular, it is more straightforward to argue security by modifying the proof of [9].

The proof of [9] shows that for every query to \(\textsf{H}_2^3\) there are at most 2 out of \(2^\lambda \) challenge responses that can lead to a forgery: challenges correspond one-to-one with field elements \(\varDelta \in \mathbb {F}_{2^\lambda }\), and to cheat, the adversary needs \(\varDelta \) to be a root of a non-zero quadratic polynomial in the QuickSilver check. The proof then applies a union bound over all Q queries to \(\textsf{H}_2^3\) to obtain the term \(Q/2^{\lambda -1}\) in the bound on the forgery probability of the adversary. The same proof strategy still works for the signing algorithm with counter, because for every query to \(\textsf{H}_2^3\) there are still at most two challenges that map to the roots of the QuickSilver polynomial.

Using Fewer and Shorter Vector Commitments. In the original FAEST scheme we need \(\prod _{\alpha =1}^{\tau } N_\alpha \ge 2^\lambda \), because the \(\lambda \)-bit challenges need to map injectively to index sequences \(I \in [N_1] \times \cdots \times [N_\tau ]\). In the setting with aborts, we only need the non-aborting challenges to map injectively to index sequences I. Therefore, as an additional optimization, we can reduce the number and/or the length of some of the vectors (reducing the signature size or the signing and verification time, respectively), at the cost of increasing the probability of a restart (which slows down signing). Concretely, we set parameters such that \(\sum _{\alpha =1}^{\tau } \log N_\alpha = \lambda - w\), and let \(I \leftarrow \textsf{DecodeChallenge}(\textsf{chal}_3)\) injectively decode the first \(\lambda -w\) bits of \(\textsf{chal}_3\). If any of the remaining w bits of \(\textsf{chal}_3\) are nonzero, or if \(\textsf{Open}(I)\) aborts, then the signing algorithm tries again with the next counter value. The verifier rejects the signature if the last w bits of \(\textsf{chal}_3\) are not all zero. Since there are still at most two challenges that map to the roots of the QuickSilver polynomial, this optimization does not affect the security proof. The relevant parts of the original and the optimized FAEST signing algorithms are given in Algorithm 1 and Algorithm 2 (Fig. 4). Another way to look at this optimization is that we give up w bits of security for efficiency, and regain them by making the prover solve a proof of work of expected cost \(2^w\) for each forgery attempt.
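The modified decoding can be sketched as follows, treating \(\textsf{chal}_3\) as an integer whose low \(\lambda -w\) bits are decoded into I and whose top w bits must all be zero; the bit ordering and the function name are illustrative, and each \(N_\alpha \) is assumed to be a power of two so that the decoding is injective:

```python
def decode_challenge(chal3: int, lam: int, Ns, w: int):
    """Decode the low lam-w bits of chal3 into I in [N_1] x ... x [N_tau];
    return None (forcing a retry with the next counter) if any of the
    top w bits is set. Injective on the non-aborting challenges."""
    if chal3 >> (lam - w):      # the last w bits must all be zero
        return None
    I = []
    for N in Ns:                # peel off log2(N) bits per commitment vector
        I.append(chal3 % N)
        chal3 //= N
    return I
```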

4.1 Benchmarking the Optimized FAEST and FAEST-EM

We call our optimized FAEST and FAEST-EM signature versions, which benefit from the improved BAVC constructions, FAESTER and FAESTER-EM respectively. They have additional parameters that allow fine-tuning their efficiency, and we describe their effects below.

With the non-optimized BAVC, the previous VOLEitH signatures FAEST and the recently proposed ReSolveD [25] can trade off signature size against signing/verification runtime only via \(\tau \), the number of “small” VOLEs. While flexible, such a trade-off yields an exponential relationship between the signature size and the signing time, as shown in Fig. 5.

Fig. 4. Signing with FAEST vs signing with FAESTER.

Fig. 5. FAEST(-EM) \(\tau \)-signature size and signing time trade-off.

With the optimized BAVC, our proposed signature schemes, including FAESTER, enjoy both improved performance and an improved signature size-runtime trade-off. Our experiments show an improvement in the signature size of around 10% for FAESTER when compared to FAEST, in the L1 setting, while maintaining a similar runtime, as shown in the trade-off plot in Fig. 5. As a direct consequence of this improvement, FAESTER is the first signature scheme using standard AES with a signature size of 4.5KB. Similarly, FAESTER-EM enjoys a signature size of less than 4KB, with similar signing times. We refer to the full version for FAESTER performance for the L3 and L5 security levels.

Figure 8 shows the benefits of our new optimized BAVC for different signature schemes. Table 4 presents our recommended parameter choices for different signature schemes. In the FAEST NIST submission [9], the slow and the fast versions, denoted (s) and (f) respectively, were determined only by \(\tau \), as shown in the first 4 rows. However, for the optimized FAESTER and FAESTER-EM, along with the proposed new signature schemes, we also consider the optimal w and \(\mathsf {T_{open}}\) parameters as described above. We refer to Table 5 for the FAESTER optimized implementation benchmarks.

5 New VOLEitH Signature Schemes

We present three new signature schemes constructed following in the footsteps of FAESTER with the optimized BAVC, but instantiated with different OWFs. The first two variants take advantage of the Rain and MQ OWFs, discussed in Sect. 2.1 and 2.1 respectively, to achieve the lowest signature sizes (less than 3 KB) among all MPCitH and VOLEitH signature schemes. The third variant uses AES but with a different approach to proving the S-box, which reduces signature sizes by up to around 5%. We also show how to tweak the original AES proof in FAEST to allow use of the full AES keyspace, instead of restricting to a subset of all keys.

5.1 MandaRain: VOLEitH + Rain

The MandaRain signature scheme uses two instantiations of the Rain OWF, namely Rain-3 and Rain-4, which use 3 and 4 rounds respectively. Rain's block size equals its security parameter \(\lambda \); thus, unlike FAEST and FAESTER, MandaRain can circumvent the need for multiple evaluations of the OWF. The parameters of Rain that we use for MandaRain can be found in Table 1.

Table 1. Rain Parameters
Table 2. MQ Parameters

We prove Rain using the VOLEitH NIZK proof as described in Sect. 2.2, with the optimized BAVC (Sect. 3.1). The prover uses as a witness the secret key k together with the internal state after each round, except for the last round, which can be derived from the public key. This gives a total witness length of \(l = r\lambda \) bits for r rounds, and proving consistency requires r multiplication checks in \(\mathbb {F}_{2^\lambda }\). See Table 3 for a summary of the non-linear complexity of the Rain-3 and Rain-4 OWFs. Compared to the other OWFs, Rain has the smallest number of non-linear constraints that must be checked in ZK, resulting not only in a very small signature size but also in competitive signing and verification times. Refer to Table 4 for details on the MandaRain parameters. As for FAEST, Fig. 6 presents the parameter-set exploration used to find the most suitable parameter sets for signature size/runtime trade-offs with and without the BAVC optimization. We see that the signature size can be as small as around 2.8KB for the same or better signing runtime. Refer to Table 5 for the MandaRain optimized implementation benchmarks at the L1 security level. For L3 and L5 benchmarks, refer to the full version.

Table 3. Non-linear complexity of VOLEitH signature schemes using different OWFs.

5.2 KuMQuat: VOLEitH + MQ

Using an OWF relying on the MQ problem (Sect. 2.1), we obtain the smallest witness size, and hence the smallest signature size, among all VOLEitH and MPCitH signature schemes.

Proving an MQ evaluation in VOLEitH is conceptually straightforward: the witness is the solution \({\textbf{x}}\in \mathbb {F}_q^n\) to the system of equations, and there are m quadratic constraints to verify. One challenge is that a naive approach using QuickSilver would require \(O(m n^2)\) multiplications in \(\mathbb {F}_{2^\lambda }\). In Sect. 2.1, we describe optimizations that reduce this to just \(O(m n^2 \log _2(q) / \lambda )\) multiplications.

Although the runtime of KuMQuat is not as fast as MandaRain's, it still has signing and verification speeds comparable to those of FAEST, for signatures of around half the size. Table 2 shows the MQ parameter choices for our experiments, chosen according to the security estimations from [14, 32]. We set \(m = n\) (as in MQOM) and work over a field \(\mathbb {F}_{2^k}\) for some k.

The field size of the MQ problem and the security level determine the choice of n (see Sect. 2.1), which in turn influences the key and signature sizes and the runtime, as shown in Table 5. We refer to Table 4 for the recommended parameter choice for the L1 security level. For L3 and L5 parameter choices, we refer to the full version. Note that the signature size of KuMQuat depends only mildly on the MQ parameters m, n. One could therefore choose to increase n, m to massively increase the margin of safety against MQ-solving attacks without growing the signature size much.

Table 4. VOLEitH signature schemes and their parameters. We denote the signature schemes as SCHEME-\(\lambda _\text {s/f}\). l is the number of VOLE correlations required for the NIZK proof. w and \(\mathsf {T_{open}}\) are the values for the optimized BAVC as described in Sect. 4. \(\tau \) is the number of VOLE repetitions determining the choice between s (slow) and f (fast) versions. \(k_0\) and \(k_1\) are bit lengths of small VOLEs. B is the padding parameter affecting the security of the VOLE check. Secret key (sk), public key (pk) and signature sizes are in bytes.
Fig. 6. MandaRain \(\tau \)-signature size and signing runtime trade-off.

The key difference between MQOM and KuMQuat is that, unlike the fully randomized linear combination of constraints in MQOM, we take several fixed linear combinations of constraints in KuMQuat (which can be precomputed because they are fixed), and only later take a random linear combination of these combined constraints as part of QuickSilver. Another way of looking at this (for \(q=2\)) is that we take the m-bit output of the MQ function and reinterpret it as a sequence of \(m/\lambda \) elements of \(\mathbb {F}_{2^\lambda }\), before taking a random linear combination as in MQOM. Having fewer constraints to randomly combine reduces the computational cost (assuming a faster-than-schoolbook implementation of extension field arithmetic) (Fig. 7).

Fig. 7. KuMQuat \(\tau \)-signature size and runtime trade-off.

Optimizations. One implementation difficulty with KuMQuat is the computational cost of the OWF. The MQ function itself has \(m n (n + 3)/2\) termsFootnote 7 (see Definition 1), each with coefficients in \(\mathbb {F}_q\), and evaluating the constraints with QuickSilver requires calculating the same number of terms over \(\mathbb {F}_{2^\lambda }\). While this seems to require \(\tilde{\varTheta }(m n^2 \lambda )\) work, we use an optimization to reduce this to just \(\tilde{\varTheta }(m n^2 \log _2 q)\).
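As a sanity check of this term count: each of the m rows has \(n(n+1)/2\) upper-triangular quadratic coefficients plus n linear ones, and \(n(n+1)/2 + n = n(n+3)/2\). The helper below is purely illustrative:

```python
def mq_term_count(m: int, n: int) -> int:
    """Coefficient terms of an MQ map F_q^n -> F_q^m with upper triangular A_i:
    n(n+1)/2 quadratic terms x_j x_k (j <= k) plus n linear terms per row."""
    return m * (n * (n + 1) // 2 + n)
```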

Instead of these m constraints (for \(i \in [m]\)) over \(\mathbb {F}_q\):

$$ 0 = \sum _{j k} \textbf{A}_{i j k} \, x_j x_k + \sum _j b_{i j} \, x_j - y_i , $$

we require that \(\mathbb {F}_{2^\lambda }\) is a degree \(r = \frac{\lambda }{\log _2(q)}\) field extension of \(\mathbb {F}_q\), and group the constraints into blocks of r:

$$ 0 = \sum _{i = r i'}^{r i' + r - 1} \alpha ^{i - r i'} \left( \sum _{j k} \textbf{A}_{i j k} \, x_j x_k + \sum _j b_{i j} \, x_j - y_i \right) , $$

where \(\alpha \) is a generator of \(\mathbb {F}_{2^\lambda }\) over \(\mathbb {F}_q\). These constraints are equivalent to the original ones, because \(\alpha ^0, \alpha ^1, \ldots , \alpha ^{r - 1}\) are linearly independent over \(\mathbb {F}_q\), since \(\mathbb {F}_{2^\lambda }\) is an r-dimensional vector space over \(\mathbb {F}_q\). Now, we can precompute this linear combination of constraints

$$\begin{aligned} \textbf{A}'_{i' j k} &= \sum _{i = 0}^{r - 1} \alpha ^i \textbf{A}_{(r i' + i) j k} \\ b'_{i' j} &= \sum _{i = 0}^{r - 1} \alpha ^i b_{(r i' + i) j} \\ y'_{i'} &= \sum _{i = 0}^{r - 1} \alpha ^i y_{r i' + i} \end{aligned}$$

to get \(\lceil m/r \rceil \) constraints over \(\mathbb {F}_{2^\lambda }\):

$$ 0 = \sum _{j k} \textbf{A}'_{i' j k} \, x_j x_k + \sum _j b'_{i' j} \, x_j - y'_{i'} . $$

Note that evaluating these constraints for QuickSilver now requires only \(\varTheta (m n^2 / r)\) operations over \(\mathbb {F}_{2^\lambda }\). Assuming \(\mathbb {F}_{2^\lambda }\) multiplication can be done in \(\tilde{\varTheta }(\lambda )\) time, this is \(\tilde{\varTheta }(m n^2 \log _2 q)\) time.
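For \(q=2\) this grouping is particularly simple: choosing \(\alpha \) as the polynomial X, the powers \(\alpha ^0,\dots ,\alpha ^{r-1}\) form the standard basis of \(\mathbb {F}_{2^r}\), so combining a block of r constraints amounts to bit-packing their \(\mathbb {F}_2\) residuals. The toy sketch below (with \(r=8\) standing in for \(r=\lambda \), and illustrative helper names) packs the evaluated residuals rather than precomputing \(\textbf{A}'\), which is equivalent by linearity:

```python
def eval_row(Ai, bi, x):
    """One MQ row over F_2: sum_{j<=k} A_i[j][k] x_j x_k + sum_j b_i[j] x_j."""
    n = len(x)
    acc = 0
    for j in range(n):
        for k in range(j, n):
            acc ^= Ai[j][k] & x[j] & x[k]
        acc ^= bi[j] & x[j]
    return acc

def pack(res, r=8):
    """Group residuals in blocks of r: with q = 2 and alpha = X, alpha^i * c
    is just c << i, so each block becomes one element of F_{2^r}."""
    return [sum(res[i] << (i - s) for i in range(s, min(s + r, len(res))))
            for s in range(0, len(res), r)]

def grouped_residuals(A, b, y, x, r=8):
    """Residuals of the ceil(m/r) grouped constraints; all zero iff x solves."""
    return pack([eval_row(Ai, bi, x) ^ yi for Ai, bi, yi in zip(A, b, y)], r)
```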

As a final optimization, note that if \(r \le m / r\) then there are exactly r elements \(\textbf{A}_{i j k}\) that get mapped into a single \(\textbf{A}'_{i' j k}\), and the transformation between them is bijective (and similarly for \({\textbf{b}}\) and y). Therefore, sampling all \(\textbf{A}'_{i'}\) uniformly at random from the subset of upper triangular matrices in \(\mathbb {F}_{2^\lambda }^{n \times n}\) is equivalent to sampling the original elements \(\textbf{A}_{i}\) uniformly from the upper triangular matrices in \(\mathbb {F}_q^{n \times n}\), for all except the very last \(i'\). To save computing this transformation, for all but the last \(i'\) we sample \(\textbf{A}'_{i'}\) and \({\textbf{b}}'_{i'}\) directly, instead of going through \(\textbf{A}_{r i'}, \ldots , \textbf{A}_{r i' + r - 1}\). Similarly, for \(i' \le m / r\) we also use \(y'_{i'}\) directly in the public key, rather than converting between them and the \({\textbf{y}}_i\)s.

5.3 Uniform AES Keys in FAEST

When using one-way functions based on AES or Rijndael, as in FAEST(ER) and FAEST(ER)-EM, the main challenge is proving consistency of the non-linear part of the S-box. We denote this by the function

$$\begin{aligned} S : x \mapsto x^{254} \in \mathbb {F}_{2^8} \end{aligned}$$

When proving AES in zero-knowledge, the committed witness is typically used to derive an input/output pair \((x,y) \in \mathbb {F}_{2^8}^2\) for each S-box, and the prover shows that \(y = S(x)\) by proving the degree-2 constraint \(xy = 1\). However, this only works when x, y are non-zero; this meant that prior works [7, 10, 26] had to restrict the set of AES keys to those where the input to every S-box is non-zero. This requires adding a rejection sampling step to key generation, and slightly reduces the entropy of the signing key, effectively reducing security by 1–2 bits [9, Sec. 10.3.4].

We observe that instead, \(y = S(x)\) can be proven for all values of \(x,y \in \mathbb {F}_{2^8}\), by the pair of constraints:

$$\begin{aligned} xy^2 = y \quad \wedge \quad x^2y = x \end{aligned}$$
(1)

where the first constraint guarantees that we cannot have \(x = 0 \wedge y \ne 0\), and the second ensures against \(y = 0 \wedge x \ne 0\).

While these constraints have degree 3 over \(\mathbb {F}_{2^8}\), when viewed over \(\mathbb {F}_2\), their degree is 2 (since squaring is \(\mathbb {F}_2\)-linear). In FAEST, the witness is initially committed to over \(\mathbb {F}_2\), and only lifted to \(\mathbb {F}_{2^8}\) for proving the S-box. So, we can easily modify it to prove (1) by linearly computing commitments to the bits of \(x^2\) and \(y^2\) over \(\mathbb {F}_2\), before lifting and proving the pair of degree-2 constraints over \(\mathbb {F}_{2^8}\). This doubles the number of constraints that are proven; however, in the end, only a random linear combination of all constraints is checked. This means that we can support uniform AES keys with no impact on proof size.
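The constraint pair is easy to check exhaustively. The sketch below implements \(\mathbb {F}_{2^8}\) multiplication modulo the AES polynomial \(x^8+x^4+x^3+x+1\) and verifies that (1) holds for \(y = x^{254}\) at every point, including \(x=0\); the function names are illustrative.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiplication in F_{2^8} modulo the AES polynomial x^8+x^4+x^3+x+1."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return p

def S(x: int) -> int:
    """The non-linear S-box core x -> x^254 (inversion, extended by S(0) = 0)."""
    y = 1
    for _ in range(254):
        y = gf_mul(y, x)
    return y

def constraints_hold(x: int, y: int) -> bool:
    """The pair (1): x*y^2 == y and x^2*y == x."""
    return (gf_mul(x, gf_mul(y, y)) == y and
            gf_mul(gf_mul(x, x), y) == x)
```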

We implemented this tweaked AES proof by modifying the FAEST implementation (FAEST(ER)-fullkey), and, using the same parameters as in FAEST(ER), observed no change in performance when running the signing and verification benchmarks, as shown in Table 5. This is because the main cost of FAEST is the PRG and hashing operations in the BAVC, so merely doubling the number of constraints does not noticeably affect performance. Moreover, due to the absence of rejection sampling when choosing the key, we get a runtime improvement in the key generation algorithm. A similar full-key tweak is also possible for the Rain OWF in MandaRain.

5.4 FAEST-d7: Proving AES via Degree-7 Constraints

We have also investigated an alternative approach to proving knowledge of a preimage for the AES-based OWFs, using higher degree constraints over \(\mathbb {F}_2\), rather than quadratic constraints over \(\mathbb {F}_{2^8}\). This allows us to use an AES witness of half the size, which in some cases reduces signature size.

FAEST-d7 is based on the variant of the QuickSilver proof system [52] that allows for proving arbitrary degree-d constraints on the committed witness. In particular, we use degree-7 constraints, since the AES S-box and its inverse can both be expressed as degree-7 circuits over \(\mathbb {F}_2\).Footnote 8 We combine this with a meet-in-the-middle idea: instead of committing to the AES state after every round, the prover only commits to the state of every other round. Given committed states \(s_i, s_{i+2}\), we can now prove consistency by verifying that \(R_i(s_i) = R_{i+1}^{-1}(s_{i+2})\), where \(R_i\) is the i-th round function. Each pair of neighbouring AES states can thus be verified with a single degree-7 QuickSilver check. The same idea can be applied to the S-boxes in the key schedule.
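The meet-in-the-middle check can be illustrated with a toy invertible round function; the 16-bit rounds below are a hypothetical stand-in for the AES rounds, chosen only so that both \(R_i\) and \(R_i^{-1}\) are cheap to evaluate:

```python
MASK = 0xFFFF  # toy 16-bit state

def R(i: int, s: int) -> int:
    """Toy invertible round: round-constant addition followed by a rotation."""
    s = (s + 0x9E37 * (i + 1)) & MASK
    return ((s << 5) | (s >> 11)) & MASK

def R_inv(i: int, s: int) -> int:
    s = ((s >> 5) | (s << 11)) & MASK
    return (s - 0x9E37 * (i + 1)) & MASK

def committed_states(s0: int, rounds: int):
    """The prover commits only to every other state: s_0, s_2, s_4, ..."""
    states = [s0]
    for i in range(rounds):
        states.append(R(i, states[-1]))
    return states[::2], states

def midpoint_checks(committed):
    """Check R_i(s_i) == R_{i+1}^{-1}(s_{i+2}) for each committed pair."""
    return all(R(2 * idx, committed[idx]) == R_inv(2 * idx + 1, committed[idx + 1])
               for idx in range(len(committed) - 1))
```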

Computational Efficiency. In QuickSilver, proving a degree-d circuit \(C(x_1,\dots ,x_n)\) requires expressing C as a sum of polynomials \(\sum _{i=0}^d f_i(x_1,\dots ,x_n)\), where each \(f_i\) contains only monomials of degree i. While the \(f_i\)’s need not be computed explicitly, the prover is required to evaluate each \(f_i\). It is not clear how efficiently this can be done for a complex function like the AES S-box.

We observe that it is not necessary to compute the \(f_i\)’s at all. Instead, to prove the degree-d circuit C, it suffices for the prover to compute the coefficients of the degree-d univariate polynomial \(g(y) = C(a_1 + b_1y, \dots , a_n + b_ny)\), for values \(a_i,b_i \in \mathbb {F}_{2^\lambda }\) known to the prover. Meanwhile, the verifier only needs to evaluate C at a single point. When C is the AES S-box, we estimate the cost for the prover at around 150 multiplications in \(\mathbb {F}_{2^\lambda }\). While this is much more than the cost of proving one multiplication in \(\mathbb {F}_{2^8}\), it is still insignificant compared with all of the PRG and hash calls used in the other components of FAEST. We include further details of this method in the full version of this paper.
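This coefficient computation can be sketched as follows, over a toy prime field rather than \(\mathbb {F}_{2^\lambda }\) (interpolation works identically over any field with at least \(d+1\) elements; the names are illustrative): the prover evaluates g at \(d+1\) points and Lagrange-interpolates the coefficients.

```python
P = 2**31 - 1  # toy prime field standing in for F_{2^lambda}

def g_coeffs(C, a, b, d):
    """Coefficients of g(y) = C(a_1 + b_1*y, ..., a_n + b_n*y), of degree <= d,
    recovered by evaluating C at d+1 points and Lagrange-interpolating."""
    xs = list(range(d + 1))
    ys = [C([(ai + bi * t) % P for ai, bi in zip(a, b)]) for t in xs]
    coeffs = [0] * (d + 1)
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        num, denom = [1], 1  # build prod_{k != j} (Y - x_k) coefficient-wise
        for k, xk in enumerate(xs):
            if k == j:
                continue
            new = [0] * (len(num) + 1)
            for i, c in enumerate(num):
                new[i] = (new[i] - c * xk) % P
                new[i + 1] = (new[i + 1] + c) % P
            num = new
            denom = denom * (xj - xk) % P
        scale = yj * pow(denom, P - 2, P) % P  # y_j / prod_{k != j} (x_j - x_k)
        for i, c in enumerate(num):
            coeffs[i] = (coeffs[i] + c * scale) % P
    return coeffs
```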

Signature Size. The main advantage of this approach is that the total witness size is halved, from e.g. \(l=1600\) to \(l=800\) at the 128-bit security level. However, this does not come for free, since proving degree-d relations with QuickSilver incurs a cost of \(d \tau \lambda \) bits in the signature size. Overall, when applied to FAEST variants with an l-bit witness, we reduce the signature size by \(\tau l/2 - 5 \tau \lambda \) bits. For the Even-Mansour 128-bit variants, we have \(l/2 = 5\lambda \), so there is no change in size. However, for the standard AES variants and the higher security Even-Mansour variants, we see a reduction of up to around 5%.
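The size accounting can be checked numerically. The helper below assumes the degree-2 baseline already pays \(2\tau \lambda \) bits, so the extra cost of degree \(d=7\) is \(5\tau \lambda \); the value \(\tau = 11\) in the test is only illustrative.

```python
def d7_saving_bits(l: int, lam: int, tau: int, d: int = 7) -> int:
    """Net signature-size saving of the d7 variant in bits: the witness halves
    (saving tau*l/2 bits) while degree-d QuickSilver adds (d-2)*tau*lam bits
    over the degree-2 proof, i.e. tau*(l/2 - 5*lam) for d = 7."""
    return tau * l // 2 - (d - 2) * tau * lam
```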

We have not implemented FAEST-d7, but show in Table 4 the signature sizes it obtains, as well as those of the FAESTER-d7 variant incorporating our GGM tree optimizations.

Table 5. Signing Time (ms), Verification Time (ms), and Signature Size (bytes) of different VOLEitH-based signature schemes (optimized implementations). Slow and fast versions are denoted with s and f respectively.

6 Broader Discussion

This section compares the existing VOLEitH and MPCitH signature schemes, including the candidates of NIST’s call for Additional Signatures, with our proposed optimized signature schemes.

Fig. 8. Signature size and runtime comparison between state-of-the-art MPCitH and VOLEitH signature schemes. The slow and fast versions are denoted with s and f respectively. Other special versions are denoted by their short names as per their publicly available specification. Due to the negligible difference in performance between the full-key and the reduced-key versions of FAEST and FAESTER, the full-key data points are not explicitly included in the figure for better readability and are represented by the reduced-key data points.

Benchmark Platform. To benchmark and compare all the implementations fairly, we evaluate only the most optimized implementations of the signature schemes that are openly available. For the NIST candidates, we refer to the submitted optimized implementations. We measure all the run times on a system with an AMD Ryzen 9 7900X 12-Core CPU, 128 GB memory and running Ubuntu 22.04.

Security Assumption. The choice of different OWFs allows for a wide variety of security assumptions to choose from when constructing a VOLEitH signature scheme. For example, using an AES-based OWF results in a highly conservative security guarantee at the cost of a performance penalty in terms of signature size and runtime. This trade-off is similar to that of previous state-of-the-art MPCitH signature schemes like BBQ, Banquet, and Helium, which relied on the standard AES OWF and naturally had larger signature sizes and runtimes than competing schemes relying on optimized but non-standard OWFs like Rainier [29] or Picnic [24]. Switching to the AES-EM construction for VOLEitH signatures does not give the most conservative security guarantees of standard AES; however, the general EM construction is already more than two decades old, guaranteeing security in a similar ballpark to AES while still improving both the signature size and runtime considerably. On the other side, optimized OWFs like Rain and AIM [39] are rather new and not yet well studied. For example, in 2023, AIM already witnessed two full-round attacks [41, 54], which were later fixed in AIM2 [38]. Due to the mitigation, as per the authors, the AIMER signature scheme using the AIM OWF suffers a runtime penalty of around 10%. This work does not consider using the AIM OWF for constructing a VOLEitH signature scheme, as we conjecture that it would lead to a worse runtime due to the large number of Mersenne exponentiations while still giving a signature size similar to Rain. On the other hand, when considering the KuMQuat signature scheme, we benefit from the MQ problem, which relies on a different hardness assumption, giving us more choices when compared to symmetric primitives like AES, Rain, or AIM. Similarly, in the recently proposed VOLEitH signature scheme ReSolveD [25], the OWF relies on the syndrome decoding problem.

Symmetric Key Primitives. FAEST’s zero-knowledge proofs are built out of pseudorandom generators and hash functions, and their instantiation is important for efficiency and security. For consistency with the FAEST and FAEST-EM proposal [9], we use AES-CTR everywhere a PRG is required, and the SHAKE hash function for all random oracle calls, including those at the leaves of the GGM tree.

Parameters. A careful choice of parameters, including the choice of OWF, is crucial for achieving the best performance of the signature scheme. In the previous sections, we extensively demonstrated the impact of w, \(\mathsf {T_{open}}\), and \(\tau \) on the signature size and runtime. Additionally, when considering the MQ OWF, the operational field (\(\mathbb {F}_q\)) is also a crucial factor determining the performance. For example, KuMQuat-\(2^1\)-\(\lambda \), operating in \(\mathbb {F}_2\), leads to the smallest signature size; however, it has the largest number of non-linear constraints among the proposed VOLEitH signature schemes, leading to long signing and verification runtimes. Alternatively, KuMQuat-\(2^{8}\)-\(\lambda \) leads to a larger signature size, due to more witness bits; however, the number of constraints is roughly 70% smaller, leading to a faster runtime than KuMQuat-\(2^1\)-\(\lambda \).

Key Sizes. The key sizes depend only on the underlying OWF and are not affected by the VOLEitH parameters. With the MQ OWF, for example, the operational field \(\mathbb {F}_q\) and \(\lambda \) determine the size of sk and pk. The key sizes of MandaRain are determined only by \(\lambda \).

Signature Size and Runtime. FAEST-EM requires 20–30% fewer non-linear constraints than FAEST, which directly influences both the signature size and the runtime, especially for the slow signature variant with a smaller signature size, as shown in Table 5. This also holds for MandaRain, which has the smallest number of non-linear constraints, enabling it to enjoy the smallest runtime along with the second-smallest signature size after our proposed KuMQuat signature scheme. Looking at the signature size/runtime trade-off, we conclude that MandaRain provides the better balance: it has a slightly larger signature size than KuMQuat but, to the best of our knowledge, the smallest runtime among all VOLEitH- and MPCitH-based signature schemes. We also looked into the possibility of using the NIST-standardized AsconFootnote 9 [28] as an OWF for constructing a VOLEitH signature scheme. However, due to the design structure of Ascon, our estimates showed that the signature size would be much worse than that of standard AES, even if we could design an Ascon-style permutation for the OWF.Footnote 10 One may also question the fitness of other symmetric primitives especially designed for use in MPC, homomorphic encryption (HE), and ZKP use cases. Even though several of these primitives focus on reducing the number of multiplications and their multiplicative depth, they are designed with adversaries of higher data complexity in mind. The larger number of rounds then required to guarantee security against key-recovery attacks increases the number of witness bits that must be communicated to the verifier. For MPCitH or VOLEitH signature schemes, however, an adversary knows only the public key, or one plaintext-ciphertext pair. Hence, VOLEitH- or MPCitH-friendly symmetric primitives like Rain and AIM assume that an adversary knows only the public key, allowing them to use as few as 3 rounds while guaranteeing security against key-recovery attacks.

For fairness, we compare only the optimized implementations of the signature schemes and thus could not include the recent VOLEitH signature ReSolveD [25], as, to the best of our knowledge, no optimized implementation of it exists at the time of writing. However, comparing the reference implementations of ReSolveD with FAEST and FAEST-EM, we conjecture that an optimized implementation of ReSolveD would be slower than at least MandaRain and FAESTER-EM, if not also FAESTER. In Fig. 8, we compare our proposed VOLEitH signature schemes with other competitive MPCitH and VOLEitH signature schemes. Here, KuMQuat provides the smallest signature size at a high runtime cost, whereas MandaRain provides the best signature size/runtime trade-off, enjoying the best runtime and a signature size second only to KuMQuat. Notably, both MandaRain and KuMQuat are the first VOLEitH signature schemes with signature sizes of less than 3 KB, which is also the lowest among all MPCitH signature schemes. FAESTER, using the optimized BAVC, for the first time achieves a signature size of 4.5 KB while still relying on standard AES. Similarly, FAESTER-EM enjoys a considerably smaller signature size of just 4.1 KB while relying on AES combined with the EM construction.