Elastic MSM: A Fast, Elastic and Modular Preprocessing Technique for Multi-Scalar Multiplication Algorithm on GPUs
Zero-knowledge proof (ZKP) is a cryptographic primitive that enables a prover to convince a verifier that a statement is true, without revealing any other information beyond the correctness of the statement itself. Due to its powerful capabilities, its most practical type, called zero-knowledge Succinct Non-interactive ARgument of Knowledge (zkSNARK), has been widely deployed in various privacy-preserving applications such as cryptocurrencies and verifiable computation. Although state-of-the-art zkSNARKs are highly efficient for the verifier, the computational overhead for the prover is still orders of magnitude too high to warrant use in many applications. This overhead arises from several time-consuming operations, including large-scale matrix-vector multiplication (MUL), number-theoretic transform (NTT), and especially the multi-scalar multiplication (MSM), which constitutes the largest proportion. Therefore, further efficiency improvements are needed.
In this paper, we focus on comprehensive optimization of the running time and storage space required by the MSM algorithm on GPUs. Specifically, we propose elastic MSM, a novel, modular, and adaptive parameter configuration technique that allows us to adjust the scale of MSM as desired by performing a corresponding amount of preprocessing. This technique enables us to fully unleash the potential of various efficient parallel MSM algorithms. We have implemented and tested elastic MSM over three prevailing parallel Pippenger algorithms on GPUs. Across various preprocessing space limitations (across various MSM scales), our constructions achieve up to about 1.90×, 1.08× and 1.36× (2.58×, 1.39× and 1.91×) speedup versus three state-of-the-art parallel Pippenger algorithms on GPUs, respectively.
From another perspective, elastic MSM could also be regarded as a preprocessing technique over the well-known Pippenger algorithm, which is modular and could be used to accelerate almost all the most advanced parallel Pippenger algorithms on GPUs. Meanwhile, elastic MSM provides an adaptive trade-off between the running time and the extra storage space needed by parallel Pippenger algorithms on GPUs. This is the first preprocessing technique to retain the improved MSM computation brought by preprocessing under varying storage space limitations. Specifically, across various preprocessing space limitations (across various MSM scales), our constructions achieve up to about 192× and 223× (159× and 174×) speedup versus two state-of-the-art preprocessing parallel Pippenger algorithms on GPUs, respectively.
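For readers unfamiliar with the bucket method that these parallel variants build on, the following is a minimal single-threaded Python sketch of Pippenger's algorithm. It uses plain integer addition as a stand-in for elliptic-curve group operations and does not include elastic MSM's preprocessing; all names and parameters are illustrative.

```python
import random

def msm_pippenger(scalars, points, c=4, bits=16):
    # Bucket (Pippenger) method for computing sum_i s_i * P_i.
    # The "group" here is toy integer addition, so s * P is plain multiplication;
    # in a real implementation points are curve points and + is point addition.
    total = 0
    for w in reversed(range(0, bits, c)):
        total *= 2 ** c  # shift accumulated windows up: c doublings in a real group
        buckets = [0] * (2 ** c)
        for s, p in zip(scalars, points):
            idx = (s >> w) & (2 ** c - 1)
            if idx:
                buckets[idx] += p  # one group addition per nonzero window digit
        # sum_b b * buckets[b] via running suffix sums: about 2 * 2^c additions
        running = window_sum = 0
        for b in reversed(range(1, 2 ** c)):
            running += buckets[b]
            window_sum += running
        total += window_sum
    return total

random.seed(0)
scalars = [random.getrandbits(16) for _ in range(100)]
points = [random.randrange(1000) for _ in range(100)]
assert msm_pippenger(scalars, points) == sum(s * p for s, p in zip(scalars, points))
```

The bucket step is what preprocessing techniques like elastic MSM target: trading stored precomputed multiples of the points against the per-window work.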
KpqClean Ver2: Comprehensive Benchmarking and Analysis of KpqC Algorithm Round 2 Submissions
Since 2022, the Korean Post-Quantum Cryptography (KpqC) Competition has been held. Among the Round 1 algorithms of KpqC, eight algorithms were selected in December 2023. To evaluate these algorithms, performance is a critical factor. However, the algorithms submitted to KpqC were evaluated in different development environments, so it is difficult to compare their performance fairly. In this paper, we introduce KpqClean ver2, the successor to the KpqClean project. KpqClean ver2 provides comprehensive benchmark analysis results for all KpqC Round 2 algorithms across various environments (Ryzen, Intel, and aarch64). This framework includes both a ``clean'' implementation and an ``avx2'' implementation of the KpqC Round 2 candidate algorithms. To benchmark the algorithms, we not only removed external library dependencies from each algorithm but also integrated the same source code for common algorithms (such as AES, SHA2, and SHAKE) to enable more accurate performance comparisons. The framework automatically recognizes the user's environment, providing easy benchmarking for all users without the need for separate settings. This study also includes memory usage analysis using Valgrind for each algorithm and function usage proportion analysis during the execution of each cryptographic algorithm using Xcode's profiling tool. Finally, we show the practical strength of the KpqC algorithms in terms of execution time and memory usage. These results can aid in understanding the performance of the KpqC finalists.
Automatic Quantum Multi-collision Distinguishers and Rebound Attacks with Triangulation Algorithm
In EUROCRYPT 2020, Hosoyamada and Sasaki found that differential paths with probability $2^{-2n/3}$ can be useful in quantum collision attacks, vs. $2^{-n/2}$ for classical collision attacks. This observation led to attacks on more rounds of some AES-like hash functions. In this paper, we quantize the multi-collision distinguisher proposed by Biryukov, Khovratovich, and Nikolić at CRYPTO 2009, and propose quantum multi-collision distinguishers. We use CP-tool to automatically search for the configurations for multi-collision distinguishers and rebound attacks by taking into account related-key/single-key differentials of the underlying block cipher. We apply our method to AES-like primitives including the block ciphers AES, Rijndael, and Saturnin, and the AES-based hashing modes AES-DM and AES-HCF.
Improved YOSO Randomness Generation with Worst-Case Corruptions
We study the problem of generating public unbiased randomness in a distributed manner within the recent You Only Speak Once (YOSO) framework for stateless multiparty computation, introduced by Gentry et al. in CRYPTO 2021. Such protocols are resilient to adaptive denial-of-service attacks and are, by their stateless nature, especially attractive in permissionless environments. While most works in the YOSO setting focus on independent random corruptions, we consider YOSO protocols with worst-case corruptions, a model introduced by Nielsen et al. in CRYPTO 2022.
Prior work on YOSO public randomness generation with worst-case corruptions designed information-theoretic protocols for $t$ corruptions with either $n=6t+1$ or $n=5t$ roles, depending on the adversarial network model. However, a major drawback of these protocols is that their communication and computational complexities scale exponentially with $t$. In this work, we complement prior inefficient results by presenting and analyzing simple and efficient protocols for YOSO public randomness generation secure against worst-case corruptions in the computational setting. Our first protocol is based on publicly verifiable secret sharing and uses $n=3t+2$ roles. Since this first protocol requires setup and somewhat heavy cryptographic machinery, we also provide a second lighter protocol based on ElGamal commitments and verifiable secret sharing which uses $n=5t+4$ or $n=4t+4$ roles depending on the underlying network model. We demonstrate the practicality of our second protocol by showing experimental evaluations, significantly improving over prior proposed solutions for worst-case corruptions, especially in terms of transmitted data size.
OPTIKS: An Optimized Key Transparency System
Key Transparency (KT) refers to a public key distribution system with transparency mechanisms proving its correct operation, i.e., proving that it reports consistent values for each user's public key. While prior work on KT systems has offered new designs to tackle this problem, relatively little attention has been paid to the issue of scalability. Indeed, it is not straightforward to actually build a scalable and practical KT system from existing constructions, which may be too complex, inefficient, or non-resilient against machine failures.
In this paper, we present OPTIKS, a full-featured and optimized KT system that focuses on scalability. Our system is simpler and more performant than prior work, with smaller storage overhead, while still meeting strong notions of security and privacy. Our design also incorporates a crash-tolerant and scalable server architecture, which we demonstrate by presenting extensive benchmarks. Finally, we address several real-world problems in deploying KT systems that have received limited attention in prior work, including account decommissioning and user-to-device mapping.
Cryptographic Analysis of the Bluetooth Secure Connection Protocol Suite
We give a cryptographic analysis of the Bluetooth Secure Connections Protocol Suite. Bluetooth supports several subprotocols, such as Numeric Comparison, Passkey Entry, and Just Works, in order to match the devices' different input/output capabilities.
Previous analyses (e.g., Lindell, CT-RSA'09, or Troncoso and Hale, NDSS'21) often considered (and confirmed) the security of single subprotocols only. Recent practically verified attacks, however, such as the Method Confusion Attack (von Tschirschnitz et al., S&P'21) against Bluetooth's authentication and key secrecy property, often exploit the bad interplay of different subprotocols. Even worse, some of these attacks demonstrate that one cannot prove the Bluetooth protocol suite to be a secure authenticated key exchange protocol.
We therefore aim at the best we can hope for and show that the protocol still matches the common key secrecy requirements of a key exchange protocol if one assumes a trust-on-first-use (TOFU) relationship. This means that the adversary needs to mount an active attack during the initial connection, otherwise the subsequent reconnections remain secure.
Investigating the cryptographic strength of the Bluetooth protocol, we also look into the privacy mechanism of address randomization in Bluetooth (which is only available in the Low Energy version). We show that the cryptography indeed provides a decent level of address privacy, although this does not rule out identification of devices via other means, such as physical characteristics.
Towards a Tightly Secure Signature in Multi-User Setting with Corruptions Based on Search Assumptions
This paper is a report on how we tackled constructing a digital signature scheme whose multi-user security with corruption can be tightly reduced to search assumptions. We fail to (dis)prove the statement but obtain the following new results:
- We reveal two new properties of signature schemes whose security cannot be tightly reduced to standard assumptions.
- We construct a new signature scheme. Its multi-user security with corruption is reduced to the CDH assumption (in the ROM), and its reduction loss is independent of the number of users but depends on the number of RO queries.
HELIOPOLIS: Verifiable Computation over Homomorphically Encrypted Data from Interactive Oracle Proofs is Practical
Homomorphic encryption (HE) enables computation on encrypted data, which in turn facilitates the outsourcing of computation on private data. However, HE offers no guarantee that the returned result was honestly computed by the cloud. In order to have such a guarantee, it is necessary to add verifiable computation (VC) into the system.
The most efficient recent works in VC over HE focus on verifying operations on the ciphertext space of the HE scheme, which usually lacks the algebraic structure that would make it compatible with existing VC systems. For example, multiplication of ciphertexts in the current most efficient HE schemes requires non-algebraic operations such as real division and rounding. Therefore, existing works for VC over HE have to either give up on those efficient HE schemes, or incur a large overhead (a number of constraints proportional to the size of the ciphertext ring) in order to emulate these non-algebraic operations.
In this work, we move away from that paradigm by placing the verification checks in the plaintext space of HE, all while the prover remains computing on ciphertexts. We achieve this by introducing a general transformation for Interactive Oracle Proofs (IOPs) to work over HE, whose result we denote as HE-IOPs. We apply this same transformation to the FRI [Ben-Sasson et al., ICALP 2018] IOP of proximity and we show how to compile HE-Reed Solomon-encoded IOPs and HE-$\delta$-correlated-IOPs with HE-FRI into HE-IOPs. Furthermore, our construction is compatible with a prover that provides input in zero-knowledge, and only relies on building blocks that are plausibly quantum-safe.
Aligning the security parameters of HE and FRI is a difficult task for which we introduce several optimizations. We demonstrate their efficiency with a proof-of-concept implementation and show that we can run FRI's commit phase for 4096 encrypted Reed Solomon codewords with degree bound $2^{11}$ in just 5.4 seconds (using 32 threads) on a c6i.metal instance using less than 4GB of memory. Verification takes just 12.3 milliseconds (single-threaded) for the same parameter set and can be reduced to just 5.6ms with parameters optimized for the verifier.
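To illustrate the commit-phase operation in question (here over plaintexts, not the paper's HE setting), below is a toy Python sketch of one FRI folding step over a small prime field: writing $f(X) = f_e(X^2) + X f_o(X^2)$ and combining with a verifier challenge $r$ halves both the degree bound and the evaluation domain. All parameters and names are illustrative.

```python
def find_root(q, order):
    # brute-force an element of exact multiplicative order `order` in F_q
    # (requires order | q - 1; here order is a power of two)
    for g in range(2, q):
        x = pow(g, (q - 1) // order, q)
        if pow(x, order // 2, q) != 1:
            return x

def fri_fold(codeword, domain, r, q):
    # One FRI folding step: given values of f on a multiplicative domain
    # ordered so that domain[i + m/2] = -domain[i], return the values of
    # f_e + r * f_o on the squared, half-size domain.
    half = len(domain) // 2
    inv2 = pow(2, q - 2, q)
    out = []
    for i in range(half):
        x = domain[i]
        fx, fmx = codeword[i], codeword[i + half]        # f(x) and f(-x)
        fe = (fx + fmx) * inv2 % q                       # f_e(x^2)
        fo = (fx - fmx) * inv2 * pow(x, q - 2, q) % q    # f_o(x^2)
        out.append((fe + r * fo) % q)
    return out

q = 97                                       # toy field; the paper uses far larger rings
w = find_root(q, 8)                          # w^4 = -1, so domain[i+4] = -domain[i]
domain = [pow(w, i, q) for i in range(8)]
a, b, r = 3, 5, 11
codeword = [(a + b * x) % q for x in domain]          # f(X) = a + bX, degree 1
folded = fri_fold(codeword, domain, r, q)
assert all(v == (a + r * b) % q for v in folded)      # linear -> constant after one fold
```

HELIOPOLIS runs this same folding logic homomorphically on encrypted codewords, which is why aligning the HE and FRI parameters is the delicate part.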
Plaintext-Ciphertext Matrix Multiplication and FHE Bootstrapping: Fast and Fused
Homomorphically multiplying a plaintext matrix with a ciphertext matrix (PC-MM) is a central task for the private evaluation of transformers, commonly used for large language models. We provide several RLWE-based algorithms for PC-MM that consist of multiplications of plaintext matrices (PP-MM) and comparatively cheap pre-processing and post-processing steps: for small and large dimensions compared to the RLWE ring degree, and with and without precomputation. For the algorithms with precomputation, we show how to perform a PC-MM with a single floating-point PP-MM of the same dimensions. This is particularly meaningful for practical purposes as a floating-point PP-MM can be implemented using high-performance BLAS libraries.
The algorithms rely on the multi-secret variant of RLWE, which makes it possible to represent multiple ciphertexts more compactly. We give algorithms to convert from usual shared-secret RLWE ciphertexts to multi-secret ciphertexts and back. Further, we show that this format is compatible with homomorphic addition, plaintext-ciphertext multiplication, and key-switching. This in turn allows us to accelerate the slots-to-coeffs and coeffs-to-slots steps of CKKS bootstrapping when several ciphertexts are bootstrapped at once. Combining batch-bootstrapping with efficient PC-MM results in MaMBo (Matrix Multiplication Bootstrapping), a bootstrapping algorithm that can perform a PC-MM at limited overhead.
Registered (Inner-Product) Functional Encryption
Registered encryption (Garg $et\ al.$, TCC'18) is an emerging paradigm that tackles the key-escrow problem associated with identity-based encryption by replacing the private-key generator with a much weaker entity known as the key curator. The key curator holds no secret information, and is responsible for:
(i) updating the master public key whenever a new user registers its own public key to the system;
(ii) providing helper decryption keys to the users already registered in the system, in order to still enable them to decrypt after new users join the system.
For practical purposes, tasks (i) and (ii) need to be efficient, in the sense that the size of the public parameters, of the master public key, and of the helper decryption keys, as well as the running times for key generation and user registration, and the number of updates, must be small.
In this paper, we generalize the notion of registered encryption to the setting of functional encryption (FE). As our main contribution, we show an efficient construction of registered FE for the special case of ($attribute\text{-}hiding$) inner-product predicates, built over asymmetric bilinear groups of prime order. Our scheme supports a $large$ attribute universe and is proven secure in the bilinear generic group model. We also implement our scheme and experimentally demonstrate that it meets the efficiency requirements of the registered setting. Our second contribution is a feasibility result where we build registered FE for $\mathsf{P}/\mathsf{poly}$ based on indistinguishability obfuscation and somewhere statistically binding hash functions.
The Espresso Sequencing Network: HotShot Consensus, Tiramisu Data-Availability, and Builder-Exchange
Building a Consensus platform for shared sequencing can power an ecosystem of layer-2 solutions such as rollups, which are crucial for scaling blockchains (e.g., Ethereum). However, it drastically differs from conventional Consensus for blockchains in two key considerations:
• (No) Execution: A shared sequencing platform is not responsible for pre-validating blocks nor for processing state updates. Therefore, agreement is formed on a sequence of certificates of block data-availability (DA) without persisting them or obtaining blocks in full. At the same time, the platform must stream block data with very high efficiency to layer-2 entities for execution, or (in the case of rollups) for proof generation.
• Builder-Exchange: A shared sequencing platform delegates block building to external entities, separating it from the role of a consensus proposer. This allows an ecosystem of specialized builders to pre-validate transactions for diversified rollups, languages, and MEV exploits. However, separating the task of block-building from proposing brings a new challenge. Builders want assurances that their blocks will be committed in exchange for revealing their contents, whereas validators/proposers want assurance that the data in committed blocks will be available and fees paid. Neither one trusts the other, hence the shared sequencing platform should facilitate a “fair-exchange” between builders and the sequencing network. The Espresso Sequencing Network is purpose-built to address these unique considerations.
Among the main novelties of the design are (i) a three-layered DA system called Tiramisu, coupled with (ii) a costless integration of the DA with the platform’s consensus core, and (iii) a Builder-Exchange mechanism between builders and the consensus core.
Note that this paper relies substantially on and can be seen as an extension of The Espresso Sequencer: HotShot Consensus and Tiramisu Data Availability [84].
Password-authenticated Cryptography from Consumable Tokens
Passwords are widely adopted for user authentication in practice, which has led to the question of whether we can bootstrap a strongly-secure setting based on them. Historically, this has been extensively studied for key exchange: bootstrapping from a low-entropy password to a high-entropy key that secures the communication. Other instances include digital lockers, signatures, secret sharing, and encryption.
Motivated by a recent work on consumable tokens (Almashaqbeh et al., Eurocrypt 2022), we extend these efforts and investigate the unified notion of password-authenticated cryptography, in which knowing a password allows executing cryptographic functionalities. Our model is resistant to exhaustive search attacks due to the self-destruction and unclonability properties of consumable tokens. We study two directions. The first is password-authenticated delegation of cryptographic capabilities, in which a party can delegate her rights, e.g., for signing or encryption/decryption, to another party such that exercising the delegation requires knowing a password. The second direction is password-authenticated MPC, in which only participants who share the correct password can execute the MPC protocol. In both cases, an adversary who does not know the password can try a few guesses, after which the functionality self-destructs.
We formally define the notions above and build constructions realizing them. Our primary goal in this work is examining the power of consumable tokens in building password-authenticated cryptography in terms of viable constructions and supported adversary models, and thus, outlining open problems and potential future work directions.
Return of the Kummer: a Toolbox for Genus-2 Cryptography
This work expands the machinery we have for isogeny-based cryptography in genus 2 by developing a toolbox of several essential algorithms for Kummer surfaces, the dimension-2 analogue of $x$-only arithmetic on elliptic curves. Kummer surfaces have been suggested in hyperelliptic curve cryptography since at least the 1980s, and recently these surfaces have reappeared to efficiently compute $(2,2)$-isogenies. We construct several essential analogues of techniques used in one-dimensional isogeny-based cryptography, such as pairings, deterministic point sampling and point compression, and give an overview of $(2,2)$-isogenies on Kummer surfaces. We furthermore show how Scholten's construction can be used to transform isogeny-based cryptography over elliptic curves over $\mathbb{F}_{p^2}$ into protocols over Kummer surfaces over $\mathbb{F}_{p}$.
As an example of this approach, we demonstrate that SQIsign verification can be performed completely on Kummer surfaces, and, therefore, that one-dimensional SQIsign verification can be viewed as a two-dimensional isogeny between products of elliptic curves. Curiously, the isogeny is then defined over $\mathbb{F}_{p}$ rather than $\mathbb{F}_{p^2}$. Contrary to expectation, the cost of SQIsign verification using Kummer surfaces does not explode: verification costs only 1.5$\times$ more in terms of finite field operations than the SQIsign variant AprèsSQI, optimised for fast verification. Furthermore, it is plausible that arithmetic on Kummer surfaces can be efficiently vectorised, giving Kummer-based protocols over $\mathbb{F}_{p}$ a potential performance boost on modern architectures, possibly surpassing the performance of elliptic-curve analogues over $\mathbb{F}_{p^2}$.
Functional Commitment Schemes: From Polynomial Commitments to Pairing-Based Accumulators from Simple Assumptions
We formalize a cryptographic primitive called functional commitment (FC), which can be viewed as a generalization of vector commitments (VCs), polynomial commitments, and many other special kinds of commitment schemes. A non-interactive functional commitment allows committing to a message in such a way that the committer has the flexibility of only revealing a function $F(M)$ of the committed message during the opening phase. We provide constructions for the functionality of linear functions, where messages consist of vectors of $n$ elements over some domain $\mathcal{D}$ (e.g., $\vec{m} = (m_1,\ldots,m_n) \in \mathcal{D}^n$) and commitments can later be opened to a specific linear function $\sum_{i=1}^n m_i x_i = y \in \mathcal{R}$ of the vector coordinates. An opening for a function $F: \mathcal{D}^n \rightarrow \mathcal{R}$ thus generates a witness for the fact that $F(\vec{m})$ indeed evaluates to $y$. One security requirement, called \emph{function binding}, requires that it be infeasible to open a commitment to two different evaluations $y,y'$ for the same function $F$.
We propose a construction of functional commitment (FC) for linear functions based on constant-size assumptions in composite order groups endowed with a bilinear map. The construction has commitments and openings of constant size (i.e., independent of $n$ or the function description) and is \emph{perfectly hiding} -- the underlying message is information-theoretically hidden. Our security proofs build on the Déjà Q framework of Chase and Meiklejohn (Eurocrypt 2014) and its extension by Wee (TCC 2016) to encryption primitives, thus relying on constant-size subgroup decisional assumptions. We show that FCs for linear functions are sufficiently powerful to solve four open problems. They first imply polynomial commitments, and then give cryptographic accumulators (i.e., an algebraic hash function which makes it possible to efficiently prove that some input belongs to a hashed set). In particular, specializing our FC construction leads to the first pairing-based polynomial commitments and accumulators for large universes known to achieve security under simple assumptions. We also substantially extend our pairing-based accumulator to handle subset queries, which requires a non-trivial extension of the Déjà Q framework.
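As a worked mini-example of the first implication (FCs for linear functions imply polynomial commitments): opening a committed coefficient vector to the linear function defined by the powers of an evaluation point is exactly polynomial evaluation. The Python sketch below checks this identity over a toy field; it shows only the reduction, not the pairing-based scheme itself, and all names are illustrative.

```python
P = 2**31 - 1  # toy prime field; the actual scheme works in bilinear groups

def inner_product(m, xvec, p=P):
    # the linear function F(m) = <m, xvec> that an FC opening certifies
    return sum(a * b for a, b in zip(m, xvec)) % p

def horner_eval(m, x, p=P):
    # ordinary evaluation of p(X) = sum_i m_i X^i at X = x
    acc = 0
    for c in reversed(m):
        acc = (acc * x + c) % p
    return acc

m = [3, 1, 4, 1, 5, 9, 2, 6]                   # committed coefficient vector
x = 12345                                      # evaluation point
xvec = [pow(x, i, P) for i in range(len(m))]   # F_x = <., (1, x, ..., x^{n-1})>
assert inner_product(m, xvec) == horner_eval(m, x)
```

An FC opening for $F_x$ therefore certifies $p(x)=y$, which is precisely the polynomial-commitment interface.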
A Survey on SoC Security Verification Methods at the Pre-silicon Stage
This paper presents a survey of the state-of-the-art pre-silicon security verification techniques for System-on-Chip (SoC) designs, focusing on ensuring that designs, implemented in hardware description languages (HDLs) and synthesized circuits, meet security requirements before fabrication in semiconductor foundries. Due to several factors, pre-silicon security verification has become an essential yet challenging aspect of the SoC hardware lifecycle. The modern SoC design process often adheres to a design-reuse philosophy, integrating multiple functional blocks or Intellectual Property (IP) cores sourced from various vendors onto a single chip. While beneficial for reducing costs and accelerating time-to-market, this approach introduces numerous untrustworthy third-party entities into the supply chain and significantly increases the potential for introducing security vulnerabilities. Additionally, hardware fabrication, assembly, and testing are frequently outsourced to third-party entities, further exacerbating security risks. Moreover, the growing complexity of SoC designs leads to unanticipated interactions between hardware and software layers, creating potential gateways for attackers to exploit and steal confidential information from devices. In response to these challenges, recent years have seen a surge in the development of innovative SoC security verification techniques. This survey provides an overview of these methods, their high-level working principles, strengths, and weaknesses. By understanding these techniques, designers can better evaluate their effectiveness and select the most appropriate methods aligned with the specific security objectives for their SoC designs.
Quantum Key Recovery Attacks on 4-round Iterated Even-Mansour with Two Keys
In this paper, we propose quantum key recovery attacks on 4-round iterated Even-Mansour (IEM) with a key schedule that applies two keys alternately.
We first show that the encryption function and internal permutations can be used to construct a conditional periodic function, in which one of the secret keys appears as a period under a certain condition.
By applying the offline Simon's algorithm to this function, we construct a key recovery attack with a complexity of $O(\sqrt{N} \log N)$ for $N = 2^n$, where $n$ is the block size as well as the size of each secret key.
Using quantum queries, this attack outperforms the generic quantum attack, i.e., Grover's search, which has a time complexity of $O(N)$.
Moreover, we propose a quantum version of the multibridge attack proposed by Dinur et al. at ASIACRYPT 2014 to analyze the 4-round IEM. As a result, we show that the quantum multibridge attack can achieve the optimal complexity of $O(N)$ with only $O(1)$ data and no quantum queries, while the classical attack requires $O(N)$ data to achieve the same time complexity.
Furthermore, we show that the quantum multibridge attack slightly outperforms Grover's search when considering the quantum circuit depth for these attacks.
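For reference, the construction under attack can be sketched as follows. This is a minimal Python sketch assuming the natural alternating schedule $(k_1, k_2, k_1, k_2, k_1)$ over four public random permutations (the paper's exact indexing may differ) and a toy block size.

```python
import random

def iem4(x, k1, k2, perms):
    # 4-round iterated Even-Mansour with two alternating keys:
    # x -> +k1 -> P1 -> +k2 -> P2 -> +k1 -> P3 -> +k2 -> P4 -> +k1
    keys = [k1, k2, k1, k2, k1]
    y = x ^ keys[0]
    for perm, k in zip(perms, keys[1:]):
        y = perm[y] ^ k
    return y

n = 16  # toy block size; total key material is 2n = 32 bits
random.seed(1)
perms = []
for _ in range(4):  # four fixed public random permutations on n-bit blocks
    p = list(range(2 ** n))
    random.shuffle(p)
    perms.append(p)
k1, k2 = random.getrandbits(n), random.getrandbits(n)
ct = iem4(0x1234, k1, k2, perms)
```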
Mova: Nova folding without committing to error terms
We present Mova, a folding scheme for R1CS instances that does not require committing to error or cross terms, nor makes use of the sumcheck protocol. We compute concrete costs and provide benchmarks showing that, for reasonable parameter choices, Mova's Prover is about $5$ to $10$ times faster than Nova's Prover, and about $1.05$ to $1.3$ times faster than Hypernova's Prover (applied to R1CS instances) -- assuming the R1CS witness vector contains only small elements. Mova's Verifier has a cost similar to Hypernova's Verifier, but Mova has the advantage of having only $3$ rounds of communication, while Hypernova has a logarithmic number of rounds.
Mova, which is based on the Nova folding scheme, manages to avoid committing to Nova's so-called error term $E$ and cross term $T$ by replacing said commitments with evaluations of the Multilinear Extension (MLE) of $E$ and $T$ at a random point sampled by the Verifier. A key observation used in Mova's soundness proofs is that $E$ is implicitly committed by a commitment to the input-witness vector $Z$, since $E=(A\cdot Z)\circ (B\cdot Z) -u (C\cdot Z)$.
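The multilinear extension evaluation that stands in for the commitment to $E$ is the standard one. A short Python sketch (folding one variable per pass, in $O(|E|)$ field operations; names are illustrative):

```python
def mle_eval(v, r, q):
    # Evaluate the multilinear extension of v (length 2^k) at r in F_q^k:
    #   MLE(v)(r) = sum_{x in {0,1}^k} v[x] * prod_i (r_i x_i + (1-r_i)(1-x_i)),
    # computed by fixing one variable per pass (low-order bit first).
    vals = [x % q for x in v]
    for ri in r:
        vals = [(vals[2*i] * (1 - ri) + vals[2*i + 1] * ri) % q
                for i in range(len(vals) // 2)]
    return vals[0]

q = 2**31 - 1
E = [5, 7, 11, 13]                     # a length-2^2 vector, e.g., an error term
assert mle_eval(E, [1, 0], q) == E[1]  # MLE agrees with E on Boolean points
```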
We also note that ProtoGalaxy [EG23] can be specialized to an R1CS folding scheme with similar properties. Our further contributions are that 1) Mova is described in a language that sheds new insights into the topic of "Nova-style folding"; 2) we provide concrete costs, benchmarks, and optimizations for the Prover; 3) we describe how to fold two accumulated instances (which is important for applications in Proof Carrying Data); and 4) we provide non-trivial knowledge soundness proofs in the context of multilinear polynomials.
Secure Multiparty Computation with Identifiable Abort from Vindicating Release
In the dishonest-majority setting, secure multiparty computation (MPC) with identifiable abort (IA) guarantees that honest parties can identify and agree upon at least one cheating party if the protocol does not produce an output. Known MPC constructions with IA rely on generic zero-knowledge proofs, adaptively secure oblivious transfer (OT) protocols, or homomorphic primitives, and thus incur a substantial penalty with respect to protocols that abort without identifiability.
We introduce a new, weaker notion of IA called input-revealing IA (IRIA), which can be constructed through selective revealing of committed input values - a technique we call vindicating release. We show that this weaker form of IA can be achieved with small concrete overheads for many interesting protocols in the literature, including the pre-processing protocols needed for several state-of-the-art MPC protocols.
We next show how to assemble these IRIA components into an MPC protocol for any functionality with standard IA. Such a realization differs minimally in terms of cost, techniques, and analysis from the equivalent realization that lacks identifiability, e.g., our total bandwidth overhead is less than 2x, which is an asymptotic improvement over prior work on IA.
On a practical level, we apply our techniques to the problem of threshold ECDSA, and show that the resulting protocol with standard IA is concretely efficient. On a theoretical level, we present a compiler that transforms any secure protocol into one with standard IA assuming only a variant of statically-corruptable ideal OT.
Robust but Relaxed Probing Model
Masking has become a widely applied and heavily researched method to protect cryptographic implementations against SCA attacks. The success of masking is primarily attributed to its strong theoretical foundation enabling it to formally prove security by modeling physical properties through so-called probing models. Specifically, the robust $d$-probing model enables us to prove the security for arbitrarily masked hardware circuits, manually or with the assistance of automated tools, even when considering the imperfect nature of physical hardware, including the occurrence of physical defaults such as glitches. However, the generic strategy employed by the robust $d$-probing model comes with a downside: it tends to over-conservatively model the information leakage caused by glitches, meaning that the robust $d$-probing model considers glitches that can never occur in practice. This implies that in theory, an adversary could gain more information than she would obtain in practice. From a designer's perspective, this entails that (1) securely designed hardware circuits may need to be withdrawn due to potential insecurity under the robust $d$-probing model and (2) designs that satisfy the security requirements of the robust $d$-probing model may incur unnecessary overhead, such as increased circuit size or latency.
In this work, we refine the formal treatment of glitches within the robust $d$-probing model to address glitches more accurately within a formal adversary model. Unlike the robust $d$-probing model, our approach considers glitches based on the operations performed and the data processed, ensuring that only manifesting glitches are accounted for. As a result, we introduce the RR $d$-probing model, a formal adversary model maintaining the same level of security as the robust $d$-probing model but without the overly conservative treatment of glitches. Leveraging our new model, we prove the security of LMDPL gadgets, a class of physically secure gadgets reported as insecure based on the robust $d$-probing model. We provide manual proofs and automated security evaluations employing an updated version of PROLEAD capable of verifying the security of masked circuits under our new model.
The Insecurity of SHA2 under the Differential Fault Characteristic of Boolean Functions
SHA2 is widely used in various traditional public-key cryptosystems, post-quantum cryptography, personal identification, and network communication protocols. Therefore, ensuring its robust security is of critical importance. Several differential fault attacks based on random word faults have targeted SHA1 and SHACAL-2. However, extending such random word-based fault attacks to SHA2 proves to be much more difficult due to the increased complexity of the Boolean functions in SHA2.
In this paper, assuming random word fault, we identify distinctive differential properties within the Boolean functions of SHA2. Based on these findings, we propose a novel differential fault attack methodology that can be effectively used to recover the final message block and its corresponding initial vector in SHA2, forge HMAC-SHA2 messages, extract the key of SHACAL-2, and extend our analysis to similar algorithms such as SM3. The efficacy of these attacks is validated through rigorous simulations and theoretical deductions, illustrating that they represent a considerable threat to the security of SHA2. In simulations, our approach only requires guessing $T$ bits of a register, where $T$ is at most $5$. Moreover, the probability of successfully recovering a register (excluding the guessed bits) approaches 100\% when introducing 15 faults (in 1000 instances), and the approximate probability is at least 95\% when $T=1$. Consequently, approximately 928 random faults are necessary to successfully execute the attack on the compression function. Furthermore, we discuss potential countermeasures, including verification and infection detection, and propose methods to determine the time and location of fault injection in practical experiments.
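A toy illustration of the kind of differential property such attacks exploit (not the paper's exact characteristic): for SHA2's choice function $Ch(x,y,z) = (x \wedge y) \oplus (\neg x \wedge z)$, a word fault $\Delta z$ on input $z$ produces the output difference $\Delta z \wedge \neg x$, revealing bits of $x$ wherever $\Delta z$ is set. The Python sketch below verifies this, along with the analogous property of the majority function.

```python
import random

M32 = 0xffffffff

def ch(x, y, z):
    # SHA-256 choice function: bitwise "x ? y : z"
    return (x & y) ^ (~x & z & M32)

def maj(x, y, z):
    # SHA-256 majority function
    return (x & y) ^ (x & z) ^ (y & z)

random.seed(0)
x, y, z = (random.getrandbits(32) for _ in range(3))
dz = random.getrandbits(32)                  # random word fault injected into z

diff = ch(x, y, z) ^ ch(x, y, z ^ dz)        # observable output difference of Ch
assert diff == dz & ~x & M32                 # leaks bits of x where dz has a 1

dmaj = maj(x, y, z) ^ maj(x, y, z ^ dz)      # same fault through Maj
assert dmaj == dz & (x ^ y)                  # leaks x XOR y instead
```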
A bound on the quantum value of all compiled nonlocal games
A compiler introduced by Kalai et al. (STOC'23) converts any nonlocal game into an interactive protocol with a single computationally-bounded prover. Although the compiler is known to be sound in the case of classical provers, as well as complete in the quantum case, quantum soundness has so far only been established for special classes of games. In this work, we establish a quantum soundness result for all compiled two-player nonlocal games. In particular, we prove that the quantum commuting operator value of the underlying nonlocal game is an upper bound on the quantum value of the compiled game. Our result employs techniques from operator algebras in a computational and cryptographic setting to establish information-theoretic objects in the asymptotic limit of the security parameter. It further relies on a sequential characterization of quantum commuting operator correlations which may be of independent interest.
Efficient Accelerator for NTT-based Polynomial Multiplication
The Number Theoretic Transform (NTT) is used to efficiently execute polynomial multiplication. It has become an important part of lattice-based post-quantum methods and the next generation of standardized cryptographic systems. However, implementing post-quantum schemes is challenging since they rely on intricate structures. This paper demonstrates how to develop a high-speed NTT multiplier highly optimized for FPGAs with few logical resources. We describe a novel architecture for NTT that leverages unique precomputation. Our method efficiently maps these specific precomputed values into the built-in Block RAMs (BRAMs), which greatly reduces the area and time required for implementation when compared to previous works. We have chosen Kyber parameters to implement the proposed architecture. Compared to the most well-known approach for implementing Kyber's polynomial multiplication using NTT, the time is reduced by 31%, and AT (area × time) is improved by 25% as a result of the precomputation we suggest in this study. It is worth mentioning that we obtained these improvements while our method does not require any DSPs.
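For readers unfamiliar with the underlying operation, here is a minimal Python sketch of NTT-based negacyclic polynomial multiplication in $\mathbb{Z}_q[x]/(x^n+1)$, the computation such accelerators implement. It uses toy parameters ($q = 7681$, $n = 256$, root of unity found by brute force) rather than Kyber's actual $q = 3329$ with its partial NTT, and a simple recursive NTT rather than the paper's hardware architecture.

```python
def find_root(q, order):
    # brute-force an element of exact multiplicative order `order` mod q
    # (order is a power of two dividing q - 1)
    for g in range(2, q):
        x = pow(g, (q - 1) // order, q)
        if pow(x, order // 2, q) != 1:
            return x

def ntt(a, omega, q):
    # recursive radix-2 Cooley-Tukey NTT; omega is a primitive len(a)-th root of unity
    n = len(a)
    if n == 1:
        return list(a)
    even = ntt(a[0::2], omega * omega % q, q)
    odd = ntt(a[1::2], omega * omega % q, q)
    out, w = [0] * n, 1
    for i in range(n // 2):
        t = w * odd[i] % q
        out[i] = (even[i] + t) % q
        out[i + n // 2] = (even[i] - t) % q
        w = w * omega % q
    return out

def negacyclic_mul(f, g, q, psi):
    # multiply f, g in Z_q[x]/(x^n + 1): twist by powers of psi (a primitive
    # 2n-th root of unity), do a cyclic NTT, multiply pointwise, untwist
    n = len(f)
    omega = psi * psi % q
    tw = [pow(psi, i, q) for i in range(n)]
    fh = ntt([a * t % q for a, t in zip(f, tw)], omega, q)
    gh = ntt([b * t % q for b, t in zip(g, tw)], omega, q)
    hh = [x * y % q for x, y in zip(fh, gh)]
    h = ntt(hh, pow(omega, q - 2, q), q)               # inverse NTT (unscaled)
    n_inv, psi_inv = pow(n, q - 2, q), pow(psi, q - 2, q)
    return [h[i] * n_inv % q * pow(psi_inv, i, q) % q for i in range(n)]

q, n = 7681, 256
psi = find_root(q, 2 * n)
f = [1] + [0] * (n - 1)                # f(x) = 1
g = [0] * (n - 1) + [1]                # g(x) = x^(n-1)
assert negacyclic_mul(f, g, q, psi) == g        # 1 * x^(n-1) = x^(n-1)
expected = [0] * n
expected[n - 2] = q - 1                          # x^(2n-2) = -x^(n-2) mod x^n + 1
assert negacyclic_mul(g, g, q, psi) == expected
```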
Traceable Secret Sharing: Strong Security and Efficient Constructions
Suppose Alice uses a $t$-out-of-$n$ secret sharing scheme to store her secret key on $n$ servers. Her secret key is protected as long as $t$ of them do not collude. However, what if a less-than-$t$ subset of the servers decides to offer the shares they have for sale? In this case, Alice should be able to hold them accountable, or else nothing prevents them from selling her shares. With this motivation in mind, Goyal, Song, and Srinivasan (CRYPTO 21) introduced the concept of {\em traceable secret sharing}. In such schemes, it is possible to provably trace leaked secret shares back to the servers who leaked them. Goyal et al. presented the first construction of a traceable secret sharing scheme. However, secret shares in their construction are quadratic in the secret size, and their tracing algorithm is quite involved as it relies on Goldreich-Levin decoding.
In this work, we put forth new definitions and practical constructions for traceable secret sharing. In our model, some $f < t$ servers output a reconstruction box $R$ that may arbitrarily depend on their shares. Given additional $t-f$ shares, $R$ reconstructs and outputs the secret. The task is to trace $R$ back to the corrupted servers given black-box access to $R$. Unlike Goyal et al., we do not assume that the tracing algorithm has any information on how the corrupted servers constructed $R$ from the shares in their possession.
We then present two very efficient constructions of traceable secret sharing based on two classic secret sharing schemes. In both of our schemes, shares are only twice as large as the secret, improving over the quadratic overhead of Goyal et al. Our first scheme is obtained by presenting a new practical tracing algorithm for the widely-used Shamir secret sharing scheme. Our second construction is based on an extension of Blakley's secret sharing scheme. Tracing in this scheme is optimally efficient, and requires just one successful query to $R$. We believe that our constructions are an important step towards bringing traceable secret-sharing schemes to practice. This work also raises several interesting open problems that we describe in the paper.
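For context, the Shamir scheme underlying the first construction is reproduced below as a short Python sketch (sharing and Lagrange reconstruction only; the paper's tracing algorithm is not shown). The field size and names are illustrative.

```python
import random

P = 2**61 - 1  # toy prime field

def share(secret, t, n):
    # t-out-of-n Shamir sharing: random degree-(t-1) polynomial f with f(0) = secret
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    return [(i, sum(c * pow(i, j, P) for j, c in enumerate(coeffs)) % P)
            for i in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation of f at 0 from any t shares
    secret = 0
    for xi, yi in shares:
        num = den = 1
        for xj, _ in shares:
            if xj != xi:
                num = num * -xj % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num % P * pow(den, P - 2, P)) % P
    return secret

shares = share(42, t=3, n=5)
assert reconstruct(shares[:3]) == 42
assert reconstruct(shares[2:5]) == 42
```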
Analyzing and Benchmarking ZK-Rollups
As blockchain technology continues to transform the realm of digital transactions, scalability has emerged as a critical issue. This challenge has spurred the creation of innovative solutions, particularly Layer 2 scalability techniques like rollups. Among these, ZK-Rollups are notable for employing Zero-Knowledge Proofs to facilitate prompt on-chain transaction verification, thereby improving scalability and efficiency without sacrificing security. Nevertheless, the intrinsic complexity of ZK-Rollups has hindered an exhaustive evaluation of their efficiency, economic impact, and performance.
This paper offers a theoretical and empirical examination aimed at comprehending and evaluating ZK-Rollups, with particular attention to ZK-EVMs. We conduct a qualitative analysis to break down the costs linked to ZK-Rollups and scrutinize the design choices of well-known implementations. Confronting the inherent difficulties in benchmarking such intricate systems, we introduce a systematic methodology for their assessment, applying our method to two prominent ZK-Rollups: Polygon zkEVM and zkSync Era. Our research provides initial findings that illuminate trade-offs and areas for enhancement in ZK-Rollup implementations, delivering valuable insights for future research, development, and deployment of these systems.
New Approaches for Estimating the Bias of Differential-Linear Distinguishers (Full Version)
Differential-linear cryptanalysis was introduced by Langford and Hellman in 1994 and has been extensively studied since then. In 2019, Bar-On et al. presented the Differential-Linear Connectivity Table (DLCT), which connects the differential part and the linear part; thus the attacked cipher is divided into three subciphers: the differential part, the DLCT part, and the linear part.
In this paper, we first present an accurate mathematical formula which establishes a relation between differential-linear and truncated differential cryptanalysis. Using the formula, the bias estimate of a differential-linear distinguisher can be converted to the probability calculations of a series of truncated differentials. Then, we propose a novel and natural concept, the TDT, which can be used to accelerate the calculation of the probabilities of truncated differentials. Based on the formula and the TDT, we propose two novel approaches for estimating the bias of a differential-linear distinguisher. We demonstrate the accuracy and efficiency of our new approaches by applying them to 5 symmetric-key primitives: Ascon, Serpent, KNOT, AES, and CLEFIA. For Ascon and Serpent, we update the best known differential-linear distinguishers. For KNOT, AES, and CLEFIA, we give the theoretical differential-linear biases for different rounds for the first time.
AES-based CCR Hash with High Security and Its Application to Zero-Knowledge Proofs
The recent VOLE-based interactive zero-knowledge (VOLE-ZK) protocols, along with non-interactive zero-knowledge (NIZK) proofs based on MPC-in-the-Head (MPCitH) and VOLE-in-the-Head (VOLEitH), extensively utilize commitment schemes, which adopt a circular correlation robust (CCR) hash function as the core primitive. Nevertheless, the state-of-the-art CCR hash construction by Guo et al. (S&P'20), built from random permutations, can only provide 128-bit security when instantiated with AES. This leaves a gap between AES-based CCR hash functions and high security (beyond 128 bits).
In this paper, we fill this gap by constructing a new CCR hash function from AES, supporting three security levels (i.e., 128, 192, and 256 bits). Using the AES-based CCR hash function, we present an all-but-one vector commitment (AVC) scheme, which constitutes a computationally intensive part of the NIZK proofs from MPCitH and VOLEitH, where these NIZK proofs can in turn be transformed into promising post-quantum signature candidates. Furthermore, we obtain an efficient VOLE-ZK protocol with security levels higher than 128 bits from the CCR hash function. Our benchmark results show that the AES-based CCR hash function has performance comparable to CCR hash functions based on Rijndael with larger block sizes, which is not standardized and has a limited application range. In the AVC context, the expensive commitment component instantiated with our AES-based CCR hash function improves the running time by a factor of $7 \sim 30 \times$, compared to the SHA3-based instantiation used in the recent post-quantum signature algorithm FAEST.
Meet-in-the-Middle Attack on 4+4 Rounds of SCARF under Single-Tweak Setting
SCARF, an ultra low-latency tweakable block cipher, is the first cipher designed for cache randomization. Its design differs significantly from other common tweakable block ciphers: the block size is only 10 bits, yet the key size is a whopping $240$ bits. Notably, the majority of the round key in its round function is absorbed into the data path through AND operations, rather than the typical XOR operations.
In this paper, we present a key-recovery attack on a round-reduced version of SCARF with 4 + 4 rounds under the single-tweak setting. Our attack is essentially a Meet-in-the-Middle (MitM) attack, where the matching phase is represented by a system of linear equations. Unlike the cryptanalysis conducted by the designers, our attack is effective under both security requirements they have outlined. The data complexity of our attack is $2^{10}$ plaintexts, with a time complexity of approximately $2^{60.63}$ 4-round SCARF encryptions. It is important to note that our attack does not threaten the overall security of SCARF.
FlexHi: A Flexible Hierarchical Threshold Signature Scheme
Threshold signature schemes have gained prominence in enhancing the security and flexibility of digital signatures, allowing a group of participants to collaboratively create signatures while maintaining a predefined threshold of participants for validity. However, conventional threshold signatures treat all participants equally, lacking the capability to accommodate the hierarchical structures often seen in real-world applications. Hierarchical Threshold Signature Schemes (HTSS) naturally extend the concept of simple threshold signatures, offering a solution that aligns with hierarchical organizational structures. Our paper introduces a novel, efficient, and flexible HTSS that employs independent polynomials at each hierarchical level, removing limitations on threshold values. This adaptability enables us to tailor the scheme to diverse requirements, whether signing requires only top-level nodes or lower-level participants' involvement. Based on our analysis, FlexHi integrated into the FROST scheme outperforms Tassa's hierarchical scheme on FROST, operating approximately 30% to 40% faster, depending on the number of participants and the chosen threshold values. This demonstrates that, in addition to flexibility, our scheme has practical benefits through improved performance.
Garuda and Pari: Faster and Smaller SNARKs via Equifficient Polynomial Commitments
SNARKs are powerful cryptographic primitives that allow a prover to produce a succinct proof of a computation. Two key goals of SNARK research are to minimize the size of the proof and to minimize the time required to generate the proof. In this work, we present new SNARK constructions that push the frontier on both of these goals.
Our first construction, Pari, is a SNARK that achieves the smallest proof size amongst *all* known SNARKs. Specifically, Pari achieves a proof size of just two group elements and two field elements, which, when instantiated with the BLS12-381 curve, totals just 160 bytes, smaller than that of Groth16 [Groth, EUROCRYPT '16] and Polymath [Lipmaa, CRYPTO '24].
Our second construction, Garuda, is a SNARK that reduces proof generation time by supporting, for the first time, arbitrary "custom" gates and *free* linear gates. To demonstrate Garuda's performance, we implement and evaluate it, and show that it provides significant prover-time savings compared to the state-of-the-art SNARKs (Groth16 and HyperPlonk [EUROCRYPT '22]).
Both constructions rely on a new cryptographic primitive: "equifficient" polynomial commitment schemes that enforce that committed polynomials have the same representation in particular bases. We provide both rigorous security definitions for this primitive as well as efficient constructions for univariate and multilinear polynomials.
Cryptographic Security through Kleene’s Theorem and Automata Theory
This study addresses the challenge of strengthening cryptographic security measures in the face of evolving cyber threats. The aim is to apply Kleene's Theorem and automata theory to improve the modeling and analysis of cybersecurity scenarios, focusing on the CyberMoraba game. Representing the game's strategic moves as regular expressions and mapping them onto finite automata provides a solid framework for understanding the interactions between attackers and defenders. This approach helps in identifying optimal strategies and predicting potential outcomes, which contributes to the development of stronger cryptographic security protocols. The research advances the theoretical use of automata theory in cybersecurity while offering practical insights into enhancing defense mechanisms against complex cyber attacks. This work connects theoretical computer science with practical cybersecurity, demonstrating the importance of automata theory in cryptology.
Hekaton: Horizontally-Scalable zkSNARKs via Proof Aggregation
Zero-knowledge Succinct Non-interactive ARguments of Knowledge (zkSNARKs) allow a prover to convince a verifier of the correct execution of a large computation in a private and easily verifiable manner. These properties make zkSNARKs a powerful tool for adding accountability, scalability, and privacy to numerous systems such as blockchains and verifiable key directories. Unfortunately, existing zkSNARKs are unable to scale to large computations due to the time and space complexity requirements of the prover algorithm. As a result, they cannot handle real-world instances of the aforementioned applications.
In this work, we introduce Hekaton, a zkSNARK that overcomes these barriers and can efficiently handle arbitrarily large computations. We construct Hekaton via a new "distribute-and-aggregate" framework that breaks up large computations into small chunks, proves these chunks in parallel in a distributed system, and then aggregates the resulting chunk proofs into a single succinct proof. Underlying this framework is a new technique for efficiently handling data that is shared between chunks that we believe could be of independent interest.
We implement a distributed prover for Hekaton, and evaluate its performance on a compute cluster. Our experiments show that Hekaton achieves strong horizontal scalability (proving time decreases linearly as we increase the number of nodes in the cluster), and is able to prove large computations quickly: it can prove computations of size $2^{35}$ gates in under an hour, which is much faster than prior work.
Finally, we also apply Hekaton to two applications of real-world interest: proofs of batched insertion for a verifiable key directory and proving correctness of RAM computations. In both cases, Hekaton is able to scale to handle realistic workloads with better efficiency than prior work.
Chrysalis Cipher Suite
The formal verification of architectural strength in terms of computational complexity is achieved through reduction of the Non-Commutative Grothendieck problem in the form of a quadratic lattice. This multivariate form relies on equivalences derived from a k-clique problem within a multigraph. The proposed scheme reduces the k-clique problem as an input function, resulting in the generation of a quadratic used as parameters for the lattice. By Grothendieck's inequality, the satisfiability of lattice constraints in terms of NP-Hard and NP-Complete bounds is provably congruent to a closest vector problem in the lattice. The base vectors of the resulting lattice are treated as a holomorphic vector bundle. From the resulting bilinear matrices, the tight hardness reduction of the closest vector problem as the shortest vector problem is introduced within the system. The derivation of the closest vector problem requires that the lattice is necessarily generated by a <0|1>-Matrix expressed as a quadratic. This vector bundle is denoted as the unit ball with congruent topology to the Riemann sphere, symbolized as 𝒪. For the Grothendieck constraints, the relative vector norms necessarily result in satisfaction of NP-Hard requirements for shortest vector problems in the lattice.
Aggregating Falcon Signatures with LaBRADOR
Several prior works have suggested using non-interactive arguments of knowledge with short proofs to aggregate signatures of Falcon, which is among the first post-quantum signatures selected for standardization by NIST. In particular LaBRADOR, based on standard structured lattice assumptions and published at CRYPTO'23, seems promising for realizing this task. However, no prior work has tackled this idea in a rigorous way. In this paper, we thoroughly prove how to aggregate Falcon signatures using LaBRADOR. We start by providing the first complete knowledge soundness analysis for the non-interactive version of LaBRADOR. Here, the multi-round and recursive nature of LaBRADOR requires a complex and thorough analysis. For this purpose, we introduce the notion of predicate special soundness (PSS). This is a general framework for evaluating the knowledge error of complex Fiat-Shamir arguments of knowledge protocols in a modular fashion, which we believe to be of independent interest. We then explain the exact steps to take in order to adapt the non-interactive LaBRADOR proof system for aggregating Falcon signatures and provide concrete proof size estimates. Additionally, we formalize the folklore approach of obtaining aggregate signatures from the class of hash-then-sign signatures through arguments of knowledge.
Information-Theoretic Topology-Hiding Broadcast: Wheels, Stars, Friendship, and Beyond
Topology-hiding broadcast (THB) enables parties communicating over an incomplete network to broadcast messages while hiding the network topology from within a given class of graphs. Although broadcast is a privacy-free task, it is known that THB for certain graph classes necessitates computational assumptions, even against "honest but curious" adversaries, and even given a single corrupted party. Recent works have tried to understand when THB can be obtained with information-theoretic (IT) security (without cryptography or setup assumptions) as a function of properties of the corresponding graph class.
We revisit this question through a case study of the class of wheel graphs and their subgraphs. The $n$'th wheel graph is formed by connecting $n$ nodes arranged in a cycle to an additional "center" node, thus providing a natural extension that captures and enriches previously studied graph classes in the setting of IT-THB.
We present a series of new findings in this line.
We fully characterize feasibility of IT-THB for any class of subgraphs of the wheel, each possessing an embedded star (i.e., a well-defined center connected to all other nodes). Our characterization provides evidence that IT-THB feasibility may correlate with a more fine-grained degree structure, as opposed to pure connectivity, of the corresponding graphs.
We provide positive results achieving perfect IT-THB for new graph classes, including ones where the number of nodes is unknown. Further, we provide the first feasibility of IT-THB on non-degenerate graph classes with $t>1$ corruptions, for the class of friendship graphs (Erdős, Rényi, Sós '66).
Safe curves for elliptic-curve cryptography
This paper surveys interactions between choices of elliptic curves and the security of elliptic-curve cryptography. Attacks considered include not just discrete-logarithm computations but also attacks exploiting common implementation pitfalls.
A Security Analysis of Two Classes of RSA-like Cryptosystems
Let $N=pq$ be the product of two balanced prime numbers $p$ and $q$. In 2002, Elkamchouchi, Elshenawy and Shaban introduced an RSA-like cryptosystem that uses the key equation $ed - k (p^2-1)(q^2-1) = 1$, instead of the classical RSA key equation $ed - k (p-1)(q-1) = 1$. Another variant of RSA, presented in 2017 by Murru and Saettone, uses the key equation $ed - k (p^2+p+1)(q^2+q+1) = 1$.
Despite the authors' claims of enhanced security, both schemes remain vulnerable to adaptations of common RSA attacks. Let $n$ be an integer. This paper proposes two families of RSA-like encryption schemes: one employs the key equation $ed - k (p^n-1)(q^n-1) = 1$ for $n > 0$, while the other uses $ed - k [(p^n-1)(q^n-1)]/[(p-1)(q-1)] = 1$ for $n > 1$. Note that we remove the conventional assumption of primes having equal bit sizes. In this scenario, we show that regardless of the choice of $n$, continued fraction-based attacks can still recover the secret exponent. Additionally, this work fills a gap in the literature by establishing an equivalent of Wiener's attack when the primes do not have the same bit size.
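For intuition, the classical continued-fraction (Wiener) attack that these results generalize can be sketched in a few lines of Python; the generalized key equations above are attacked analogously, with $(p-1)(q-1)$ replaced by the corresponding quantity. Wiener's original toy example serves as the test.

```python
from math import isqrt

def convergents(a, b):
    # continued-fraction convergents p_i / q_i of a / b
    p0, p1, q0, q1 = 0, 1, 1, 0
    while b:
        c = a // b
        a, b = b, a - c * b
        p0, p1 = p1, c * p1 + p0
        q0, q1 = q1, c * q1 + q0
        yield p1, q1

def wiener(e, n):
    # classical Wiener attack: if d < n^(1/4)/3, some convergent k/d of e/n
    # satisfies e*d - k*phi(n) = 1; recover phi and check that p, q are integral
    for k, d in convergents(e, n):
        if k == 0 or (e * d - 1) % k:
            continue
        phi = (e * d - 1) // k
        s = n - phi + 1                    # p + q
        disc = s * s - 4 * n               # (p - q)^2
        if disc >= 0:
            r = isqrt(disc)
            if r * r == disc and (s + r) % 2 == 0:
                return d
    return None

# Wiener's textbook example: n = 379 * 239, e = 17993, secret exponent d = 5
assert wiener(17993, 90581) == 5
```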
Dilithium-Based Verifiable Timed Signature Scheme
Verifiable Timed Signatures (VTS) are cryptographic constructs that enable obtaining a signature at a specific time in the future and provide evidence that the signature is legitimate. This framework particularly finds utility in applications such as payment channel networks, multiparty signing operations, or multiparty computation, especially within blockchain architectures. Currently, VTS schemes are based on signature algorithms such as BLS, Schnorr, and ECDSA. These signature algorithms are considered insecure against quantum attacks due to the effect of Shor's algorithm on the discrete logarithm problem. We present a new VTS scheme called VT-Dilithium, based on the CRYSTALS-Dilithium digital signature algorithm, which has been selected as NIST's quantum-resistant digital signature standard and is considered secure against both classical and quantum attacks. Integrating Dilithium into the VTS scheme is a more challenging problem due to its complex mathematical operations (i.e., polynomial multiplications and rounding operations) and large module parameters such as polynomials, polynomial vectors, and matrices. This work aims to provide a comprehensive exposition of the VT-Dilithium scheme.
VOLE-PSI: Fast OPRF and Circuit-PSI from Vector-OLE
In this work we present a new construction for a batched Oblivious Pseudorandom Function (OPRF) based on Vector-OLE and the PaXoS data structure. We then use it in the standard transformation for achieving Private Set Intersection (PSI) from an OPRF. Our overall construction is highly efficient with $O(n)$ communication and computation. We demonstrate that our protocol can achieve malicious security at only a very small overhead compared to the semi-honest variant. For input sizes $n = 2^{20}$, our malicious protocol needs 6.2 seconds and less than 59 MB communication. This corresponds to under 450 bits per element, which is the lowest number for any published PSI protocol (semi-honest or malicious) to date. Moreover, in theory our semi-honest (resp. malicious) protocol can achieve as low as 219 (resp. 260) bits per element for $n=2^{20}$ at the added cost of interpolating a polynomial over $n$ elements.
As a second contribution, we present an extension where the output of the PSI is secret-shared between the two parties. This functionality is generally referred to as Circuit-PSI. It allows the parties to perform a subsequent MPC protocol on the secret-shared outputs, e.g., train a machine learning model. Our circuit PSI protocol builds on our OPRF construction along with another application of the PaXoS data structure. It achieves semi-honest security and allows for a highly efficient implementation, up to 3x faster than previous work.
FuLeakage: Breaking FuLeeca by Learning Attacks
FuLeeca is a signature scheme submitted to the recent NIST call for additional signatures. It is an efficient hash-and-sign scheme based on quasi-cyclic codes in the Lee metric and resembles the lattice-based signature Falcon. FuLeeca proposes a so-called concentration step within the signing procedure to avoid leakage of secret-key information from the signatures. However, FuLeeca is still vulnerable to learning attacks, which were first observed for lattice-based schemes. We present three full key-recovery attacks by exploiting the proximity of the code-based FuLeeca scheme to lattice-based primitives.
More precisely, we use a few signatures to extract an $n/2$-dimensional circulant sublattice from the given length-$n$ code, that still contains the exceptionally short secret-key vector. This significantly reduces the classical attack cost and, in addition, leads to a full key recovery in quantum-polynomial time. Furthermore, we exploit a bias in the concentration procedure to classically recover the full key for any security level with at most 175,000 signatures in less than an hour.
Non-interactive VSS using Class Groups and Application to DKG
We put forward a non-interactive verifiable secret sharing (NI-VSS) scheme using class groups – we call it cgVSS. Our construction follows the standard framework of encrypting the shares to a set of recipients and generating a non-interactive proof of correct sharing. However, as opposed to prior works, such as Groth’s [Eprint 2021] or Gentry et al.’s [Eurocrypt 2022], we do not require any range proof - this is possible due to the unique structure of class groups, which enables efficient encryption/decryption of large field elements in the exponent of an ElGamal-style encryption scheme. Importantly, this is possible without destroying the additively homomorphic structure, which is required to make the proof of correctness highly efficient. This approach not only substantially simplifies the NI-VSS process, but also significantly outperforms the state-of-the-art schemes. For example, our implementation shows that for a 150-node system cgVSS outperforms (a simplified implementation of) Groth’s protocol by 5.6x in overall communication complexity, about 9.3–9.7x in dealer time, and 2.4–2.7x in receiver time per node.
Additionally, we formalize the notion of public verifiability, which enables anyone, possibly outside the participants, to verify the correctness of the dealing. In fact, we re-interpret the notion of public verifiability and extend it to the setting where potentially all recipients may be corrupt and yet cannot defy public verifiability – to distinguish this from the state of the art, we call it strong public verifiability. Our formalization uses the universal composability framework.
Finally, through a generic transformation, we obtain a non-interactive distributed key generation (NI-DKG) scheme for threshold systems, where the secret key is the discrete log of the public key. Our security analysis in the VSS-hybrid model uses a formalization that considers a (strong) public verifiability notion for DKG, even when more than threshold parties are corrupt. Instantiating with cgVSS we obtain a NI-DKG scheme from class groups – we call it cgDKG.
Concrete Analysis of Schnorr-type Signatures with Aborts
Lyubashevsky’s signature can be viewed as a lattice-based adaptation of the Schnorr signature, with the core difference being the use of aborts during the signature generation process. Since the proposal of Lyubashevsky’s signature, a number of other variants of Schnorr-type signatures with aborts have been proposed, in both the lattice-based and the code-based settings. In this paper, we examine the security of Schnorr-type signature schemes with aborts. We give a detailed analysis of when the expected value of the signature is correlated with the secret key, and when it is not. Our analysis shows that even when the abort condition is employed, it is crucial to set the parameters carefully in order to defend against statistical attacks. In particular, we recommend setting $\delta \geq \beta$ (where $\delta, \beta$ are public parameters), as in this case we prove that the signature does not reveal any information about the secret key. On the other hand, if this condition is not satisfied, then some information about the secret key is leaked, making the scheme susceptible to statistical attacks. For completeness, we also analyze the security of Schnorr-type signatures without aborts. In particular, we present a detailed key-recovery attack via statistical methods on the EagleSign signature, one of the submissions to the NIST call for additional PQC signatures. Moreover, we give a formula for determining the number of signatures required to successfully launch the statistical attack.
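To illustrate the role of the abort condition, the following toy Monte Carlo (a sketch with illustrative parameters, not the paper's scheme) models a response $z = y + c\cdot s$ with $y$ uniform in $[-\delta, \delta]$: rejecting whenever $|z| > \delta - \beta$ leaves the accepted $z$ independent of the secret term, while skipping the abort leaves a mean that reveals $c\cdot s$.

```python
# Toy Monte Carlo with illustrative parameters (not the paper's scheme):
# z = y + c*s with y uniform in [-delta, delta]; rejecting |z| > delta-beta
# makes accepted z independent of the secret term, while skipping the
# abort leaves a mean that reveals c*s.
import random

def sample_mean(secret_term, delta, beta, abort, trials=100_000):
    acc = []
    while len(acc) < trials:
        z = random.randint(-delta, delta) + secret_term
        if abort and abs(z) > delta - beta:
            continue                       # the abort (rejection) step
        acc.append(z)
    return sum(acc) / len(acc)

delta, beta, cs = 1000, 50, 37             # |c*s| = 37 <= beta
print(sample_mean(cs, delta, beta, abort=True))    # ~0: no leakage
print(sample_mean(cs, delta, beta, abort=False))   # ~37: mean leaks c*s
```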
Non-Interactive Zero-Knowledge from LPN and MQ
We give the first construction of non-interactive zero-knowledge (NIZK) arguments from post-quantum assumptions other than Learning with Errors. In particular, we achieve NIZK under the polynomial hardness of the Learning Parity with Noise (LPN) assumption, and the exponential hardness of solving random under-determined multivariate quadratic equations (MQ). We also construct NIZK satisfying statistical zero-knowledge assuming a new variant of LPN, Dense-Sparse LPN, introduced by Dao and Jain (CRYPTO 2024), together with exponentially-hard MQ.
The main technical ingredient of our construction is an extremely natural (but only in hindsight!) construction of correlation-intractable (CI) hash functions from MQ, for a NIZK-friendly sub-class of constant-degree polynomials that we call concatenated constant-degree polynomials. Under exponential security, this hash function also satisfies the stronger notion of approximate CI for concatenated constant-degree polynomials. The NIZK construction then follows from a prior blueprint of Brakerski-Koppula-Mour (CRYPTO 2020). In addition, we show how to construct (approximate) CI hashing for degree-$d$ functions from the (exponential) hardness of solving random degree-$d$ equations, a natural generalization of MQ. To realize NIZK with statistical zero-knowledge, we design a lossy public-key encryption scheme with approximate linear decryption and inverse-polynomial decryption error from Dense-Sparse LPN. These constructions may be of independent interest.
Our work therefore gives a new way to leverage MQ with uniformly random equations, which has found few cryptographic applications to date. Indeed, most applications in the context of encryption and signature schemes make use of structured variants of MQ, where the polynomials are not truly random but possess a hidden planted structure. We believe that the MQ assumption may plausibly find future use in designing other advanced proof systems.
FELIX (XGCD for FALCON): FPGA-based Scalable and Lightweight Accelerator for Large Integer Extended GCD
The Extended Greatest Common Divisor (XGCD) computation is a critical component in various cryptographic applications and algorithms, including both pre- and post-quantum cryptosystems. In addition to computing the greatest common divisor (GCD) of two integers, the XGCD also produces Bezout coefficients $b_a$ and $b_b$ which satisfy $\mathrm{GCD}(a,b) = a\times b_a + b\times b_b$. In particular, computing the XGCD for large integers is of significant interest. Most recently, XGCD computation between 6,479-bit integers is required for solving $N$-th degree Truncated polynomial Ring Unit (NTRU) trapdoors in Falcon, a National Institute of Standards and Technology (NIST)-selected Post-Quantum digital signature scheme. To this point, existing literature has primarily focused on exploring software-based implementations for XGCD. The few existing high-performance hardware architectures require significant hardware resources and may not be desirable for practical usage, and the lightweight architectures suffer from poor performance. To fill the research gap, this work proposes a novel FPGA-based scalablE and Lightweight accelerator for large Integer XGCD (FELIX). First, a new algorithm suitable for scalable and lightweight computation of XGCD is proposed. Next, a hardware accelerator (FELIX) is presented, including both constant- and variable-time versions. Finally, a thorough evaluation is carried out to showcase the efficiency of the proposed FELIX. In certain configurations, FELIX involves 81% less equivalent area-time product (eATP) than the state-of-the-art design for 1,024-bit integers, and achieves a 95% reduction in latency over the software for 6,479-bit integers (Falcon parameter set) with reasonable resource usage. Overall, the proposed FELIX is highly efficient, scalable, lightweight, and suitable for very large integer computation, making it the first such XGCD accelerator in the literature (to the best of our knowledge).
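For reference, the following is a textbook iterative XGCD returning the Bezout coefficients from the identity above; it is only a functional sketch and does not attempt to reproduce FELIX's hardware-oriented algorithm.

```python
# A textbook iterative extended GCD, shown for reference only; FELIX
# implements a hardware-friendly variant for multi-thousand-bit inputs,
# which this sketch does not attempt to reproduce.

def xgcd(a: int, b: int):
    """Return (g, b_a, b_b) with g = GCD(a, b) = a*b_a + b*b_b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

g, ba, bb = xgcd(240, 46)
assert g == 240 * ba + 46 * bb   # the Bezout identity from the abstract
```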
Legendre Sequences are Pseudorandom under the Quadratic-Residuosity Assumption
The Legendre sequence of an integer $x$ modulo a prime $p$ with respect to offsets $\vec a = (a_1, \dots, a_\ell)$ is the string of Legendre symbols $(\frac{x+a_1}{p}), \dots, (\frac{x+a_\ell}{p})$. Under the quadratic-residuosity assumption, we show that the function that maps the pair $(x,p)$ to the Legendre sequence of $x$ modulo $p$, with respect to public random offsets $\vec a$, is a pseudorandom generator. This answers an open question of Damgård (CRYPTO 1988), up to the choice of the offsets $\vec a$.
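A minimal sketch of the map in question, using toy parameters (a real instantiation would use a cryptographically large prime $p$ and random public offsets):

```python
# Toy parameters for illustration; a real instantiation would use a
# cryptographically large prime p and random public offsets.

def legendre_symbol(a: int, p: int) -> int:
    """Return (a/p): 1 for a nonzero square mod p, -1 for a non-square, 0 if p | a."""
    ls = pow(a % p, (p - 1) // 2, p)     # Euler's criterion
    return -1 if ls == p - 1 else ls

def legendre_sequence(x: int, p: int, offsets):
    """The string of symbols ((x+a_1)/p), ..., ((x+a_l)/p) from the abstract."""
    return [legendre_symbol(x + a, p) for a in offsets]

print(legendre_sequence(7, 101, [1, 2, 3, 5, 8]))
```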
Complete Knowledge: Preventing Encumbrance of Cryptographic Secrets
Most cryptographic protocols model a player’s knowledge of secrets in a simple way. Informally, the player knows a secret in the sense that she can directly furnish it as a (private) input to a protocol, e.g., to digitally sign a message.
The growing availability of Trusted Execution Environments (TEEs) and secure multiparty computation, however, undermines this model of knowledge. Such tools can encumber a secret sk and permit a chosen player to access sk conditionally, without actually knowing sk. By permitting selective access to sk by an adversary, encumbrance of secrets can enable vote-selling in cryptographic voting schemes, illegal sale of credentials for online services, and erosion of deniability in anonymous messaging systems.
Unfortunately, existing proof-of-knowledge protocols fail to demonstrate that a secret is unencumbered. We therefore introduce and formalize a new notion called complete knowledge (CK). A proof (or argument) of CK shows that a prover does not just know a secret, but also has fully unencumbered knowledge, i.e., unrestricted ability to use the secret.
We introduce two practical CK schemes that use special-purpose hardware, specifically TEEs and off-the-shelf mining ASICs. We prove the security of these schemes and explore their practical deployment with a complete, end-to-end prototype with smart-contract verification that supports both. We show how CK can address encumbrance attacks identified in previous work. Finally, we introduce two new applications enabled by CK that involve proving ownership of blockchain assets.
A Simple and Generic Approach to Dynamic Collusion Model
Functional Encryption (FE) is a powerful notion of encryption which enables computations and partial message recovery of encrypted data. In FE, each decryption key is associated with a function $f$ such that decryption recovers the function evaluation $f(m)$ from an encryption of $m$. Informally, security states that a user with access to function keys $sk_{f_1}, sk_{f_2}, \ldots$ (and so on) can only learn $f_1(m), f_2(m), \ldots$ (and so on) but nothing more about the message. The system is said to be $q$-bounded collusion resistant if the security holds as long as an adversary gets access to at most $q = q(\lambda)$ decryption keys. In the last decade, numerous works have proposed many FE constructions from a wide array of algebraic and general cryptographic assumptions, and proved their security in the bounded collusion model.
However, until very recently, all these works studied bounded collusion resistance in a ``static model", where the collusion bound $q$ was a global system parameter. While the static collusion model led to great research progress in the community, it has many major drawbacks. Very recently, Agrawal et al. (Crypto 2021) and Garg et al. (Eurocrypt 2022) independently introduced the dynamic model for bounded collusion resistance, where the collusion bound $q$ was a fluid parameter that was not globally set but only chosen by each encryptor. The dynamic collusion model enabled harnessing the many virtues of the static collusion model, while avoiding its various drawbacks.
In this work, we give a simple and generic approach to upgrade any scheme from the static collusion model to the dynamic collusion model. Our result captures all existing results in the dynamic model in the form of a single unified framework, and also gives new results as simple corollaries with a lot more potential in the future. An interesting artifact of our result is that it gives a generic way to match existing lower bounds in functional encryption.
Beyond the Whitepaper: Where BFT Consensus Protocols Meet Reality
This paper presents a collection of lessons learned from analyzing the real-world security of various Byzantine Fault Tolerant (BFT) consensus protocol implementations. Drawing upon our experience as a team of security experts who have both developed and audited BFT systems, including BA★, HotStuff variants, Paxos variants, and DAG-based algorithms like Narwhal and Bullshark, we identify and analyze a variety of security vulnerabilities discovered in the translation of theoretical protocols into real-world code. Our analysis covers a range of issues, including subtle logic errors, concurrency bugs, cryptographic vulnerabilities, and mismatches between the theoretical model and the implementation. We provide detailed case studies illustrating these vulnerabilities, discuss their potential impact, and propose mitigation strategies. This work aims to provide valuable insights for both designers and implementers of BFT consensus protocols, ultimately contributing to the development of more secure and reliable distributed systems.
File-Injection Attacks on Searchable Encryption, Based on Binomial Structures
One distinguishing feature of file-injection attacks on searchable encryption schemes is their 100% query recovery rate, i.e., confirming the corresponding keyword for each query. The main efficiency consideration of file-injection attacks is the number of injected files. In the work of Zhang et al. (USENIX 2016), $\lceil \log_2 |K| \rceil$ injected files are required, each of which contains $|K|/2$ keywords from the keyword set $K$. Based on the construction of the uniform $(s,n)$-set, Wang et al. need fewer injected files when considering the threshold countermeasure. In this work, we propose a new attack that further reduces the number of injected files; to achieve the same results, Wang et al. need up to 38% more injections. The attack is based on an increment $(s,n)$-set, which is also defined in this paper.
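The binary-encoding idea behind the Zhang et al. baseline can be sketched in a few lines (toy keyword set; the paper's increment $(s,n)$-set construction, which improves on this, is not shown):

```python
# Sketch of the binary-encoding baseline of Zhang et al. (USENIX 2016):
# file i contains every keyword whose index has bit i set, so the subset
# of injected files matching a query spells out the keyword's index in
# binary. The paper's increment (s,n)-set refinement is not shown.
import math

def build_injected_files(keywords):
    m = math.ceil(math.log2(len(keywords)))          # ceil(log2 |K|) files
    return [{kw for idx, kw in enumerate(keywords) if (idx >> i) & 1}
            for i in range(m)]                       # each holds |K|/2 keywords

def recover_keyword(keywords, matches):
    """matches[i] is True iff injected file i matched the observed query."""
    idx = sum(1 << i for i, hit in enumerate(matches) if hit)
    return keywords[idx]

K = [f"kw{j}" for j in range(8)]
files = build_injected_files(K)                      # 3 files of 4 keywords
hits = [("kw5" in f) for f in files]                 # observed access pattern
assert recover_keyword(K, hits) == "kw5"             # 100% query recovery
```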
Solving McEliece-1409 in One Day --- Cryptanalysis with the Improved BJMM Algorithm
The syndrome decoding problem (SDP) is the security assumption underlying code-based cryptography. Three of the four NIST-PQC round 4 candidates are code-based. Information set decoding (ISD) is the fastest known class of algorithms for solving SDP instances with relatively high code rate. The security of code-based cryptography is often based on the asymptotic complexity of the ISD algorithm; however, its concrete complexity has hardly been studied. Recently, Esser, May and Zweydinger (Eurocrypt '22) provided the first implementation of representation-based ISD, such as the May--Meurer--Thomae (MMT) and Becker--Joux--May--Meurer (BJMM) algorithms, and solved the McEliece-1284 instance in the decoding challenge, revealing the practical efficiency of these ISDs.
In this work, we propose a practically fast depth-2 BJMM algorithm and provide the first publicly available GPU implementation. We solve the McEliece-1409 instance for the first time and present concrete analysis for the record. Cryptanalysis for NIST-PQC round 4 code-based candidates against the improved BJMM algorithm is also conducted. In addition, we revise the asymptotic space complexity of the time-memory trade-off MMT algorithm presented by Esser and Zweydinger (Eurocrypt '23) from $2^{0.375n}$ to $2^{0.376n}$.
Uncovering Impact of Mental Models towards Adoption of Multi-device Crypto-Wallets
Cryptocurrency users saw a sharp increase in different types of crypto wallets in the past decade. However, the emerging multi-device (threshold) wallets, even with improved security guarantees over their single-device counterparts, are yet to receive proportionate adoption. This work presents a data-driven investigation into users' perceptions of multi-device/threshold wallets, using a survey of 357 crypto-wallet users. Our results revealed two significant groups among our participants—Newbies and Non-newbies. Our follow-up qualitative analysis, conducted after educating participants, revealed a gap between these participants' mental models and the actual security guarantees. Furthermore, we investigated preferred default settings for crypto-wallets across different key-share distribution settings of multi-device wallets—threat-model considerations affected user preferences, signifying a need for contextualizing default settings. We identify concrete, actionable design avenues for future multi-device wallet designs and present novel cryptographic problems to realize them.
EMI Shielding for Use in Side-Channel Security: Analysis, Simulation and Measurements
Considering side-channel analysis (SCA) security for cryptographic devices, the mitigation of electromagnetic leakage and electromagnetic interference (EMI) between modules poses significant challenges. This paper presents a comprehensive review and deep analysis of the utilization of EMI shielding materials, originally devised for reliability purposes and standards such as EMI/EMC, as a countermeasure to enhance EM-SCA security. We survey the current landscape of EMI-shielding materials, including conductive polymers, metal foams, carbon-based materials, and meta-materials, evaluating their effectiveness in attenuating emissions and preventing information leakage, for the first time with security-centric metrics for such materials. Through a systematic examination of the existing literature, experimental studies, and the construction of a fully simulatable EM environment in an ANSYS solver, we identify key factors influencing the security characteristics of EMI-shielding materials, such as shielding effectiveness (SE), bandwidth, thickness, and material properties.
We devise a connection between SE and cryptographic SNR, and we demonstrate with real hardware measurements how, and under what conditions, such materials can provide very high security levels. By synthesizing insights from multidisciplinary research domains, this paper aims to provide valuable guidance for researchers, engineers, and practitioners in the design and deployment of robust side-channel security measures leveraging EMI shields that are already deployed to meet reliability standards.
EagleSignV3 : A new secure variant of EagleSign signature over lattices
With the potential arrival of quantum computers, it is essential to build cryptosystems resistant to attackers with the computing power of a quantum computer. With Shor's algorithm, cryptosystems based on discrete logarithms and factorization become obsolete. This is why NIST launched two competitions, in 2016 and 2023, to standardize post-quantum cryptosystems (such as KEMs and signatures) based on problems believed to resist attacks by quantum computers. EagleSign was proposed to the NIST competition in June 2023 as an additional signature. An improvement called EagleSign-V2 was proposed in December 2023, but Tibouchi and Pells proved that these two variants do not satisfy the zero-knowledge property. In this document we present a family of lattice-based post-quantum signatures called EagleSignV3. They are secure and efficient successors of both EagleSign-V1 (NIST, June 2023) and EagleSign-V2 (NIST forum, December 2023). The public key of EagleSignV3 is based on a mix of MLE (Module Learning with Errors) and MNTRU (the module variant of the famous NTRU problem). The instantiations of EagleSignV3 are new variants of the EagleSign signature family posted to the NIST competition in June 2023 as additional signatures. EagleSignV3 uses the rejection sampling of Lyubashevsky (2012) to achieve the zero-knowledge property. The main difference between EagleSign and Dilithium is the public key.
We have two instantiations, based either on rings or on modules. The sizes of the ring-based variant of EagleSignV3 are close to those of Dilithium, but the sizes of its module-based instantiation are bigger than those of Dilithium.
NB: The implementation of EagleSign-V1 is available on the NIST website, and that of EagleSign-V2 can be found on GitHub at https://github.com/EagleSignteam/EagleSign_v2 and in the NIST forum, as a comment on improvements to EagleSign, in December 2023. The implementation of EagleSign-V3 can be deduced from that of EagleSign-V2.
Impossibilities in Succinct Arguments: Black-box Extraction and More
The celebrated result by Gentry and Wichs established a theoretical barrier for succinct non-interactive arguments (SNARGs), showing that for (expressive enough) hard-on-average languages, we must rely on non-falsifiable assumptions.
We further investigate those barriers by showing new negative and positive results related to the proof size.
1. We start by formalizing a folklore lower bound for the proof size of black-box extractable arguments based on the hardness of the language. This separates knowledge-sound SNARGs (SNARKs) in the random oracle model (that can have black-box extraction) and those in the standard model.
2. We find a positive result in the non-adaptive setting. Under the existence of non-adaptively sound SNARGs (without extractability) and from standard assumptions, it is possible to build SNARKs with black-box extractability for a non-trivial subset of NP.
3. On the other hand, we show that (under some mild assumptions) all NP languages cannot have SNARKs with black-box extractability even in the non-adaptive setting.
4. The Gentry-Wichs result does not account for the preprocessing model, under which several efficient constructions fall. We show that, in the preprocessing model as well, it is impossible to construct SNARGs that rely on falsifiable assumptions in a black-box way.
Along the way, we identify a class of non-trivial languages, which we dub “trapdoor languages”, that bypass some of these impossibility results.
Revisiting the Security of Approximate FHE with Noise-Flooding Countermeasures
Approximate fully homomorphic encryption (FHE) schemes, such as the CKKS scheme (Cheon, Kim, Kim, Song, ASIACRYPT '17), are among the leading schemes in terms of efficiency and are particularly suitable for Machine Learning (ML) tasks. Although efficient, approximate FHE schemes have some inherent risks: Li and Micciancio (EUROCRYPT '21) demonstrated that while these schemes achieved the standard notion of CPA-security, they failed against a variant, $\mathsf{IND}\mbox{-}\mathsf{CPA}^D$, in which the adversary is given limited access to the decryption oracle. Subsequently, Li, Micciancio, Schultz, and Sorrell (CRYPTO '22) proved that with noise-flooding countermeasures which add Gaussian noise of sufficiently high variance before outputting the decrypted value, the CKKS scheme is secure. However, the variance required for provable security is very high, inducing a large loss in message precision.
In this work, we consider a broad class of attacks on CKKS with noise-flooding countermeasures, which we call ``semi-honest'' attacks, in which an adversary may submit only correctly distributed ciphertexts to the decryption oracle. The ciphertexts submitted for decryption can be fresh ciphertexts, or can be ciphertexts resulting from the homomorphic evaluation of some circuit on fresh and independent ciphertexts. Our motivation is to model an internal threat scenario where an adversary can passively access the internal randomness of the system.
We analyze the concrete security of CKKS with various levels of noise-flooding in the face of such attacks. The aim of this work is to outline and precisely quantify the various trade-offs between the number of allowed decryptions before refreshing the keys, noise-flooding levels, and the concrete security of the scheme after a number of decryptions have been observed by the adversary.
Due to the large dimension and modulus in typical FHE parameter sets, previous techniques even for \emph{estimating} the concrete runtime of such attacks -- such as those in (Dachman-Soled, Ducas, Gong, Rossi, CRYPTO '20) -- become computationally infeasible, since they involve high dimensional and high precision matrix multiplication and inversion. We therefore develop new techniques that allow us to perform fast security estimation, even for FHE-size parameter sets.
AutoHoG: Automating Homomorphic Gate Design for Large-Scale Logic Circuit Evaluation
Recently, an emerging branch of research in the field of fully homomorphic encryption (FHE) has attracted growing attention, in which optimizations are carried out to develop fast and efficient homomorphic logic circuits. While existing works have pointed out that compound homomorphic gates can be constructed without incurring significant computational overheads, the exact theory and mechanism of homomorphic gate design have not yet been explored. In this work, we propose AutoHoG, an automated procedure for the generation of compound gates over FHE. We show that by formalizing the gate generation procedure, we can adopt a match-and-replace strategy to significantly improve the evaluation speed of logic circuits over FHE. In the experiment, we first show the effectiveness of AutoHoG through a set of benchmark gates. We then apply AutoHoG to optimize common Boolean tasks, including adders, multipliers, the ISCAS’85 benchmark circuits, and the ISCAS’89 benchmark circuits. We show that for various circuit benchmarks, we can achieve up to a 5.7x reduction in computational latency compared to state-of-the-art implementations of logic circuits using conventional gates.
Koala: A Low-Latency Pseudorandom Function
This paper introduces the Koala PRF, which maps a variable-length sequence of $64$-bit input blocks to a single $257$-bit output block.
Its design focuses on achieving low latency in its implementation in ASIC.
To construct Koala, we instantiate the recently introduced Kirby construction with the Koala-P permutation and add an input encoding layer.
The Koala-P permutation is obtained as the $8$-fold iteration of a simple round function inspired by that of Subterranean.
Based on careful preliminary cryptanalysis, we made a variant of the Subterranean permutation by reordering and modifying it in a way that does not introduce any implementation overhead and enhances the cryptographic resistance of the resulting PRF.
Indeed, we demonstrate that Koala exhibits a high resistance against integral, cube, division property, and higher-order differential attacks.
Additionally, we compare the lowest-latency hardware implementation of Koala with the state-of-the-art low-latency PRFs Orthros and Gleeok and the block cipher Prince, in the same ASIC synthesis setup.
Our results show that Koala outperforms these primitives not only in terms of latency but also with respect to various other performance measures.
XHash: Efficient STARK-friendly Hash Function
Zero-knowledge proofs are widely used in real-world applications for authentication, access control, blockchains, and cryptocurrencies, to name a few. A core element in zero-knowledge proof systems is the underlying hash function, which plays a vital role in the efficiency of the proof system. While traditional hash functions, such as SHA3 or BLAKE3, are efficient on CPU architectures, they perform poorly within zero-knowledge proof systems. This is primarily due to the requirement of these systems for hash functions that operate efficiently over finite fields of large prime order as well as binary fields. To address this challenge, a new paradigm called Arithmetization-Orientation has emerged. These designs are tailored to improve the efficiency of hashing within zero-knowledge proof systems while providing reliable security guarantees.
In this work, we propose XHash, a high-performance hash function designed for ZK-STARKs and inspired by the Marvellous design strategy. When using the Algebraic Intermediate Representation, XHash outperforms Rescue and Poseidon, the most important ZK-friendly hash functions for STARKs. Moreover, XHash has competitive performance on CPU architectures, with an average speed of $\approx 3\,\mu s$ for 2-to-1 hashing. Compared to RPO, the fastest hash function of the Marvellous family, XHash performs $\approx 2.5$ times faster on CPU. From the security perspective, XHash inherits the security of the Marvellous design strategy, and we analyze its security against state-of-the-art algebraic attacks. Additionally, we propose a new type of security argument against algebraic attacks that relies on a single well-defined and reasonable conjecture of a novel type. Finally, we specify a standard version of XHash designed for the Polygon Miden VM, with an AIR complexity of 504, compared to Rescue with an AIR complexity of 672 and Poseidon with an AIR complexity of 1176.
More Embedded Curves for SNARK-Pairing-Friendly Curves
Embedded curves are elliptic curves defined over a prime field whose order (characteristic) is the prime subgroup order (the scalar field) of a pairing-friendly curve. Embedded curves have a large prime-order subgroup of cryptographic size but are not pairing-friendly themselves. Sanso and El Housni published families of embedded curves for BLS pairing-friendly curves. Their families are parameterized by polynomials, as families of pairing-friendly curves are. However, their work did not find embedded families for KSS pairing-friendly curves. In this note we show how the problem of finding families of embedded curves is related to the problem of finding optimal formulas for $\mathbb{G}_1$ subgroup membership testing on the pairing-friendly curve side. We then apply Smith's technique and the Dai, Lin, Zhao, and Zhou (DLZZ) criteria to obtain formulas for embedded curves with KSS, and outline a generic algorithm for solving this problem in all cases. We provide two families of prime-order embedded curves for KSS18 that can form a plain cycle, and give examples of cryptographic size. We also give families of even-order $j=1728$ embedded curves for KSS16, with examples. We also suggest alternative embedded curves for BLS that have a seed of much lower Hamming weight than Sanso et al.'s and a much higher 2-valuation for fast FFT. In particular we highlight BLS12 curves which have a prime-order embedded curve that forms a plain cycle (no pairing), and a second (plain) embedded curve in Montgomery form. A Brezing-Weng outer curve yielding a pairing-friendly 2-chain is also possible, as in the BLS12-377-BW6-761 construction. All curves have $j$-invariant 0 and an endomorphism for faster arithmetic on the curve side.
A Note on the Quasigroup Lai-Massey Structures
In our paper, we explore the consequences of replacing the commutative group operation used in Lai-Massey structures with a quasigroup operation.
We introduce four quasigroup versions of the Lai-Massey structure, and prove that for quasigroups isotopic with a group $\mathbb{G}$, the complexity of launching a differential attack against these variants of the Lai-Massey structure is equivalent to attacking an alternative structure based on $\mathbb{G}$.
Then we provide the conditions needed for correct decryption, and further refine the resulting structure. The emerging structure is both intriguing and novel, and we hope that it will form the basis for future secure block ciphers based on non-commutative groups. In the case of commutative groups, we show that the resulting structure reduces to the classical Lai-Massey structure.
MSMAC: Accelerating Multi-Scalar Multiplication for Zero-Knowledge Proof
Multi-scalar multiplication (MSM) is the most computation-intensive part in proof generation of Zero-knowledge proof (ZKP). In this paper, we propose MSMAC, an FPGA accelerator for large-scale MSM. MSMAC adopts a specially designed Instruction Set Architecture (ISA) for MSM and optimizes pipelined Point Addition Unit (PAU) with hybrid Karatsuba multiplier. Moreover, a runtime system is proposed to split MSM tasks with the optimal sub-task size and orchestrate execution of Processing Elements (PEs). Experimental results show that MSMAC achieves up to 328× and 1.96× speedups compared to the state-of-the-art implementation on CPU (one core) and GPU, respectively, outperforming the state-of-the-art ASIC accelerator by 1.79×. On 4 FPGAs, MSMAC performs 1,261× faster than a single CPU core.
A Note on ``Three-Factor Anonymous Authentication and Key Agreement Based on Fuzzy Biological Extraction for Industrial Internet of Things''
We show that the key agreement scheme [IEEE Trans. Serv. Comput. 16(4): 3000-3013, 2023] fails to preserve user anonymity, contrary to what is claimed. The scheme simply assumes that user anonymity is equivalent to preventing the user's identity from being recovered. But true anonymity means that the adversary cannot attribute different sessions to target users. It relates to entity distinguishability, not just identity revealability. To the best of our knowledge, this is the first time the explicit meaning of user anonymity has been clarified.
PROF: Protected Order Flow in a Profit-Seeking World
Users of decentralized finance (DeFi) applications face significant risks from adversarial actions that manipulate the order of transactions to extract value from users. Such actions---an adversarial form of what is called maximal-extractable value (MEV)---impact both individual outcomes and the stability of the DeFi ecosystem. MEV exploitation, moreover, is being institutionalized through an architectural paradigm known as Proposer-Builder Separation (PBS).
This work introduces a system called PROF (PRotected Order Flow) that is designed to limit harmful forms of MEV in existing PBS systems. PROF aims at this goal using two ideas. First, PROF imposes an ordering on a set ("bundle") of privately input transactions and enforces that ordering all the way through to block production, preventing transaction-order manipulation. Second, PROF creates bundles whose inclusion is profitable to block producers, thereby ensuring that bundles see timely inclusion in blocks.
PROF is backward-compatible, meaning that it works with existing and future PBS designs. PROF is also compatible with any desired algorithm for ordering transactions within a PROF bundle (e.g., first-come-first-served, fee-based, etc.). It executes efficiently, i.e., with low latency, and requires no additional trust assumptions among PBS entities. We quantitatively and qualitatively analyze PROF's incentive structure and its utility to users compared with existing solutions. We also report on the inclusion likelihood of PROF transactions, and give concrete latency numbers from our end-to-end implementation.
Efficient Variants of TNT with BBB Security
At EUROCRYPT'20, Bao et al. showed that a three-round cascade of the $\textsf{LRW1}$ construction, which they dubbed $\textsf{TNT}$, is a strong tweakable pseudorandom permutation that provably achieves a $2n/3$-bit security bound. At EUROCRYPT'24, Jha et al. showed a birthday-bound distinguishing attack on $\textsf{TNT}$, invalidating the proven security bound, and proved a tight birthday-bound security for the $\textsf{TNT}$ construction.
In a recent work, Datta et al. showed that a four-round cascade of the $\textsf{LRW1}$ construction, which they dubbed $\textsf{CLRW1}^4$, is a strong tweakable pseudorandom permutation that provably achieves $3n/4$-bit security. In this paper, we propose a variant of the $\textsf{TNT}$ construction, called $\textsf{b-TNT1}$, and prove its security up to $2^{3n/4}$ queries. However, unlike $\textsf{CLRW1}^4$, $\textsf{b-TNT1}$ requires three block cipher calls along with a field multiplication. We also propose another variant of the $\textsf{TNT}$ construction, called $\textsf{b-TNT2}$, and show a similar security bound. Compared to $\textsf{b-TNT1}$, $\textsf{b-TNT2}$ requires four block cipher calls. Nevertheless, its block cipher calls can be pipelined, which makes it more efficient than $\textsf{CLRW1}^4$. We have also experimentally verified that both $\textsf{b-TNT1}$ and $\textsf{b-TNT2}$ outperform $\textsf{CLRW1}^4$.
Bandwidth-Hard Functions: Reductions and Lower Bounds
Memory Hard Functions (MHFs) have been proposed as an answer to the growing inequality between the computational speed of general purpose CPUs and Application Specific Integrated Circuits (ASICs).
MHFs have seen widespread applications including password hashing, key stretching and proofs of work.
Several metrics have been proposed to quantify the ``memory hardness'' of a function. Cumulative memory complexity (CMC) (Alwen and Serbinenko, STOC 2015) (or amortized Area $\times$ Time complexity (Alwen et al., CCS 2017)) attempts to quantify the cost to acquire/build the hardware to evaluate the function --- after normalizing the time it takes to evaluate the function repeatedly at a given rate.
By contrast, bandwidth hardness (Ren and Devadas, TCC 2017) attempts to quantify the energy costs of evaluating this function --- which in turn is largely dominated by the number of cache misses.
Ideally, a good MHF would be both bandwidth hard and have high cumulative memory complexity.
While the cumulative memory complexity of leading MHF candidates is well understood, little is known about the \emph{bandwidth hardness} of many prominent MHF candidates.
Our contributions are as follows:
First, we provide the first reduction proving that, in the parallel random oracle model, the bandwidth hardness of a Data-Independent Memory Hard Function (iMHF) is described by the red-blue pebbling cost of the directed acyclic graph (DAG) associated with that iMHF.
Second, we show that the goals of designing an MHF with high CMC/bandwidth hardness are well aligned. In particular, we prove that any function (data-independent or not) with high CMC also has relatively high bandwidth costs.
Third, we analyze the bandwidth hardness of several prominent iMHF candidates, such as Argon2i (Biryukov et al., 2015), winner of the password hashing competition, and aATSample and DRSample (Alwen et al., CCS 2017) --- the first practical iMHFs with essentially asymptotically optimal CMC. We prove that in the parallel random oracle model each of these iMHFs is maximally bandwidth hard.
Fourth, we analyze the bandwidth hardness of a prominent dMHF called Scrypt. We prove the first unconditional tight lower bound on the energy cost of Scrypt in the parallel random oracle model.
Finally, we show that the problem of finding the minimum cost red-blue pebbling of a directed acyclic graph is NP-hard.
Formal Analysis of SPDM: Security Protocol and Data Model version 1.2
DMTF is a standards organization formed by major industry players in IT infrastructure, including AMD, Alibaba, Broadcom, Cisco, Dell, Google, Huawei, IBM, Intel, Lenovo, and NVIDIA, which aims to enable interoperability across, e.g., cloud, virtualization, network, server, and storage products. It is currently standardizing a security protocol called SPDM, which aims to secure communication over the wire and to enable device attestation, notably also explicitly catering for communicating hardware components.
The SPDM protocol inherits requirements and design ideas from IETF’s TLS 1.3. However, its state machines and transcript handling are substantially different and more complex. While architecture, specification, and open-source libraries of the current versions of SPDM are publicly available, these include no significant security analysis of any kind.
In this work we develop the first formal models of the three modes of the SPDM protocol version 1.2.1, and formally analyze their main security properties.
Route Discovery in Private Payment Channel Networks
In this work, we are the first to explore route discovery in private channel networks.
We first determine what ``ideal'' privacy for a routing protocol means in this setting. We observe that protocols achieving this strong privacy definition exist by leveraging (topology-hiding) Multi-Party Computation, but they are (inherently) inefficient, as route discovery must involve the entire network.
We then present protocols with weaker privacy guarantees but much better efficiency. In particular, route discovery typically involves only a small fraction of the nodes, but some information on the topology and balances -- beyond what is necessary for performing the transaction -- is leaked.
The core idea is that both sender and receiver gossip a message which then slowly propagates through the network, and the moment any node in the network receives both messages, a path is found. In our first protocol the message is always sent to all neighbouring nodes with a delay proportional to the fees of that edge. In our second protocol the message is only sent to one neighbour chosen randomly with a probability proportional to its degree. While the first instantiation always finds the cheapest path, the second might not, but it involves a smaller fraction of the network.
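As a sketch of the first protocol's mechanics (illustrative graph and fees, not the paper's implementation): since each edge delays the gossip in proportion to its fee, arrival times are exactly Dijkstra distances, and a node that has heard both messages witnesses a path whose cost is the sum of its two arrival times.

```python
# Toy model of the first protocol: per-edge gossip delay proportional to
# the fee makes arrival times equal to Dijkstra distances; minimizing the
# sum of the two arrival times recovers the cheapest path's cost. The
# graph and fees are illustrative.
import heapq

def arrival_times(graph, source):
    """Time at which source's gossip reaches each node (Dijkstra)."""
    dist = {source: 0}
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, fee in graph[u]:
            if d + fee < dist.get(v, float("inf")):
                dist[v] = d + fee
                heapq.heappush(pq, (d + fee, v))
    return dist

def cheapest_meeting(graph, sender, receiver):
    ds, dr = arrival_times(graph, sender), arrival_times(graph, receiver)
    node = min(set(ds) & set(dr), key=lambda u: ds[u] + dr[u])
    return node, ds[node] + dr[node]

G = {"S": [("A", 1), ("B", 4)], "A": [("S", 1), ("R", 2)],
     "B": [("S", 4), ("R", 1)], "R": [("A", 2), ("B", 1)]}
print(cheapest_meeting(G, "S", "R"))   # cost 3; the node lies on S-A-R
```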
We also discuss some extensions to further improve privacy by employing bilinear maps.
Simulations of our protocols on the Lightning network topology (for random transactions and uniform fees) show that our first protocol (which finds the cheapest path) typically involves around 12\% of the 6376 nodes, while the second only touches around 18 nodes $(<0.3\%)$, and the cost of the path that is found is around twice the cost of the optimal one.
Dynamic Collusion Functional Encryption and Multi-Authority Attribute-Based Encryption
Functional Encryption (FE) is a powerful notion of encryption which enables computations and partial message recovery of encrypted data. In FE, each decryption key is associated with a function $f$ such that decryption recovers the function evaluation $f(m)$ from an encryption of $m$. Informally, security states that a user with access to function keys $\mathsf{sk}_{f_1}, \mathsf{sk}_{f_2}, \ldots$ (and so on) can only learn $f_1(m), f_2(m), \ldots$ (and so on) but nothing more about the message. The system is said to be $q$-bounded collusion resistant if the security holds as long as an adversary gets access to at most $q = q(\lambda)$ decryption keys. In the last decade, numerous works have proposed many FE constructions from a wide array of algebraic and general cryptographic assumptions, and proved their security in the bounded collusion model.
However, until very recently, all these works studied bounded collusion resistance in a "static model", where the collusion bound $q$ was a global system parameter. While the static collusion model led to great research progress in the community, it has many major drawbacks. Very recently, Agrawal et al. (Crypto 2021) and Garg et al. (Eurocrypt 2022) independently introduced the "dynamic model" for bounded collusion resistance, where the collusion bound $q$ was a fluid parameter that was not globally set but only chosen by each encryptor. The dynamic collusion model enabled harnessing the many virtues of the static collusion model, while avoiding its various drawbacks.
In this work, we give a simple and generic approach to upgrade any scheme from the static collusion model to the dynamic collusion model. Our result captures all existing results in the dynamic model in the form of a single unified framework, and also gives new results as simple corollaries with a lot more potential in the future. An interesting artifact of our result is that it gives a generic way to match existing lower bounds in functional encryption.
A Statistical Verification Method of Random Permutations for Hiding Countermeasure Against Side-Channel Attacks
As NIST is putting the final touches on the standardization of PQC (Post-Quantum Cryptography) public-key algorithms, it is a racing certainty that peskier cryptographic attacks undeterred by those new PQC algorithms will surface. Such a trend in turn will prompt more follow-up studies of attacks and countermeasures. As things stand, from the attackers’ perspective, one viable form of attack that can be implemented thereupon is the so-called “side-channel attack”. The two best-known countermeasures heralded to be durable against side-channel attacks are “masking” and “hiding”. In that dichotomous picture, of particular note are successful single-trace attacks on some of NIST’s then-candidate PQC algorithms, which worked to the detriment of the former: “masking”. In this paper, we cast an eye over the latter: “hiding”. Hiding proves to be durable against both side-channel attacks and another equally robust type of attack called “fault-injection attacks”, and hence is deemed an auspicious countermeasure to be implemented. Mathematically, the hiding method is fundamentally based on random permutations. There has been a cornucopia of studies on generating random permutations. However, those are not tied to the implementation of the hiding method. In this paper, we propose a reliable and efficient verification of permutation implementations, employing the Fisher–Yates shuffling method. We introduce the concept of an $n$-th order permutation and explain how it can be used to verify that our implementation is more efficient than its previous-generation counterparts for hiding countermeasures.
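For reference, here is a minimal Fisher–Yates shuffle (the primitive underlying the hiding countermeasure discussed above) together with a crude frequency check; this is an illustrative sketch, not the paper's statistical verification method.

```python
# A minimal Fisher-Yates shuffle with a crude frequency check; an
# illustrative sketch, not the paper's statistical verification method.
import secrets
from collections import Counter

def fisher_yates(n: int) -> list:
    """Return a uniformly random permutation of 0..n-1."""
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = secrets.randbelow(i + 1)          # uniform over [0, i]
        perm[i], perm[j] = perm[j], perm[i]
    return perm

# With a correct shuffle, each value lands in position 0 about equally often.
counts = Counter(fisher_yates(4)[0] for _ in range(4000))
print(counts)                                  # each of 0..3 near 1000
```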
Privacy Preserving Biometric Authentication for Fingerprints and Beyond
Biometric authentication eliminates the need for users to remember secrets and serves as a convenient mechanism for user authentication. Traditional implementations of biometric-based authentication store sensitive user biometry on the server, and the server thus becomes an attractive target of attack and a source of large-scale unintended disclosure of biometric data. To mitigate the problem, we can resort to privacy-preserving computation and store only protected biometrics on the server. While a variety of secure computation techniques is available, our analysis of privacy-preserving biometric computation and biometric authentication constructions revealed that available solutions fall short of addressing the challenges of privacy-preserving biometric authentication. Thus, in this work we put forward new constructions to address the challenges.
Our solutions employ a helper server and use strong threat models, where a client is always assumed to be malicious, while the helper server can be semi-honest or malicious. We also determined that standard secure multi-party computation security definitions are insufficient to properly demonstrate security in the two-phase (enrollment and authentication) entity authentication application. We thus extend the model and formally show security in the multi-phase setting, where information can flow from one phase to another and the set of participants can change between the phases. We implement our constructions and show that they exhibit practical performance for authentication in real time.
Optimizing Big Integer Multiplication on Bitcoin: Introducing w-windowed Approach
A crucial component of any zero-knowledge system is operations over finite fields. This, in turn, requires implementing the fundamental operation of multiplying two big integers. In the realm of Bitcoin, this problem gets revisited, as Bitcoin utilizes its own stack-based, non-Turing-complete scripting system called Bitcoin Script. Inspired by elliptic curve scalar multiplication, this paper introduces the $w$-windowed method for multiplying two numbers. We outperform state-of-the-art approaches, including BitVM's implementation. Finally, we also show how the windowed method can lead to optimizations not only in big-integer arithmetic but also in more general arithmetic problems.
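The core idea can be sketched as follows: a minimal Python model of windowed multiplication using only additions and doublings, the operations cheaply expressible in a stack-based setting. The window size and names are illustrative; this is not BitVM's or the paper's actual Bitcoin Script.

```python
# Minimal model of a w-windowed multiplication built from additions and
# doublings only (the operations cheaply expressible in a stack-based
# script); the window size and names are illustrative, not BitVM's or
# the paper's actual Bitcoin Script.

def windowed_mul(a: int, b: int, w: int = 4) -> int:
    table = [0]
    for _ in range(1, 1 << w):
        table.append(table[-1] + a)         # table[i] = i*a, additions only
    acc = 0
    nwin = -(-max(b.bit_length(), 1) // w)  # number of w-bit windows of b
    for i in reversed(range(nwin)):
        for _ in range(w):
            acc += acc                      # w doublings: acc *= 2^w
        acc += table[(b >> (i * w)) & ((1 << w) - 1)]
    return acc

assert windowed_mul(0xDEADBEEF, 0xCAFEBABE) == 0xDEADBEEF * 0xCAFEBABE
```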
Competitive Policies for Online Collateral Maintenance
Layer-two blockchain protocols emerged to address scalability issues related to fees, storage cost, and confirmation delay of on-chain transactions. They aggregate off-chain transactions into fewer on-chain ones, thus offering immediate settlement and reduced transaction fees. To preserve the security of the underlying ledger, layer-two protocols often work in a collateralized model; resources are committed on-chain to back up off-chain activities. A fundamental challenge that arises in this setup is determining a policy for establishing, committing, and replenishing the collateral in a way that maximizes the value of settled transactions.
In this paper, we study this problem under two settings that model collateralized layer-two protocols. The first is a general model in which a party has an on-chain collateral $C$ with a policy to decide whether to settle or discard each incoming transaction. The policy also specifies when to replenish $C$ based on the remaining collateral value. The second model considers a discrete setup in which $C$ is divided among $k$ wallets, each of size $C/k$, such that when a wallet is depleted, and so cannot settle any incoming transactions, it will be replenished. We devise several online policies for these models, and show how competitive they are compared to optimal (offline) policies that have full knowledge of the incoming transaction stream. To the best of our knowledge, we are the first to study and formulate online competitive policies for collateral and wallet management in the blockchain setting.
SoK: Model Reverse Engineering Threats for Neural Network Hardware
There has been significant progress over the past seven years in model reverse engineering (RE) for neural network (NN) hardware. Although there has been systematization of knowledge (SoK) in an overall sense, the treatment from the hardware perspective has been far from adequate. To bridge this gap, this paper systematically categorizes the types of NN hardware used prevalently by industry and academia, as well as the model RE attacks/defenses published in each category. Further, we sub-categorize existing NN model RE attacks based on different criteria, including the degree of hardware parallelism, threat vectors like side channels, fault injection, scan-chain attacks, system-level attacks, the type of asset under attack, the type of NN, exact versus approximate recovery, etc.
We make important technical observations and identify key open research directions. Subsequently, we discuss the state-of-the-art defenses against NN model RE, identify certain categorization criteria, and compare the existing works based on these criteria. We note significant qualitative gaps for defenses, and make recommendations for important open research directions for the protection of NN models. Finally, we discuss limitations of existing work in terms of the types of models for which security evaluations or defenses were proposed, and suggest open problems in terms of protecting practically expensive model IPs.
Anonymous, Timed and Revocable Proxy Signatures
A proxy signature enables a party to delegate her signing power to another. This is useful in practice to achieve goals related to robustness, crowd-sourcing, and workload sharing. Such applications, especially in the blockchain model, usually require delegation to satisfy several properties, including time bounds, anonymity, revocability, and policy enforcement. Despite the large amount of work on proxy signatures in the literature, none of the existing schemes satisfies all these properties; there is not even a unified formal notion that captures them.
In this work, we close this gap and propose RelaySchnorr, an anonymous, timed, and revocable proxy signature scheme. We achieve this in two steps: First, we introduce a tokenizable digital signature based on the Schnorr signature, allowing for secure distribution of signing tokens. Second, we utilize a public bulletin board, instantiated as a blockchain, and timelock encryption to support: (1) one-time usage of the signing tokens by tracking tokens used so far based on unique values associated with them, (2) timed delegation so that a proxy signer cannot sign outside a given period, and (3) delegation revocation allowing the original signer to end a delegation earlier than provisioned. All of this is done in a decentralized and anonymous way, so that no one can tell that someone else signed on behalf of the original signer, or even that a delegation took place. We define a formal notion for proxy signatures capturing all these properties, and prove that our construction realizes this notion. We also discuss several design considerations addressing issues related to deployment in practice.
On the practical CPAD security of “exact” and threshold FHE schemes and libraries
In their seminal 2021 paper, Li and Micciancio presented a passive attack against the CKKS approximate FHE scheme and introduced the notion of CPAD security. The current status quo is that this line of attacks does not apply to ``exact'' FHE. In this paper, we challenge this status quo by exhibiting a CPAD key recovery attack on the linearly homomorphic Regev cryptosystem which easily generalizes to other xHE schemes such as BFV, BGV and TFHE, showing that these cryptosystems are not CPAD secure in their basic form. We also show that existing threshold variants of BFV, BGV and CKKS are particularly exposed to CPAD attackers and would be CPAD-insecure without smudging noise addition after partial decryption. Finally, we successfully implement our attack against several mainstream FHE libraries and discuss a number of natural countermeasures as well as their consequences in terms of FHE practice, security and efficiency. The attack itself is quite practical, as it typically takes less than an hour on an average laptop PC, requiring a few thousand ciphertexts as well as up to around a million evaluations/decryptions, to perform a full key recovery.
Representations of Group Actions and their Applications in Cryptography
Cryptographic group actions provide a flexible framework that allows the instantiation of several primitives, ranging from key exchange protocols to PRFs and digital signatures. The security of such constructions is based on the intractability of some computational problems. For example, given the group action $(G,X,\star)$, the weak unpredictability assumption (Alamati et al., Asiacrypt 2020) requires that, given random $x_i$'s in $X$, no probabilistic polynomial time algorithm can compute, on input $\{(x_i,g\star x_i)\}_{i=1,\dots,Q}$ and $y$, the set element $g\star y$.
In this work, we study such assumptions, aided by the definition of group action representations and a new metric, the $q$-linear dimension, which estimates the ``linearity'' of a group action, or in other words, how far it is from being linear.
We show that under some hypotheses on the group action representation, and if the $q$-linear dimension is polynomial in the security parameter, then the weak unpredictability and other related assumptions cannot hold.
We apply this technique to some group actions used in cryptography, such as those arising from the equivalence of linear codes; as a result, we obtain the impossibility of using such actions for the instantiation of certain primitives.
As an additional result, some bounds on the $q$-linear dimension are given for classical groups, such as $\mathcal{S}_n$, $\mathrm{GL}(\mathbb{F}^n)$ and the cyclic group $\mathbb{Z}_n$ acting on itself.
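As a degenerate illustration of low $q$-linear dimension, consider $\mathbb{Z}_n$ acting on itself by addition: the action is fully linear, so a single sample breaks weak unpredictability. The sketch below shows only this trivial extreme case, not the paper's general representation machinery.

```python
# A perfectly linear group action: Z_n acting on itself by addition.
# One sample (x, g*x) reveals g, so weak unpredictability fails completely.
import secrets

n = 2**61 - 1
g = secrets.randbelow(n)                      # secret group element
x = secrets.randbelow(n)                      # random set element
y = secrets.randbelow(n)                      # challenge set element

gx = (g + x) % n                              # the published sample g * x
g_recovered = (gx - x) % n                    # linearity gives g directly
assert (g_recovered + y) % n == (g + y) % n   # attacker predicts g * y
```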
Efficient and Privacy-Preserving Collective Remote Attestation for NFV
The virtualization of network functions is a promising technology, which can enable mobile network operators to provide more flexibility and better resilience for their infrastructure and services. Yet, virtualization comes with challenges, as 5G operators will require a means of verifying the state of the virtualized network components (e.g. Virtualized Network Functions (VNFs) or managing hypervisors) in order to fulfill security and privacy commitments. One such means is the use of attestation protocols. In this paper, we focus on Collective Remote Attestation (cRA), which is used to attest the state of a group of devices. Although cRA has been extensively studied in the context of IoT, it has not been used yet in virtualized mobile networks, a different use-case, with constraints of its own.
In this paper, we propose the first protocol to efficiently and securely attest a group of Virtualized Network Functions which make up a VNF Forwarding Graph. Our protocol comes with strong and provable guarantees of unforgeability of attestation, linkability of attestations for related components, and privacy of sensitive configuration details for the infrastructure provider. In particular, we are the first to formally define and analyze such properties for VNF-FG attestation. Finally, through our Proof-of-Concept implementation, we show that our construction is not only strongly secure, but also efficient.
Attacks Against the IND-CPA^D Security of Exact FHE Schemes
A recent security model for fully homomorphic encryption (FHE), called IND-CPA^D security and introduced by Li and Micciancio [Eurocrypt'21], strengthens IND-CPA security by giving the attacker access to a decryption oracle for ciphertexts for which it should know the underlying plaintexts. This includes ciphertexts that it (honestly) encrypted and those obtained from the latter by evaluating circuits that it chose. Li and Micciancio singled out the CKKS FHE scheme for approximate data [Asiacrypt'17] by giving an IND-CPA^D attack on it and claiming that IND-CPA security and IND-CPA^D security coincide for exact FHE schemes.
We correct the widespread belief that IND-CPA^D attacks are specific to approximate homomorphic computations. Indeed, the equivalence formally proved by Li and Micciancio assumes that the schemes have a negligible probability of incorrect decryption. However, almost all competitive implementations of exact FHE schemes forgo strong correctness: they analyze correctness heuristically and allow noticeable probabilities of incorrect decryption.
We exploit this imperfect correctness to mount efficient non-adaptive indistinguishability and key-recovery attacks against all major exact FHE schemes. We illustrate their strength by implementing them for BFV using OpenFHE and simulating an attack for the default parameter set of the CGGI implementation of TFHE-rs (the attack experiment is too expensive to be run on commodity desktops, because of the cost of CGGI bootstrapping). Our attacks extend to CKKS for discrete data, and threshold versions of the exact FHE schemes, when the correctness is similarly loose.
Efficient Quantum Algorithm for SUBSET-SUM Problem
Not all problems in the complexity class $NP$ are known to be solvable in polynomial time by a classical computer, but given a solution, every problem in $NP$ can be verified in polynomial time. The complexity class $BQP$ includes all problems solvable in polynomial time by a quantum computer. Prime factorization is in the $NP$ class, and is also in the $BQP$ class, owing to Shor's algorithm. The hardest problems within the $NP$ class are called $NP$-complete. If a quantum algorithm can solve an $NP$-complete problem in polynomial time, it would imply that a quantum computer can solve all problems in $NP$ in polynomial time. Here, we present a polynomial-time quantum algorithm to solve an $NP$-complete variant of the $SUBSET\text{-}SUM$ problem, thereby rendering $NP\subseteq BQP$. We illustrate that given a set of integers, which may be positive or negative, a quantum computer can decide in polynomial time whether there exists any subset that sums to zero. There are many real-world applications of our result, such as finding patterns efficiently in stock-market data, or in recordings of the weather or brain activity. As an example, the decision problem of matching two images in image processing is $NP$-complete, and can thus be solved in polynomial time when amplitude amplification is not required.
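For reference, the decision problem in question can be stated directly as code; the classical brute-force check below runs in exponential time $2^{|S|}$, which is the baseline any polynomial-time quantum algorithm would have to beat.

```python
# Classical brute-force check of the zero-sum subset decision problem:
# given integers (possibly negative), is there a non-empty subset summing
# to zero? Exhaustive search over all 2^len(nums) - 1 non-empty subsets.
from itertools import combinations

def has_zero_subset(nums):
    for r in range(1, len(nums) + 1):
        for subset in combinations(nums, r):
            if sum(subset) == 0:
                return True, subset
    return False, None

print(has_zero_subset([3, -7, 5, 2]))     # (True, (-7, 5, 2))
```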
On the Number of Restricted Solutions to Constrained Systems and their Applications
In this paper, we formulate a special class of systems of linear equations over finite fields that appears naturally in the provable security analysis of several MAC and PRF modes of operation. We derive lower bounds on the number of solutions for such systems adhering to some predefined restrictions, and apply these lower bounds to derive tight PRF security for several constructions. We show security up to $2^{3n/4}$ queries for the single-keyed variant of the Double-block Hash-then-Sum (DBHtS) construction, called 1k-DBHtS, under appropriate assumptions on the underlying hash function. We show that the single-key variants of PMAC+ and LightMAC+, called 1k-PMAC+ and 1k-LightMAC+, achieve the required hash function properties, and thus attain $3n/4$-bit security. Additionally, we show that the sum of $r$ independent copies of the Even-Mansour cipher is a secure PRF up to $2^{\frac{r}{r+1}n}$ queries.
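As a concrete illustration of the last construction, here is a toy sketch of the sum of $r$ Even-Mansour ciphers, $F(x) = \bigoplus_{i=1}^{r} \big(P_i(x \oplus k_i) \oplus k'_i\big)$ with public random permutations $P_i$, which the paper proves secure as a PRF up to $2^{\frac{r}{r+1}n}$ queries; the domain size and permutation sampling here are illustrative only.

```python
# Toy "sum of r Even-Mansour ciphers" PRF candidate over a 16-bit domain.
import random

N = 1 << 16                                   # toy domain of size 2^n, n = 16
r = 2                                         # number of Even-Mansour copies

perms = []
for _ in range(r):                            # public random permutations P_i
    p = list(range(N))
    random.shuffle(p)
    perms.append(p)

keys = [(random.randrange(N), random.randrange(N)) for _ in range(r)]

def prf(x):
    out = 0
    for (k1, k2), P in zip(keys, perms):      # one Even-Mansour copy each
        out ^= P[x ^ k1] ^ k2
    return out

print(hex(prf(0x1234)))
```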
Efficient Implementation of Super-optimal Pairings on Curves with Small Prime Fields at the 192-bit Security Level
For many pairing-based cryptographic protocols such as Direct Anonymous Attestation (DAA) schemes, the arithmetic on the first pairing subgroup $\mathbb{G}_1$ is the most performance-critical. The cost of such operations depends heavily on the size of the underlying prime field. At the 192-bit security level, Gasnier and Guillevic presented a curve named GG22D7-457 with CM-discriminant $D = 7$ and embedding degree $k = 22$. Compared to other well-known pairing-friendly curves at the same security level, the curve GG22D7-457 has a smaller prime field size and $\rho$-value, which translates into fast operations on $\mathbb{G}_1$. However, the pairing computation on GG22D7-457 is not efficient.
In this paper, we investigate how to achieve higher performance for the pairing computation on GG22D7-457. We first propose novel formulas for the super-optimal pairing on this curve by utilizing a $2$-isogeny as GLV endomorphism. This tool generalizes to more generic families of pairing-friendly curves with $n$-isogenies as endomorphisms, and we provide explicit formulas for super-optimal pairings exploiting $2$- and $3$-isogenies. Finally, we make a concrete computational cost analysis and implement the pairing computations on curve GG22D7-457 using our approaches. In terms of Miller function evaluation, the techniques in this paper yield a saving of $24.44\%$ in $\mathbb{F}_p$-multiplications compared to the optimal ate pairing. As for the running time, the experimental results illustrate that the Miller loop on GG22D7-457 using our methods is $26.0\%$ faster than the state of the art. Additionally, the performance of the super-optimal pairing on GG22D7-457 is competitive with the well-known pairing-friendly curves at the 192-bit security level. These results make GG22D7-457 an attractive candidate for pairing-based protocols. Furthermore, our techniques have the potential to enhance the applications of super-optimal pairings on more pairing-friendly curves.
Automated Software Vulnerability Static Code Analysis Using Generative Pre-Trained Transformer Models
Generative Pre-Trained Transformer (GPT) models have been shown to be surprisingly effective at a variety of natural language processing tasks, including generating computer code. However, GPT models have generally been shown to be much less effective at specific computational tasks, such as evaluating mathematical functions.
In this study, we evaluate the effectiveness of open-source GPT models, with no fine-tuning and with context introduced via the langchain and localGPT Large Language Model (LLM) frameworks, for the task of automatically identifying vulnerable code syntax (specifically targeting C and C++ source code). This task is evaluated on a selection of $36$ source code examples from the NIST SARD dataset, which are specifically curated to contain no natural English that indicates the presence, or lack thereof, of a particular vulnerability (including the removal of all source code comments). The NIST SARD source code dataset contains identified vulnerable lines of source code that are examples of one of the $839$ distinct Common Weakness Enumerations (CWE), allowing for exact quantification of the GPT output classification error rate. A total of $5$ GPT models are evaluated, using $10$ different inference temperatures and $100$ repetitions at each setting, resulting in $5{,}000$ GPT queries per vulnerable source code sample analyzed.
Ultimately, we find that the open-source GPT models we evaluated are not suitable for fully automated vulnerability scanning, because their false positive and false negative rates are too high to be useful in practice. However, the GPT models perform surprisingly well at automated vulnerability detection for some of the test cases, in particular surpassing random sampling (for some GPT models and inference temperatures) and identifying the exact lines of vulnerable code, albeit at a low success rate. The best result was obtained with Llama-2-70b-chat-hf at an inference temperature of $0.1$ applied to NIST SARD test case 149165 (an example of a buffer overflow vulnerability): a binary classification recall of $1.0$ and a precision of $1.0$ for correctly and uniquely identifying the vulnerable line of code and the correct CWE number.
Additionally, the GPT models are able to identify the specific line of source code that contains the identified CWE for many of the NIST SARD test cases, at a rate quantifiably better than random sampling.
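A hypothetical sketch of the evaluation loop implied by the setup above ($5$ models $\times$ $10$ temperatures $\times$ $100$ repetitions $= 5{,}000$ queries per sample); `query_model` and `score` are placeholders for the actual langchain/localGPT plumbing and scoring, which the paper does not specify at this level of detail.

```python
# Hypothetical evaluation harness; query_model and score are stand-ins.
from dataclasses import dataclass

@dataclass
class TestCase:                     # one curated NIST SARD sample
    id: int
    source_code: str
    vulnerable_lines: set
    cwe: str

temperatures = [round(0.1 * t, 1) for t in range(1, 11)]   # 10 settings
REPEATS = 100                                              # per setting

def evaluate(models, test_cases, query_model, score):
    """query_model(model, code, temp) -> str; score(answer, lines, cwe) -> 0/1."""
    results = {}
    for model in models:                       # e.g. 5 open-source GPT models
        for temp in temperatures:
            for case in test_cases:            # 36 curated SARD samples
                hits = sum(
                    score(query_model(model, case.source_code, temp),
                          case.vulnerable_lines, case.cwe)
                    for _ in range(REPEATS))
                results[(model, temp, case.id)] = hits / REPEATS
    return results
```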
SILBE: an Updatable Public Key Encryption Scheme from Lollipop Attacks
We present a new post-quantum Public Key Encryption scheme (PKE) named Supersingular Isogeny Lollipop Based Encryption, or SILBE. SILBE is obtained by leveraging the generalised lollipop attack of Castryck and Vercauteren on the M-SIDH key exchange by Fouotsa, Moriya and Petit.
In doing so, we can make SILBE a post-quantum secure Updatable Public Key Encryption scheme (UPKE); in fact, SILBE is the first isogeny-based UPKE that is not based on group actions. Hence, SILBE overcomes the limitations highlighted by Eaton, Jao, Komlo and Mokrani at SAC'21 regarding the design of an SIDH-style UPKE. This is made possible by leveraging both the Deuring correspondence and Kani's lemma, two central concepts in isogeny-based cryptography.
An Efficient ZK Compiler from SIMD Circuits to General Circuits
We propose a generic compiler that can convert any zero-knowledge (ZK) proof for SIMD circuits to general circuits efficiently, and an extension that can preserve the space complexity of the proof systems. Our compiler can immediately produce new results improving upon the state of the art.
- By plugging our compiler into Antman, an interactive sublinear-communication protocol, we improve the overall communication complexity for general circuits from $\mathcal{O}(C^{3/4})$ to $\mathcal{O}(C^{1/2})$. Our implementation shows that for a circuit of size $2^{27}$, it achieves up to $83.6\times$ improvement in communication compared to the state-of-the-art implementation. Its end-to-end running time is at least $70\%$ faster in a $10$ Mbps network.
- Using the recent results on compressed $\Sigma$-protocol theory, we obtain a discrete-log-based constant-round zero-knowledge argument with $\mathcal{O}(C^{1/2})$ communication and common random string length, improving over the state of the art, which has a linear-size common random string and requires heavier computation.
- We improve the communication of a designated $n$-verifier zero-knowledge proof from $\mathcal{O}(nC/B+n^2B^2)$ to $\mathcal{O}(nC/B+n^2)$.
To demonstrate the scalability of our compilers, we were able to extract a commit-and-prove SIMD ZK from Ligero and cast it in our framework. We also give one instantiation derived from LegoSNARK, demonstrating that the idea of CP-SNARK also fits in our methodology.
ZIPNet: Low-bandwidth anonymous broadcast from (dis)Trusted Execution Environments
Anonymous Broadcast Channels (ABCs) allow a group of clients to announce messages without revealing the exact author. Modern ABCs operate in a client-server model, where anonymity depends on some threshold (e.g., 1 of 2) of servers being honest. ABCs are an important application in their own right, e.g., for activism and whistleblowing. Recent work on ABCs (Riposte, Blinder) has focused on minimizing the bandwidth cost to clients and servers when supporting large broadcast channels for such applications. But, particularly for low bandwidth settings, they impose large costs on servers, make cover traffic costly, and make volunteer operators unlikely.
In this paper, we describe the design, implementation, and evaluation of ZIPNet, an anonymous broadcast channel that 1) scales to hundreds of anytrust servers by minimizing the computational costs of each server, 2) substantially reduces the servers’ bandwidth costs by outsourcing the aggregation of client messages to untrusted (for privacy) infrastructure, and 3) supports cover traffic that is both cheap for clients to produce and for servers to handle.
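For intuition, the anytrust idea underlying such channels reduces to XOR secret sharing: a message split into $n$ random-looking shares stays hidden as long as any one server withholds its share, while the XOR aggregation itself can be delegated to untrusted infrastructure. The toy below shows only this share-and-recombine core, a simplification rather than ZIPNet's actual protocol.

```python
# Toy XOR secret sharing: any n-1 shares are uniformly random, so a single
# honest (anytrust) server suffices to hide the message; combining all n
# shares recovers it. Aggregating shares needs no trust for privacy.
import secrets

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def share(message, n_servers):
    shares = [secrets.token_bytes(len(message)) for _ in range(n_servers - 1)]
    acc = message
    for s in shares:
        acc = xor(acc, s)
    return shares + [acc]                  # XOR of all shares == message

msg = b"whistle"
shares = share(msg, n_servers=5)
recovered = shares[0]
for s in shares[1:]:
    recovered = xor(recovered, s)
assert recovered == msg                    # all servers needed to reconstruct
```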
A Spectral Analysis of Noise: A Comprehensive, Automated, Formal Analysis of Diffie-Hellman Protocols
The Noise specification describes how to systematically construct a large family of Diffie-Hellman based key exchange protocols, including the secure transports used by WhatsApp, Lightning, and WireGuard. As the specification only makes informal security claims, earlier work has explored which formal security properties may be enjoyed by protocols in the Noise framework, yet many important questions remain open.
In this work we provide the most comprehensive, systematic analysis of the Noise framework to date. We start from first principles and, using an automated analysis tool, compute the strongest threat model under which a protocol is secure, thus enabling formal comparison between protocols. Our results allow us to objectively and automatically associate each informal security level presented in the Noise specification with a formal security claim.
We also provide a fine-grained separation of Noise protocols that were previously described as offering similar security properties, revealing a subclass for which alternative Noise protocols exist that offer strictly better security guarantees. Our analysis also uncovers missing assumptions in the Noise specification and some surprising consequences, e.g. in some situations higher security levels yield strictly worse security.
The syzygy distinguisher
We present a new distinguisher for alternant and Goppa codes, whose complexity is subexponential in the error-correcting capability, hence better than that of generic decoding algorithms. Moreover, it does not suffer from the strong regime limitations of previous distinguishers or structure recovery algorithms: in particular, it applies to the codes used in the Classic McEliece candidate for post-quantum cryptography standardization. The invariants that allow us to distinguish are graded Betti numbers of the homogeneous coordinate ring of a shortening of the dual code.
This is the first time since its introduction in 1978 that an analysis of the McEliece cryptosystem breaks the exponential barrier.
Generic Construction of Secure Sketches from Groups
Secure sketches are designed to facilitate the recovery of originally enrolled data from inputs that may vary slightly over time. This capability is important in applications where data consistency cannot be guaranteed due to natural variations, such as in biometric systems and hardware security. Traditionally, secure sketches are constructed using error-correcting codes to handle these variations effectively, while principles of information theory ensure their security by managing the trade-off between data recoverability and confidentiality. In this paper, we show how to construct a new family of secure sketches generically from groups. We first introduce the notion of groups with the unique factorization property, which is of independent interest and serves as a building block for our secure sketch construction. Next, we provide an in-depth study of the underlying mathematical structures and define some computational and decisional hardness assumptions. As a result, we argue that our secure sketches are efficient, can handle a linear fraction of errors with respect to the norm-1 distance, and are reusable and irreversible. To our knowledge, this generic group-based secure sketch construction is the first of its kind, and it offers a viable alternative to the currently known secure sketches.
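For background, the error-correcting-code approach that the paper departs from is the classical code-offset secure sketch; a toy version with a $3\times$ repetition code appears below. This is reference material on the traditional construction, not the paper's group-based scheme.

```python
# Classical code-offset secure sketch with a toy 3x repetition code:
# sketch ss = w XOR c for a random codeword c; recovery decodes w' XOR ss
# back to c and unmasks. Corrects one flipped bit per 3-bit block.
import secrets

def rep3_encode(bits):
    return [b for b in bits for _ in range(3)]

def rep3_decode(bits):
    return [int(sum(bits[i:i+3]) >= 2) for i in range(0, len(bits), 3)]

def sketch(w):                                         # len(w) divisible by 3
    c = rep3_encode([secrets.randbelow(2) for _ in range(len(w) // 3)])
    return [wi ^ ci for wi, ci in zip(w, c)]           # ss = w XOR c

def recover(w_noisy, ss):
    offset = [wi ^ si for wi, si in zip(w_noisy, ss)]  # = c XOR errors
    c = rep3_encode(rep3_decode(offset))               # correct the errors
    return [ci ^ si for ci, si in zip(c, ss)]          # = original w

w = [1, 0, 1, 0, 0, 1, 1, 1, 0]
ss = sketch(w)
w_noisy = w.copy(); w_noisy[4] ^= 1                    # one bit of natural variation
assert recover(w_noisy, ss) == w
```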
What Have SNARGs Ever Done for FHE?
In recent years, there have been several constructions combining FHE with SNARGs to add integrity guarantees to FHE schemes. Most of these works focused on improving efficiency, while the precise security model with regard to client-side input privacy has remained understudied. Only recently was it shown by Manulis and Nguyen (Eurocrypt'24) that this combination does not yield IND-CCA1 security. So an interesting open question is: does the SNARG actually add any meaningful security to input privacy? We address this question in this note and give a security definition that meaningfully captures the security of the FHE-plus-SNARG construction.
Quantum Implementation and Analysis of ARIA
The progression of quantum computing is considered a potential threat to traditional cryptographic systems, highlighting the significance of post-quantum security. For symmetric-key encryption, Grover's algorithm reduces the complexity of exhaustive key search to its square root, effectively halving the security level. Despite the absence of fully operational quantum computers at present, the need to assess the security of symmetric-key encryption against quantum computing continues to grow. In this paper, we implement the ARIA block cipher in a quantum circuit and compare it with previous research. Our implementation of the ARIA quantum circuit achieves over 92.5% improvement in full depth and over 98.7% improvement in Toffoli depth compared to the implementation proposed by Chauhan et al. Compared to Yang et al.'s implementation, ours improves the full depth by 36.7% and the number of qubits by 8%. Additionally, we analyze the complexity of Grover's search attack and compare it with the NIST criteria. We confirm that ARIA achieves quantum security levels 1, 3, and 5 (ARIA-128, -192, and -256, respectively).
Depth Optimized Quantum Circuits for HIGHT and LEA
Quantum computers can model and solve several problems that have posed challenges for classical supercomputers, leveraging their natural quantum mechanical characteristics. A large-scale quantum computer is poised to significantly reduce security strength in cryptography. In this context, extensive research has been conducted on quantum cryptanalysis. In this paper, we present optimized quantum circuits for the Korean block ciphers HIGHT and LEA. Our quantum circuits for HIGHT and LEA demonstrate the lowest circuit depth compared to previous results: specifically, we achieve depth reductions of 48% and 74% for HIGHT and LEA, respectively. We employ multiple novel techniques that effectively reduce the quantum circuit depth with a reasonable increase in qubit count. Based on our depth-optimized quantum circuits for the HIGHT and LEA block ciphers, we estimate the lowest quantum attack complexity for Grover's key search. Our quantum circuits can also be utilized in other quantum algorithms, not only Grover's. Furthermore, the optimization methods gathered in this work can be adopted for generic quantum implementations in cryptography.
Foldable, Recursive Proofs of Isogeny Computation with Reduced Time Complexity
The security of certain post-quantum isogeny-based cryptographic schemes relies on the ability to provably and efficiently compute isogenies between supersingular elliptic curves without leaking information about the isogeny other than its domain and codomain. Earlier work in this direction gives mathematical proofs of knowledge for the isogeny; as a result, when computing a chain of $n$ isogenies, each subsequent node must verify the correctness of the proofs of all preceding nodes, a cost linear in $n$.
In this work, we empirically build a system to prove the execution of the circuit computing the isogeny rather than produce a proof of knowledge. This proof can then be used as part of the verifiable folding scheme Nova, which reduces the complexity of an isogeny proof of computation for a chain of $n$ isogenies from $O(n)$ to $O(1)$ by providing at each step a single proof that proves the whole preceding chain. To our knowledge, this is the first application of this type of solution to this problem.
A Note on the use of the Double Boomerang Connectivity Table (DBCT) for Spotting Impossibilities
In this short note we examine one of the impossible boomerang distinguishers of Skinny-128-384 provided by Zhang, Wang and Tang at ToSC 2024 Issue 2 and disprove it.
The issue arises from the use of the Double Boomerang Connectivity Table (DBCT) as a tool to establish that a boomerang switch over 2 rounds has probability zero, whereas the DBCT only covers specific cases of difference propagation, missing a large set of events that might make the connection possible.
We study the specific instance provided by Zhang et al. in detail and exhibit an example of a returning quartet that contradicts the impossibility.
hodlCoin: A Financial Game
The hodlCoin game is a competitive zero-sum massively multiplayer financial game where the goal is to hodl an asset for long periods of time. By hodling, a player deposits coins of a given asset in a common reserve and receives a proportional amount of hodlCoins. Players who un-hodl pay a fee that is accumulated in the common reserve. Thus, the longer a player hodls, in comparison with other players, the more the player will benefit from fees paid by the players who are un-hodling earlier. Surprisingly, we prove here that, thanks to the accumulation of fees, the price of hodlCoins in comparison with the underlying asset never decreases.
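The claimed invariant follows from two observations: deposits mint hodlCoins at the current reserve-per-coin price, which leaves the price exactly unchanged, while un-hodl fees remain in the reserve, which strictly raises it. The toy simulation below, with an illustrative fee level and random trades, checks the invariant numerically.

```python
# Toy simulation of the hodlCoin price invariant: price = reserve / supply
# never decreases. Fee level and trade sizes are illustrative.
import random

reserve, supply, FEE = 1000.0, 1000.0, 0.05      # start at price 1.0

def price():
    return reserve / supply

def hodl(coins):                                 # deposit: mint at current price
    global reserve, supply
    supply += coins / price()                    # price unchanged by deposit
    reserve += coins

def unhodl(tokens):                              # burn tokens, pay out minus fee
    global reserve, supply
    payout = tokens * price() * (1 - FEE)        # fee stays in the reserve
    reserve -= payout
    supply -= tokens

random.seed(0)
for _ in range(10_000):
    p_before = price()
    if random.random() < 0.5:
        hodl(random.uniform(1, 50))
    else:
        unhodl(random.uniform(0.1, 1) * supply * 0.01)
    assert price() >= p_before - 1e-12           # price is non-decreasing
print(f"final price: {price():.4f}")
```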
LatticeFold: A Lattice-based Folding Scheme and its Applications to Succinct Proof Systems
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure and require a large (256-bit) field. In this work we present LatticeFold, the first lattice-based folding protocol based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Since LatticeFold can operate over a small (64-bit) field, our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing plausible post-quantum security. Moreover, LatticeFold operates over the same module structure used by fully homomorphic encryption (FHE) and lattice signature schemes, and can therefore benefit from software optimizations and custom hardware designed to accelerate these lattice schemes.
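The tension that the sumcheck technique resolves can be seen in a few lines: the Ajtai commitment is additively homomorphic, so folding two witnesses under a random challenge is easy, but the folded witness's norm grows with the challenge, threatening the low-norm binding guarantee. The sketch below uses toy parameters and shows only the norm blow-up, not LatticeFold's remedy.

```python
# Ajtai commitment com(w) = A.w mod q: homomorphic, but binding only for
# low-norm openings. Naive folding w = w1 + r*w2 preserves the commitment
# relation yet blows up the witness norm. Toy parameters; not the protocol.
import numpy as np

rng = np.random.default_rng(0)
q, n, m, BOUND = 2**31 - 1, 8, 64, 2       # witnesses with entries in [-2, 2]

A = rng.integers(0, q, (n, m))

def com(w):
    return (A @ w) % q                      # additively homomorphic commitment

w1 = rng.integers(-BOUND, BOUND + 1, m)
w2 = rng.integers(-BOUND, BOUND + 1, m)
r = int(rng.integers(1, 2**16))             # folding challenge

folded = w1 + r * w2                        # com(folded) = com(w1) + r*com(w2)
assert np.array_equal((com(w1) + r * com(w2)) % q, com(folded))
print("norm before:", max(abs(w1).max(), abs(w2).max()),
      "| after one fold:", abs(folded).max())   # the blow-up the sumcheck
                                                # technique must prevent
```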
Robust Quantum Public-Key Encryption with Applications to Quantum Key Distribution
Quantum key distribution (QKD) allows Alice and Bob to agree on a shared secret key while communicating over a public (untrusted) quantum channel. Compared to classical key exchange, it has two main advantages: (i) the key is unconditionally hidden from the eyes of any attacker, and (ii) its security assumes only the existence of authenticated classical channels which, in practice, can be realized using Minicrypt assumptions, such as the existence of digital signatures. On the flip side, QKD protocols typically require multiple rounds of interaction, whereas classical key exchange can be realized with the minimal amount of two messages using public-key encryption. A long-standing open question is whether QKD requires more rounds of interaction than classical key exchange.
In this work, we propose a two-message QKD protocol that satisfies everlasting security, assuming only the existence of quantum-secure one-way functions. That is, the shared key is unconditionally hidden, provided computational assumptions hold during the protocol execution. Our result follows from a new construction of quantum public-key encryption (QPKE) whose security, much like its classical counterpart, only relies on authenticated classical channels.
Efficient Layered Circuit for Verification of SHA3 Merkle Tree
We present an efficient layered circuit design for SHA3-256 Merkle tree verification, suitable for a GKR proof system, that achieves logarithmic verification and proof size. We demonstrate how to compute the predicate functions for our circuit in $O(\log n)$ time to ensure logarithmic verification and provide GKR benchmarks for our circuit.
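The statement encoded in the layered circuit is ordinary SHA3-256 Merkle path verification; for reference, here is that computation written directly (a standard reference check, not the circuit encoding itself).

```python
# Reference SHA3-256 Merkle authentication-path check (stdlib only).
from hashlib import sha3_256

def node_hash(left: bytes, right: bytes) -> bytes:
    return sha3_256(left + right).digest()

def verify_path(leaf: bytes, index: int, path: list, root: bytes) -> bool:
    h = sha3_256(leaf).digest()
    for sibling in path:                     # one tree level per path element
        if index & 1:                        # current node is a right child
            h = node_hash(sibling, h)
        else:                                # current node is a left child
            h = node_hash(h, sibling)
        index >>= 1
    return h == root

# Tiny 4-leaf tree example.
leaves = [bytes([i]) for i in range(4)]
lv = [sha3_256(x).digest() for x in leaves]
l01, l23 = node_hash(lv[0], lv[1]), node_hash(lv[2], lv[3])
root = node_hash(l01, l23)
assert verify_path(leaves[2], 2, [lv[3], l01], root)
```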
Less Effort, More Success: Efficient Genetic Algorithm-Based Framework for Side-channel Collision Attacks
By introducing collision information, existing side-channel Correlation-Enhanced Collision Attacks (CECAs) perform collision-chain detection and reduce a given candidate space to a significantly smaller collision-chain space, leading to more efficient key recovery. However, they are still limited by low collision detection speed and a low success rate of key recovery. To address these issues, we first give a Collision Detection framework with Genetic Algorithm (CDGA), which exploits a genetic algorithm to detect collision chains and has a strong global-search capability. Secondly, we theoretically analyze the performance of CECA and bound the searching depth of its output candidate vectors with a confidence level using a rigorous hypothesis test, suitable for both Gaussian and non-Gaussian leakages; this facilitates the initialization of the population. Thirdly, we design an innovative goal-directed mutation method that randomly selects new gene values for replacement, improving the efficiency and adaptability of CDGA. Finally, to optimize the evolution of CDGA, we introduce a roulette-wheel selection strategy that assigns selection probabilities based on individual fitness values, guaranteeing the preferential selection of superior genes; a single-point crossover strategy is also used to introduce novel gene segments into the chromosomes, enhancing the genetic diversity of the population. Experiments verify the superiority of our CDGA.
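For readers unfamiliar with the genetic-algorithm ingredients named above, the skeleton below shows fitness-proportional (roulette-wheel) selection, single-point crossover, and mutation in their standard generic form; the collision-chain encoding and the actual fitness function of CDGA are application-specific and omitted.

```python
# Generic GA skeleton with the operators described above. Fitness values
# must be non-negative for roulette-wheel selection.
import random

def roulette_select(population, fitnesses):
    total = sum(fitnesses)
    pick = random.uniform(0, total)           # probability proportional to fitness
    acc = 0.0
    for individual, f in zip(population, fitnesses):
        acc += f
        if acc >= pick:
            return individual
    return population[-1]

def single_point_crossover(parent_a, parent_b):
    cut = random.randrange(1, len(parent_a))  # splice in a novel gene segment
    return parent_a[:cut] + parent_b[cut:]

def mutate(chromosome, gene_values, rate=0.05):
    return [random.choice(gene_values) if random.random() < rate else g
            for g in chromosome]

def evolve(population, fitness_fn, gene_values, generations=100):
    for _ in range(generations):
        fits = [fitness_fn(c) for c in population]
        population = [mutate(single_point_crossover(
                                 roulette_select(population, fits),
                                 roulette_select(population, fits)),
                             gene_values)
                      for _ in population]
    return max(population, key=fitness_fn)

# Toy usage: evolve 16-bit chromosomes toward all-ones.
pop = [[random.randint(0, 1) for _ in range(16)] for _ in range(30)]
print(evolve(pop, fitness_fn=sum, gene_values=[0, 1]))
```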
Bounded-Collusion Streaming Functional Encryption from Minimal Assumptions
Streaming functional encryption (sFE), recently introduced by Guan, Korb, and Sahai [Crypto 2023], is an extension of functional encryption (FE) tailored for iterative computation on dynamic data streams. Unlike in regular FE, in an sFE scheme, users can encrypt and compute on the data as soon as it becomes available and in time proportional to just the size of the newly arrived data.
As sFE implies regular FE, all known constructions of sFE and FE for $\mathsf{P/Poly}$ require strong cryptographic assumptions which are powerful enough to build indistinguishability obfuscation. In contrast, bounded-collusion FE, in which the adversary is restricted to making at most $Q$ function queries for some polynomial $Q$ determined at setup, can be built from the minimal assumptions of public-key encryption (for public-key FE) [Sahai and Seyalioglu, CCS 2010; Gorbunov, Vaikuntanathan, and Wee, CRYPTO 2012] and secret-key encryption (for secret-key FE) [Ananth and Vaikuntanathan, TCC 2019].
In this paper, we introduce and build bounded-collusion streaming FE for any polynomial bound $Q$ from the same minimal assumptions of public-key encryption (for public-key sFE) and secret-key encryption (for secret-key sFE). As in the original sFE paper of Guan, Korb, and Sahai, our scheme satisfies semi-adaptive-function-selective security, which is similar to standard adaptive indistinguishability-based security except that we require all functions to be queried before any of the challenge messages.
Along the way, our work also replaces a key ingredient (called $\mathsf{One}\text{-}\mathsf{sFE}$) from the original work of Guan, Korb, and Sahai with a much simpler construction based on garbled circuits.
Asymptotics for the standard block size in primal lattice attacks: second order, formally verified
Many proposals of lattice-based cryptosystems estimate security levels by following a recipe introduced in the New Hope proposal. This recipe, given a lattice dimension $n$, modulus $q$, and standard deviation $s$, outputs a ``primal block size'' $\beta$ and a security level growing linearly with $\beta$. This $\beta$ is minimal such that some $\kappa$ satisfies $((n+\kappa)s^2+1)^{1/2} < (d/\beta)^{1/2}\,\delta^{2\beta-d-1}\,q^{\kappa/d}$, where $d = n+\kappa+1$ and $\delta = \big(\beta(\pi\beta)^{1/\beta}/(2\pi e)\big)^{1/(2(\beta-1))}$.
This paper identifies how $\beta$ grows with $n$, with enough precision to show the impact of adjusting $q$ and $s$ by constant factors. Specifically, this paper shows that if $\lg q$ grows as $Q_0 \lg n + Q_1 + o(1)$ and $\lg s$ grows as $S_0 \lg n + S_1 + o(1)$, where $0 \le S_0 \le 1/2 < Q_0 - S_0$, then $\beta/n$ grows as $z_0 + (z_1+o(1))/\lg n$, where $z_0 = 2Q_0/(Q_0-S_0+1/2)^2$ and $z_1$ has a formula given in the paper. The paper provides a traditional-format proof and a proof verified by the HOL Light proof assistant.
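The recipe is mechanical enough to transcribe directly; the sketch below searches for the minimal $\beta$ in log-space to avoid overflow, with illustrative search ranges. It is a reading aid for the formula above, not the paper's verified computation.

```python
# Direct transcription of the New Hope-style recipe for the primal block
# size beta, working with base-2 logarithms throughout.
from math import log2, pi, e

def delta_log2(beta):
    # lg(delta), delta = (beta*(pi*beta)^(1/beta) / (2*pi*e))^(1/(2*(beta-1)))
    return log2(beta * (pi * beta) ** (1 / beta) / (2 * pi * e)) / (2 * (beta - 1))

def attack_succeeds(n, lg_q, lg_s, beta, kappa, dl):
    # lg of: ((n+kappa)s^2+1)^(1/2) < (d/beta)^(1/2) delta^(2b-d-1) q^(kappa/d)
    d = n + kappa + 1
    lhs = 0.5 * log2((n + kappa) * 2 ** (2 * lg_s) + 1)
    rhs = 0.5 * (log2(d) - log2(beta)) + (2 * beta - d - 1) * dl + kappa * lg_q / d
    return lhs < rhs

def primal_block_size(n, lg_q, lg_s, kappa_max=1024):
    for beta in range(50, 2 * n):              # illustrative search ranges
        dl = delta_log2(beta)
        if any(attack_succeeds(n, lg_q, lg_s, beta, kappa, dl)
               for kappa in range(kappa_max)):
            return beta                        # minimal beta with a working kappa
    return None

print(primal_block_size(n=512, lg_q=13, lg_s=1.58))   # toy parameters
```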
More Optimizations to Sum-Check Proving
Many fast SNARKs apply the sum-check protocol to an $n$-variate polynomial of the form $g(x) = \text{eq}(w,x) \cdot p(x)$, where $p$ is a product of multilinear polynomials, $w \in \mathbb{F}^n$ is a random vector, and $\text{eq}$ is the multilinear extension of the equality function.
In this setting, we describe an optimization to the sum-check prover that substantially reduces the cost coming from the $\text{eq}(w, x)$ factor. Our work further improves on a prior optimization by Gruen (ePrint 2023), and in the small-field case, can be combined with additional optimizations by Bagad, Domb, and Thaler (ePrint 2024), and Dao and Thaler (ePrint 2024).
Over large prime-order fields, our optimization eliminates roughly $2^{n + 1}$ field multiplications compared to a standard linear-time implementation of the prover, and roughly $2^{n-1}$ field multiplications when considered on top of Gruen's optimization. These savings are about a 25% (respectively 10%) end-to-end prover speedup in common use cases, and potentially even larger when working over binary tower fields.
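For concreteness, the table of the $\text{eq}(w,\cdot)$ factor can be built with the standard doubling recurrence using roughly $2^{n+1}$ multiplications in total, which is exactly the term this line of work keeps shaving down; a minimal version:

```python
# eq(w, x) = prod_i (w_i x_i + (1 - w_i)(1 - x_i)) is the multilinear
# extension of equality; this builds its full table over {0,1}^n, one
# multiplication per new entry (about 2^(n+1) multiplications overall).
def eq_table(w, field_mod):
    table = [1]                                      # table over {0,1}^0
    for wi in w:                                     # extend one variable at a time
        nxt = []
        for val in table:
            nxt.append(val * (1 - wi) % field_mod)   # x_i = 0 branch
            nxt.append(val * wi % field_mod)         # x_i = 1 branch
        table = nxt
    return table                                     # entry x holds eq(w, bits(x))

p = 2**61 - 1
w = [3, 1415, 926535]                                # sample field elements
tbl = eq_table(w, p)
assert tbl[0b101] == (w[0] * (1 - w[1]) % p) * w[2] % p
```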