Papers updated in last 365 days (Page 24 of 3018 results)
Collaborative CP-NIZKs: Modular, Composable Proofs for Distributed Secrets
Non-interactive zero-knowledge (NIZK) proofs of knowledge have proven to be highly relevant for securely realizing a wide array of applications that rely on both privacy and correctness. They enable a prover to convince any party of the correctness of a public statement for a secret witness. However, most NIZKs do not natively support proving knowledge of a secret witness that is distributed over multiple provers. Previously, collaborative proofs [51] have been proposed to overcome this limitation. We investigate the notion of composability in this setting, following the Commit-and-Prove design of LegoSNARK [17]. Composability allows users to combine different, specialized NIZKs (e.g., one arithmetic circuit, one boolean circuit, and one for range proofs) with the aim of reducing the prove generation time. Moreover, it opens the door to efficient realizations of many applications in the collaborative setting such as mutually exclusive prover groups, combining collaborative and single-party proofs and efficiently implementing publicly auditable MPC (PA-MPC).
We present the first, general definition for collaborative commit-and-prove NIZK (CP-NIZK) proofs of knowledge and construct distributed protocols to enable their realization. We implement our protocols for two commonly used NIZKs, Groth16 and Bulletproofs, and evaluate their practicality in a variety of computational settings. Our findings indicate that composability adds only minor overhead, especially for large circuits. We experimented with our construction in an application setting, and when compared to prior works, our protocols reduce latency by 18–55× while requiring only a fraction (0.2%) of the communication.
OBSCURE: Versatile Software Obfuscation from a Lightweight Secure Element
Software obfuscation is a powerful tool to protect the intellectual property or secret keys inside programs. Strong software obfuscation is crucial in the context of untrusted execution environments (e.g., subject to malware infection) or to face potentially malicious users trying to reverse-engineer a sensitive program. Unfortunately, the state-of-the-art of pure software-based obfuscation (including white-box cryptography) is either insecure or infeasible in practice.
This work introduces OBSCURE, a versatile framework for practical and cryptographically strong software obfuscation relying on a simple stateless secure element (to be embedded, for example, in a protected hardware chip or a token). Based on the foundational result by Goyal et al. from TCC 2010, our scheme enjoys provable security guarantees, and further focuses on practical aspects, such as efficient execution of the obfuscated programs, while maintaining simplicity of the secure element. In particular, we propose a new rectangular universalization technique, which is also of independent interest. We provide an implementation of OBSCURE taking as input a program source code written in a subset of the C programming language. This ensures usability and a broad range of applications of our framework. We benchmark the obfuscation on simple software programs as well as on cryptographic primitives, hence highlighting the possible use cases of the framework as an alternative to pure software-based white-box implementations.
White-box filtering attacks breaking SEL masking: from exponential to polynomial time
This work proposes a new white-box attack technique called filtering, which can be combined with any other trace-based attack method. The idea is to filter the traces based on the value of an intermediate variable in the implementation, aiming to fix a share of a sensitive value and degrade the security of an involved masking scheme.
Coupled with LDA (filtered LDA, FLDA), it leads to an attack defeating the state-of-the-art SEL masking scheme (CHES 2021) of arbitrary degree and number of linear shares with quartic complexity in the window size. In comparison, the current best attacks have exponential complexities in the degree (higher degree decoding analysis, HDDA), in the number of linear shares (higher-order differential computation analysis, HODCA), or the window size (white-box learning parity with noise, WBLPN).
The attack exploits the key idea of the SEL scheme - an efficient parallel combination of the nonlinear and linear masking schemes. We conclude that a proper composition of masking schemes is essential for security.
In addition, we propose several optimizations for linear algebraic attacks: redundant node removal (RNR), optimized parity check matrix usage, and chosen-plaintext filtering (CPF), significantly improving the performance of security evaluation of white-box implementations.
MUSES: Efficient Multi-User Searchable Encrypted Database
Searchable encrypted systems enable privacy-preserving keyword search on encrypted data. Symmetric systems achieve high efficiency (e.g., sublinear search), but they mostly support single-user search. Although systems based on public-key or hybrid models support multi-user search, they incur inherent security weaknesses (e.g., keyword-guessing vulnerabilities) and scalability limitations due to costly public-key operations (e.g., pairing). More importantly, most encrypted search designs leak statistical information (e.g., search, result, and volume patterns) and thus are vulnerable to devastating leakage-abuse attacks. Some pattern-hiding schemes were proposed. However, they incur significant user bandwidth/computation costs, and thus are not desirable for large-scale outsourced databases with resource-constrained users.
In this paper, we propose MUSES, a new multi-writer encrypted search platform that addresses the functionality, security, and performance limitations in the existing encrypted search designs. Specifically, MUSES permits single-reader, multi-writer functionalities with permission revocation and hides all statistical information (including search, result, and volume patterns) while featuring minimal user overhead. In MUSES, we demonstrate a unique incorporation of various emerging distributed cryptographic protocols including Distributed Point Function, Distributed PRF, and Oblivious Linear Group Action. We also introduce novel distributed protocols for oblivious counting and shuffling on arithmetic shares for the general multi-party setting with a dishonest majority, which can be found useful in other applications. Our experimental results showed that the keyword search by MUSES is two orders of magnitude faster with up to 97× lower user bandwidth cost than the state-of-the-art.
MFKDF: Multiple Factors Knocked Down Flat
Nair and Song (USENIX 2023) introduce the concept of a Multi-Factor Key Derivation Function (MFKDF), along with constructions and a security analysis.
MFKDF integrates dynamic authentication factors, such as HOTP and hardware tokens, into password-based key derivation.
The aim is to improve the security of password-derived keys, which can then be used for encryption or as an alternative to multi-factor authentication.
The authors claim an exponential security improvement compared to traditional password-based key derivation functions (PBKDF).
We show that the MFKDF constructions proposed by Nair and Song fall short of the stated security goals.
Underspecified cryptographic primitives and the lack of integrity of the MFKDF state lead to several attacks, ranging from full key recovery when an HOTP factor is compromised, to bypassing factors entirely or severely reducing their entropy.
We reflect on the different threat models of key-derivation and authentication, and conclude that MFKDF is always weaker than plain PBKDF and multi-factor authentication in each setting.
Column-wise Garbling, and How to Go Beyond the Linear Model
In the linear garbling model introduced by Zahur, Rosulek, and Evans (Eurocrypt 2015), garbling an AND gate requires at least \(2\kappa\) bits of ciphertext, where $\kappa$ is the security parameter. Though subsequent works, including those by Rosulek and Roy (Crypto 2021) and Acharya et al. (ACNS 2023), have advanced beyond these linear constraints, a more comprehensive design framework is yet to be developed.
Our work offers a novel, unified, and arguably simple perspective on garbled circuits. We introduce a hierarchy of models that captures all existing practical garbling schemes. By determining the lower bounds for these models, we elucidate the capabilities and limits of each. Notably, our findings suggest that simply integrating a nonlinear processing function or probabilistic considerations does not break the \(2\kappa\) lower bound by Zahur, Rosulek, and Evans. However, by incorporating column correlations, the bound can be reduced to \((1+1/w)\kappa\), where \(w\ge 1\). Additionally, we demonstrate that a straightforward extension of Rosulek and Roy's technique (Crypto 2021) does not yield improved results. We also present a methodology for crafting new models and for exploring further extensions of both the new and the existing models.
Our new models set the course for future designs. We introduce three innovative garbling schemes based on a common principle called ``majority voting.'' The third construction performs on par with the state-of-the-art.
Applying Post-Quantum Cryptography Algorithms to a DLT-Based CBDC Infrastructure: Comparative and Feasibility Analysis
This article presents an innovative project for a Central Bank Digital Currency (CBDC) infrastructure. Focusing on security and reliability, the proposed architecture: (1) employs post-quantum cryptography (PQC) algorithms for long-term security, even against attackers with access to cryptographically-relevant quantum computers; (2) can be integrated with a Trusted Execution Environment (TEE) to safeguard the confidentiality of transaction contents as they are processed by third-parties; and (3) uses Distributed Ledger Technology (DLT) to promote a high level of transparency and tamper resistance for all transactions registered in the system. Besides providing a theoretical discussion on the benefits of this architecture, we experimentally evaluate its components. Namely, as PQC algorithms, we consider three signature schemes being standardized by the National Institute of Standards and Technology (NIST), CRYSTALS-Dilithium, Falcon, and SPHINCS+. Those algorithms are integrated into the Hyperledger Besu (DLT) and executed both inside and outside an Intel SGX TEE environment. According to our results, CRYSTALS-Dilithium-2 combined with classical secp256k1 signatures leads to the shortest execution times when signing blocks in the DLT, reaching 1.68ms without the TEE, and 2.09ms with TEE. The same combination also displays the best results for signature verifications, achieving 0.5ms without a TEE and 1.98ms with a TEE. We also describe the main aspects of the evaluation methodology and the next steps in validating the proposed infrastructure. The conclusions drawn from our experiments is that the combination of PQC and TEE promises highly secure and effective DLT-based CBDC scenarios, ready to face the challenges of the digital financial future and potential quantum threats.
Analysis of One Scheme for User Authentication and Session Key Agreement in Wireless Sensor Network Using Smart Card
We show that the Chunka-Banerjee-Goswami authentication and
key agreement scheme [Wirel. Pers. Commun., 117, 1361-1385, 2021] fails to keep user anonymity, not as claimed. It only keeps pseudonymity. Anonymous actions are designed to be unlinkable to any entity, but pseudonymous actions can be traced back to a certain entity. We also find the scheme is insecure against offline dictionary attack.
A New CRT-based Fully Homomorphic Encryption
We have proposed a novel FHE scheme that uniquely encodes the plaintext with noise in a way that prevents the increasing noise from overflowing and corrupting the plaintext. This allows users to perform computations on encrypted data smoothly. The scheme is constructed using the Chinese Remainder Theorem (CRT), supporting a predefined number of modular operations on encrypted plaintext without the need for bootstrapping.
Although FHE recently became popular after Gentry's work and various developments have occurred in the last decade, the idea of "Fully Homomorphic Encryption (FHE)" scheme was first introduced in the 1970s by Rivest. The Chinese Remainder Theorem is one of the most suitable tools for developing a FHE Scheme because it forms a ring homomorphism \( Z_{p_1} \times Z_{p_2} \times \ldots \times Z_{p_k} \cong Z_{p_1 p_2 \ldots p_k} \).
Various attempts have been made to develop a FHE using CRT, but most of them were unsuccessful, mainly due to the chosen plaintext attack (CPA).
The proposed scheme overcomes the chosen plaintext attack. The scheme also adds random errors to the message during encryption. However, these errors are added in such a way that, when homomorphic operations are performed over encrypted data, the increasing values of errors never overwrite the values of the messages, as happens in LWE-based homomorphic schemes. Therefore, one can perform a predefined number of homomorphic operations (both addition and multiplication) without worrying about the increasing values of errors.
ECO-CRYSTALS: Efficient Cryptography CRYSTALS on Standard RISC-V ISA
The field of post-quantum cryptography (PQC) is continuously evolving. Many researchers are exploring efficient PQC implementation on various platforms, including x86, ARM, FPGA, GPU, etc. In this paper, we present an Efficient CryptOgraphy CRYSTALS (ECO-CRYSTALS) implementation on standard 64-bit RISC-V Instruction Set Architecture (ISA). The target schemes are two winners of the National Institute of Standards and Technology (NIST) PQC competition: CRYSTALS-Kyber and CRYSTALS-Dilithium, where the two most time-consuming operations are Keccak and polynomial multiplication. Notably, this paper is the first to deploy Kyber and Dilithium on the 64-bit RISC-V ISA. Firstly, we propose a better scheduling strategy for Keccak, which is specifically tailored for the 64-bit dual-issue RISC-V architecture. Our 24-round Keccak permutation (Keccak-$p$[1600,24]) achieves a 59.18% speed-up compared to the reference implementation. Secondly, we apply two modular arithmetic (Montgomery arithmetic and Plantard arithmetic) in the polynomial multiplication of Kyber and Dilithium to get a better lazy reduction. Then, we propose a flexible dual-instruction-issue scheme of Number Theoretic Transform (NTT). As for the matrix-vector multiplication, we introduce a row-to-column processing methodology to minimize the expensive memory access operations. Compared to the reference implementation, we obtain a speedup of 53.85%$\thicksim$85.57% for NTT, matrix-vector multiplication, and INTT in our ECO-CRYSTALS. Finally, our ECO-CRYSTALS implementation for key generation, encapsulation, and decapsulation in Kyber achieves 399k, 448k, and 479k cycles respectively, achieving speedups of 60.82%, 63.93%, and 65.56% compared to the NIST reference implementation. Similarly, our ECO-CRYSTALS implementation for key generation, sign, and verify in Dilithium reaches 1,364k, 3,191k, and 1,369k cycles, showcasing speedups of 54.84%, 64.98%, and 57.20%, respectively.
Homomorphic Evaluation of LWR-based PRFs and Application to Transciphering
Certain applications such as FHE transciphering require randomness while operating over encrypted data. This randomness has to be obliviously generated in the encrypted domain and remain encrypted throughout the computation. Moreover, it should be guaranteed that independent-looking random coins can be obliviously generated for different computations.
In this work, we consider the homomorphic evaluation of pseudorandom functions (PRFs) with a focus on practical lattice-based candidates. In the homomorphic PRF evaluation setting, given a fully homomorphic encryption of the PRF secret key $\vec{s}$, it should be possible to homomorphically compute encryptions of PRF evaluations $\{ \text{PRF}_{\vec{s}}(x_i) \}_{i=1}^M$ for public inputs $\{ x_i\}_{i=1}^M$. We consider this problem for PRF families based on the hardness of the Learning-With-Rounding (LWR) problem introduced by Banerjee, Peikert and Rosen (Eurocrypt '12). We build on the random-oracle variant of a PRF construction suggested by Banerjee et al. and demonstrate that it can be evaluated using only two sequential programmable bootstraps in the TFHE homomorphic encryption scheme. We also describe several modifications of this PRF---which we prove as secure as the original function---that support homomorphic evaluations using only one programmable bootstrap per slot.
Numerical experiments were conducted using practically relevant FHE parameter sets from the TFHE-rs library. Our benchmarks show that a throughput of about $1000$ encrypted pseudorandom bits per second (resp. $900$ encrypted pseudorandom bits per second) can be achieved on an AWS hpc7a.96xlarge machine (resp. on a standard laptop with an Apple M2 chip), on a single thread. The PRF evaluation keys in our experiments have sizes roughly $40\%$ and $60\%$ of a bootstrapping key. Applying our solution to transciphering enables important bandwidth savings, typically trading $64$-bit values for $4$-bit values per transmitted ciphertext.
Divide and Surrender: Exploiting Variable Division Instruction Timing in HQC Key Recovery Attacks
We uncover a critical side-channel vulnerability in the Hamming Quasi-Cyclic (HQC) round 4 optimized implementation arising due to the use of the modulo operator. In some cases, compilers optimize uses of the modulo operator with compile-time known divisors into constant-time Barrett reductions. However, this optimization is not guaranteed: for example, when a modulo operation is used in a loop the compiler may emit division (div) instructions which have variable execution time depending on the numerator. When the numerator depends on secret data, this may yield a timing side-channel. We name vulnerabilities of this kind Divide and
Surrender (DaS) vulnerabilities.
For processors supporting Simultaneous Multithreading (SMT) we propose a new approach called DIV-SMT which enables precisely measuring small division timing variations using scheduler and/or execution unit contention. We show that using only 100 such side-channel traces we can build a Plaintext-Checking (PC) oracle with above 90% accuracy. Our approach may also prove applicable to other instances of the DaS vulnerability, such as KyberSlash. We stress that exploitation with DIV-SMT requires co-location of the attacker on the same physical core as the victim.
We then apply our methodology to HQC and present a novel way to recover HQC secret keys faster, achieving an 8-fold decrease in the number of idealized oracle queries when compared to previous approaches. Our new PC oracle attack uses our newly developed Zero Tester method to quickly determine whether an entire block of bits contains only zero-bits. The Zero Tester method enables the DIV-SMT powered attack on HQC-128 to complete in under 2 minutes on our targeted AMD Zen2
machine.
Cryptanalysis of signature schemes based on the root extraction problem over braid group
Cumplido, María et al. have recently shown that the Wang-Hu digital signature is not secure and has presented a potential attack on the root extraction problem. The effectiveness of generic attacks on solving this problem for braids is still uncertain and it is unknown if it is possible to create braids that require exponential time to solve these problems. In 2023, Lin and al. has proposed a post-quantum signature scheme similar to the Wang-Hu scheme that is proven to be able to withstand attacks from quantum computers. However, evidence is presented here for the existence of an algorithm based on mean-set attacks that can recover the private key in both schemes without solving the root extraction problem. In the post-quantum signature version, we prove that the attacker can forge a signature passing the verification without recovering the private key
MATTER: A Wide-Block Tweakable Block Cipher
In this note, we introduce the MATTER Tweakable Block Cipher, designed principally for low latency in low-area hardware implementations, but that can also be implemented in an efficient and compact way in software.
MATTER is a 512-bit wide balanced Feistel network with three to six rounds, using the ASCON permutation as the round function.
The Feistel network defines a keyed, non-tweakable core, which is made tweakable by using the encryption of the tweak as its key.
Key and tweak are 320-bit inputs.
MATTER is particularly suitable for use in an OCB-like mode of operation, with an encrypted checksum for authentication.
Depth-Aware Arithmetization of Common Primitives in Prime Fields
A common misconception is that the computational abilities of circuits composed of additions and multiplications are restricted to simple formulas only. Such arithmetic circuits over finite fields are actually capable of computing any function, including equality checks, comparisons, and other highly non-linear operations. While all those functions are computable, the challenge lies in computing them efficiently. We refer to this search problem as arithmetization. Arithmetization is a key problem in secure computation, as techniques like homomorphic encryption and secret sharing compute arithmetic circuits rather than the high-level programs that programmers are used to. The objective in arithmetization has typically been to minimize the number of multiplications (multiplicative size), as multiplications in most secure computation techniques are significantly more expensive to compute than additions. However, the multiplicative depth of a circuit arguably plays an even more important role in deciding the computational cost: For homomorphic encryption, it strongly affects the choice of cryptographic parameters and the number of bootstrapping operations required, which are orders of magnitude more expensive to compute than multiplications. In fact, if we can limit the multiplicative depth of a circuit such that we do not need to perform any bootstrapping, we can omit the large bootstrapping keys required to perform them all together. We argue that arithmetization should be treated as a multi-objective minimization problem, in which a trade-off can be made between a circuit's multiplicative size and depth. We present efficient depth-aware arithmetization methods for many primitive operations such as exponentiation, univariate functions, equality checks, comparisons, and ANDs and ORs, which take into account that squaring can be cheaper than arbitrary multiplications, and we study how to compose them.
On degrees of carry and Scholz's conjecture
Exploiting the notion of carries, we obtain improved upper bounds for the length of the shortest addition chains $\iota(2^n-1)$ producing $2^n-1$. Most notably, we show that if $2^n-1$ has carries of degree at most $$\kappa(2^n-1)=\frac{1}{2}(\iota(n)-\lfloor \frac{\log n}{\log 2}\rfloor+\sum \limits_{j=1}^{\lfloor \frac{\log n}{\log 2}\rfloor}\{\frac{n}{2^j}\})$$ then the inequality $$\iota(2^n-1)\leq n+1+\sum \limits_{j=1}^{\lfloor \frac{\log n}{\log 2}\rfloor}\bigg(\{\frac{n}{2^j}\}-\xi(n,j)\bigg)+\iota(n)$$ holds for all $n\in \mathbb{N}$ with $n\geq 4$, where $\iota(\cdot)$ denotes the length of the shortest addition chain producing $\cdot$, $\{\cdot\}$ denotes the fractional part of $\cdot$ and where $\xi(n,1):=\{\frac{n}{2}\}$ with $\xi(n,2)=\{\frac{1}{2}\lfloor \frac{n}{2}\rfloor\}$ and so on.
Optimizing Rectangle and Boomerang Attacks: A Unified and Generic Framework for Key Recovery
The rectangle attack has shown to be a very powerful form of cryptanalysis against block ciphers. Given a rectangle distinguisher, one expects to mount key recovery attacks as efficiently as possible. In the literature, there have been four algorithms for rectangle key recovery attacks. However, their performance varies from case to case. Besides, numerous are the applications where the attacks lack optimality. In this paper, we delve into the rectangle key recovery and propose a unified and generic key recovery algorithm, which supports any possible attacking parameters. Not only does it encompass the four existing rectangle key recovery algorithms, but it also reveals five new types of attacks that were previously overlooked. Further, we put forward a counterpart for boomerang key recovery attacks, which supports any possible attacking parameters as well. Along with these new key recovery algorithms, we propose a framework to automatically determine the best parameters for the attack. To demonstrate the efficiency of the new key recovery algorithms, we apply them to \serpent, \aes-192, \craft, \skinny, and \deoxysbc-256 based on existing distinguishers, yielding a series of improved attacks.
Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models
Watermarking generative models consists of planting a statistical signal (watermark) in a model’s output so that it can be later verified that the output was generated by the given model. A strong watermarking scheme satisfies the property that a computationally bounded attacker cannot erase the watermark without causing significant quality degradation. In this paper, we study the (im)possibility of strong watermarking schemes. We prove that, under well-specified and natural assumptions, strong watermarking is impossible to achieve. This holds even in the private detection algorithm setting, where the watermark insertion and detection algorithms share a secret key, unknown to the attacker. To prove this result, we introduce a generic efficient watermark attack; the attacker is not required to know the private key of the scheme or even which scheme is used.
Our attack is based on two assumptions: (1) The attacker has access to a “quality oracle” that can evaluate whether a candidate output is a high-quality response to a prompt, and (2) The attacker has access to a “perturbation oracle” which can modify an output with a nontrivial probability of maintaining quality, and which induces an efficiently mixing random walk on high-quality outputs. We argue that both assumptions can be satisfied in practice by an attacker with weaker computational capabilities than the watermarked model itself, to which the attacker has only black-box access. Furthermore, our assumptions will likely only be easier to satisfy over time as models grow in capabilities and modalities.
We demonstrate the feasibility of our attack by instantiating it to attack three existing watermarking schemes for large language models: Kirchenbauer et al. (2023a), Kuditipudi et al. (2023), and Zhao et al. (2023a), as well as those for vision-language models Fernandez et al. (2023) and Mountain (2021). The same attack successfully removes the watermarks planted by all schemes, with only minor quality degradation.
A Refined Hardness Estimation of LWE in Two-step Mode
Recently, researchers have proposed many LWE estimators, such as lattice-estimator (Albrecht et al, Asiacrypt 2017) and leaky-LWE-Estimator (Dachman-Soled et al, Crypto 2020), while the latter has already been used in estimating the security level of Kyber and Dilithium using only BKZ. However, we prove in this paper that solving LWE by combining a lattice reduction step (by LLL or BKZ) and a
target vector searching step (by enumeration or sieving), which we call a Two-step mode, is more efficient than using only BKZ.
Moreover, we give a refined LWE estimator in Two-step mode by analyzing the relationship between the probability distribution of the target vector and the solving success rate in a Two-step mode LWE solving algorithm. While the latest Two-step estimator for LWE, which is the “primal-bdd” mode in lattice-estimator1, does not take into account some up-to-date results and lacks a thorough theoretical analysis. Under the same gate-count model, our estimation for NIST PQC standards drops by 2.1∼3.4 bits (2.2∼4.6 bits while considering more flexible blocksize and jump strategy) compared with leaky-LWE-Estimator.
Furthermore, we also give a conservative estimation for LWE from the Two-step solving algorithm. Compared with the Core-SVP model, which is used in previous conservative estimations, our estimation relies on weaker assumptions and outputs higher evaluation results than the Core-
SVP model. For NIST PQC standards, our conservative estimation is 4.17∼8.11 bits higher than the Core-SVP estimation. Hence our estimator can give a closer estimation for both upper bound and lower bound of LWE hardness.
Cryptanalysis of the SNOVA signature scheme
SNOVA is a variant of a UOV-type signature scheme over a noncommutative ring. In this article, we demonstrate that certain
parameters provided by authors in SNOVA fail to meet the NIST security level, and the complexities are lower than those claimed by SNOVA.
Correlation Electromagnetic Analysis on an FPGA Implementation of CRYSTALS-Kyber
Post-quantum cryptography represents a category of cryptosystems resistant to quantum algorithms. Recently, NIST launched a process to standardize one or more of such algorithms in the key encapsulation mechanism and signature categories. Such schemes are under the scrutiny of their mathematical security, but they are not side-channel secure at the algorithm level. That is why their side-channel vulnerabilities must be assessed by the research community. In this paper, we present a non-profiled correlation electromagnetic analysis against an FPGA implementation of the chosen NIST key-encapsulation mechanism standard, CRYSTALS-Kyber. The attack correlates an electromagnetic radiation model of the polynomial multiplication execution with the captured traces. With 166,620 traces, this attack correctly recovers 100% of the subkeys. Furthermore, a countermeasure is presented for securing the target implementation against the presented attack.
Hardware Implementation and Security Analysis of Local-Masked NTT for CRYSTALS-Kyber
The rapid evolution of post-quantum cryptography, spurred by standardization efforts such as those led by NIST, has highlighted the prominence of lattice-based cryptography, notably exemplified by CRYSTALS-Kyber. However, concerns persist regarding the security of cryptographic implementations, particularly in the face of Side-Channel Attacks (SCA). The usage of operations like the Number Theoretic
Transform (NTT) in CRYSTALS-Kyber introduces vulnerabilities to SCA, especially single-trace ones, such as soft-analytical side-channel attacks. To address this threat, Ravi et al. proposed local masking as a countermeasure by randomizing the NTT’s twiddle factors, but its implementation and security implications require further investigation. This paper presents a hardware implementation of the NTT with local masking, evaluating its performance, area utilization, and security impacts. Additionally, it analyzes the vulnerabilities inherent in local masking and assesses its practical security effectiveness through non-specific t-tests, showing that there are configurations of local masking that are more prone to leakage than others.
Towards post-quantum secure PAKE - A tight security proof for OCAKE in the BPR model
We revisit OCAKE (ACNS 23), a generic recipe that constructs password-based authenticated key exchange (PAKE) from key encapsulation mechanisms (KEMs), to allow instantiations with post-quantums KEM like KYBER.
The ACNS23 paper left as an open problem to argue security against quantum attackers, with its security proof being in the universal composability (UC) framework. This is common for PAKE, however, at the time of this submission’s writing, it was not known how to prove (computational) UC security against quantum adversaries. Doing this becomes even more involved if the proof uses idealizations like random oracles or ideal ciphers.
To pave the way towards post-quantum security proofs, we therefore resort to a (still classical) game-based security proof in the BPR model (EUROCRYPT 2000). We consider this a crucial stepping stone towards a fully satisfying post-quantum security proof. We also hope that a
game-based proof is easier to (potentially formally) verify.
We prove security of (a minor variation of) OCAKE, assuming the underlying KEM satisfies notions of ciphertext indistinguishability, anonymity, and (computational) public-key uniformity. Using multi-user variants of these properties, we achieve tight security bounds.
We provide a full detailed proof – something often omitted in publications on game-based security of PAKE. As a side-contribution, we demonstrate in detail how to handle password guesses, which is something we were unable to find in the existing literature at the time of
writing.
Finally, we discuss which current PQC KEMs can be plugged into the proposed protocol and provide a concrete instantiation, accompanied by a proof-of-concept implementation and respective run-time benchmarks.
Post-quantum XML and SAML Single Sign-On
Extensible Markup Language (XML) is one of the most popular serialization languages. Since many security protocols are built using XML, it also provides cryptographic functionality. A central framework in this area is the Security Assertion Markup Language (SAML). This standard is one of the most widely used options for implementing Single Sign-On (SSO), which allows users to authenticate to different service providers using the credentials from a single identity provider. Like all other security protocols currently in use, the security and privacy of XML-based frameworks such as SAML is threatened by the development of increasingly powerful quantum computers. In fact, future attackers with access to scalable quantum computers will be able to break the currently used cryptographic building blocks and thus undermine the security of the SAML SSO to illegally access sensitive private information. Post-quantum cryptography algorithms have been developed to protect against such quantum attackers. While many security protocols have been migrated into the quantum age by using post-quantum cryptography, no such solutions for XML and the security protocols based on it have been developed, let alone tested. We make the following contributions to fill this gap. We have designed post-quantum solutions for the cryptographic building blocks in XML and integrated them into the SAML SSO protocol. We implemented our solutions in the OpenSAML, Apache Santuario, and BouncyCastle libraries and extensively tested their performance for various post-quantum instantiations. As a result, we have created a comprehensive and solid foundation for post-quantum XML and post-quantum SAML SSO migration.
SoK: Post-Quantum TLS Handshake
Transport Layer Security (TLS) is the backbone security protocol of the Internet. As this fundamental protocol is at risk from future quantum attackers, many proposals have been made to protect TLS against this threat by implementing post-quantum cryptography (PQC). The widespread interest in post-quantum TLS has given rise to a large number of solutions over the last decade. These proposals differ in many aspects, including the security properties they seek to protect, the efficiency and trustworthiness of their post-quantum building blocks, and the application scenarios they consider, to name a few.
Based on an extensive literature review, we classify existing solutions according to their general approaches, analyze their individual contributions, and present the results of our extensive performance experiments. Based on these insights, we identify the most reasonable candidates for post-quantum TLS, which research problems in this area have already been solved, and which are still open. Overall, our work provides a well-founded reference point for researching post-quantum TLS and preparing TLS in practice for the quantum age.
Towards ML-KEM & ML-DSA on OpenTitan
This paper presents extensions to the OpenTitan hardware root of trust that aim at enabling high-performance lattice-based cryptography. We start by carefully optimizing ML-KEM and ML-DSA - the two primary algorithms selected by NIST for standardization - in software targeting the OTBN accelerator. Based on profiling results of these implementations, we propose tightly integrated extensions to OTBN, specifically an interface from OTBN to OpenTitan's Keccak accelerator (KMAC core) and extensions to the OTBN ISA to support operations on 256-bit vectors. We implement these extensions in hardware and show that we achieve a speedup by a factor between 6 and 9 for different operations and parameter sets of ML-KEM and ML-DSA compared to our baseline implementation on unmodified OTBN. This speedup is achieved with an increase in cell count of less than 12% in OTBN, which corresponds to an increase of less than 2% for the full Earlgrey OpenTitan core.
Efficient Schemes for Committing Authenticated Encryption
This paper provides efficient authenticated-encryption (AE) schemes in which a ciphertext is a commitment to the key. These are extended, at minimal additional cost, to schemes where the ciphertext is a commitment to all encryption inputs, meaning key, nonce, associated data and message. Our primary schemes are modifications of GCM (for basic, unique-nonce AE security) and AES-GCM-SIV (for misuse-resistant AE security) and add both forms of commitment without any increase in ciphertext size. We also give more generic, but somewhat more costly, solutions.
Succinctly-Committing Authenticated Encryption
Recent attacks and applications have led to the need for symmetric encryption schemes that, in addition to providing the usual authenticity and privacy, are also committing. In response, many committing authenticated encryption schemes have been proposed. However, all known schemes, in order to provide s bits of committing security, suffer an expansion---this is the length of the ciphertext minus the length of the plaintext---of 2s bits. This incurs a cost in bandwidth or storage. (We typically want s=128, leading to 256-bit expansion.) However, it has been considered unavoidable due to birthday attacks. We show how to bypass this limitation. We give authenticated encryption (AE) schemes that provide s bits of committing security, yet suffer expansion only around s as long as messages are long enough, namely more than s bits. We call such schemes succinct. We do this via a generic, ciphertext-shortening transform called SC: given an AE scheme with 2s-bit expansion, SC returns an AE scheme with s-bit expansion while preserving committing security. SC is very efficient; an AES-based instantiation has overhead just two AES calls. As a tool, SC uses a collision-resistant invertible PRF called HtM, that we design, and whose analysis is technically difficult. To add the committing security that SC assumes to a base scheme, we also give a transform CTY that improves Chan and Rogaway's CTX. Our results hold in a general framework for authenticated encryption, called AE3, that includes both AE1 (also called AEAD) and AE2 (also called nonce-hiding AE) as special cases, so that we in particular obtain succinctly-committing AE schemes for both these settings.
A note on ``a novel authentication protocol for IoT-enabled devices''
We show that the authentication protocol [IEEE Internet Things J., 2023, 10(1), 867-876] is not correctly specified, because the server cannot complete its computations. To revise, the embedded device needs to compute an extra point multiplication over the underlying elliptic curve. We also find the protocol cannot provide anonymity, not as claimed. It can only provide pseudonymity.
Efficient Two-Party Secure Aggregation via Incremental Distributed Point Function
Computing the maximum from a list of secret inputs is a widely-used functionality that is employed ei- ther indirectly as a building block in secure computation frameworks, such as ABY (NDSS’15) or directly used in multiple applications that solve optimisation problems, such as secure machine learning or secure aggregation statistics. Incremental distributed point function (I-DPF) is a powerful primitive (IEEE S&P’21) that significantly reduces the client- to-server communication and are employed to efficiently and securely compute aggregation statistics.
In this paper, we investigate whether I-DPF can be used to improve the efficiency of secure two-party computation (2PC) with an emphasis on computing the maximum value and the k-th (with k unknown to the computing parties) ranked value from a list of secret inputs. Our answer is affir- mative, and we propose novel secure 2PC protocols that use I-DPF as a building block, resulting in significant efficiency gains compared to the state-of-the-art. More precisely, our contributions are: (i) We present two new secure computa- tion frameworks that efficiently compute secure aggregation statistics bit-wisely or batch-wisely; (ii) we propose novel protocols to compute the maximum value, the k-th ranked value from a list of secret inputs; (iii) we provide variations of the proposed protocols that can perform batch computations and thus provide further efficiency improvements; and (iv) we provide an extensive performance evaluation for all proposed protocols.
Our protocols have a communication complexity that is independent of the number of secret inputs and linear to the length of the secret input domain. Our experimental re- sults show enhanced efficiency over state-of-the-art solutions, particularly notable when handling large-scale inputs. For instance, in scenarios involving an input set of five million elements with an input domain size of 31 bits, our protocol ΠMax achieves an 18% reduction in online execution time and a 67% decrease in communication volume compared to the most efficient existing solution.
Lightweight Dynamic Linear Components for Symmetric Cryptography
In this paper, using the concept of equivalence of mappings we characterize all of the one-XOR matrices which are used in hardware applications and propose a family of lightweight linear mappings for software-oriented applications in symmetric cryptography. Then, we investigate interleaved linear mappings and based upon this study, we present generalized dynamic primitive LFSRs along with dynamic linear components for construction of diffusion layers.
From the mathematical viewpoint, this paper presents involutive sparse binary matrices as well as sparse binary matrices with sparse inverses. Another interesting result of our investigation is that, by our characterization of one-XOR matrices, the search space for finding a $k$ such that $x^n+x^k+1$ is a primitive trinomial could be reduced.
On the Impossibility of Algebraic NIZK In Pairing-Free Groups
Non-Interactive Zero-Knowledge proofs (NIZK) allow a prover to convince a verifier that a statement is true by sending only one message and without conveying any other information.
In the CRS model, many instantiations have been proposed from group-theoretic assumptions.
On the one hand, some of these constructions use the group structure in a black-box way but rely on pairings, an example being the celebrated Groth-Sahai proof system.
On the other hand, a recent line of research realized NIZKs from sub-exponential DDH in pairing-free groups using Correlation Intractable Hash functions, but at the price of making non black-box usage of the group.
As of today no construction is known to simultaneously reduce its security to pairing-free group problems and to use the underlying group in a black-box way.
This is indeed not a coincidence:
in this paper, we prove that for a large class of NIZK either a pairing-free group is used non black-box by relying on element representation, or security reduces to external hardness assumptions.
More specifically our impossibility applies to two incomparable cases.
The first one covers Arguments of Knowledge (AoK) which proves that a preimage under a given one way function is known.
The second one covers NIZK (not necessarily AoK) for hard subset problems, which captures relations such as DDH, Decision-Linear and Matrix-DDH.
On the Impossibility of Algebraic Vector Commitments in Pairing-Free Groups
Vector Commitments allow one to (concisely) commit to a vector of messages so that one can later (concisely) open the commitment at selected locations. In the state of the art of vector commitments, algebraic constructions have emerged as a particularly useful class, as they enable advanced properties, such as stateless updates, subvector openings and aggregation, that are for example unknown in Merkle-tree-based schemes. In spite of their popularity, algebraic vector commitments remain poorly understood objects. In particular, no construction in standard prime order groups (without pairing) is known.
In this paper, we shed light on this state of affairs by showing that a large class of concise algebraic vector commitments in pairing-free, prime order groups are impossible to realize.
Our results also preclude any cryptographic primitive that implies the algebraic vector commitments we rule out, as special cases.
This means that we also show the impossibility, for instance, of succinct polynomial commitments and functional commitments (for all classes of functions including linear forms) in pairing-free groups of prime order.
STORM — Small Table Oriented Redundancy-based SCA Mitigation for AES
Side-channel-analysis (SCA) resistance with cost optimization in AES hardware implementations remains a significant challenge. While traditional masking-based schemes offer provable security, they often incur substantial resource overheads (latency, area, randomness, performance, power consumption). Alternatively, the RAMBAM scheme introduced a redundancy-based approach to control the signal-to-noise ratio, and achieves exponential leakage reduction as redundancy increases. This method results in only a slight increase in area and in power consumption, and a significant decrease in the amount of randomness needed, without any increase in latency. However, it lacks a formal security proof.
In this study, we introduce a scheme, denoted STORM, that synergizes RAMBAM's methodology with the utilization of look-up-tables (LUTs) in memory (ROM/RAM) in a redundant domain. STORM, like RAMBAM, is as fast as a typical unprotected implementation and has the same latency, but has a significantly higher maximal clock frequency than RAMBAM, and consumes less than half the power. RAMBAM and STORM are code-based schemes in the sense that their set of representations is a code in the vector space $GF(2)^{8+d}$. RAMBAM requires a richer structure of a ring on $GF(2)^{8+d}$ and a ring homomorphism whereas STORM utilizes a simple vector space. In code-based-masking (CBM), as in all masking schemes, non-interference based notions (t-S/NI) are fundamental for establishing provable security. RAMBAM and STORM diverge from this approach. While CBM employs codes in vector spaces over $GF(2^8)$ for AES protection, RAMBAM and STORM use codes over $GF(2)$ without the need for t-S/NI-gadgets, leaving them both smaller and more efficient.
Independence in security proofs typically means that in each individual computation (in a clock-cycle), at least one share does not participate. This approach does not work for RAMBAM where several field multiplications are executed sequentially in a cycle. However, in STORM no multiplications are performed due to its memory based tables, leaving only (independent) bitwise-XORs. Therefore, the reasoning necessary for proving security is different and STORM, unlike RAMBAM, enjoys provable security. We consider two distinct scenarios, \emph{both with provable security}: (1) STORM1 --- ``leakage-free’’ memory reads, demonstrating (1,1,0)-robustness for LUTs with redundancy 2 in the 1-probe model and for LUTs with redundancy 6 in the 2-probe model, and (2) STORM2 --- leaky memory reads, where additional protection mechanisms and a notion of memory-read robustness are introduced.
STORM can be implemented not only in HW, but in SW as well. However, this paper and the proofs in it relate to STORM's HW implementations.
Erebor and Durian: Full Anonymous Ring Signatures from Quaternions and Isogenies
We construct two efficient post-quantum ring signatures with anonymity against full key exposure from isogenies, addressing limitations of existing isogeny-based ring signatures.
First, we present an efficient concrete distinguisher for the SQIsign simulator when the signing key is provided using one transcript. This shows that turning SQIsign into an efficient full anonymous ring signature requires some new ideas.
Second, we propose a variant of SQIsign that is resistant to the distinguisher attack with only a $\times 1.33$ increase in size and we render it to a ring signature, that we refer as $\mathsf{Erebor}$. This variant introduces a new zero-knowledge assumption that ensures full anonymity. The efficiency of $\mathsf{Erebor}$ remains comparable to that of SQIsign, with only a proportional increase due to the ring size. This results in a signature size of $0.68 \mathsf{KB}$ for 4 users and $1.35 \mathsf{KB}$ for 8 users, making it the most compact post-quantum ring signature for up to 31 users.
Third, we revisit the GPS signature scheme (Asiacrypt'17), developing efficient subroutines to make the scheme more efficient and significantly reduce the resulting signature size. By integrating our scheme with the paradigm by Beullens, Katsumata, and Pintore (Asiacrypt'20), we achieve an efficient logarithmic ring signature, that we call $\mathsf{Durian}$, resulting in a signature size of $9.87 \mathsf{KB}$ for a ring of size 1024.
Sloth: Key Stretching and Deniable Encryption using Secure Elements on Smartphones
Privacy enhancing technologies must not only protect sensitive data in-transit, but also locally at-rest. For example, anonymity networks hide the sender and/or recipient of a message from network adversaries. However, if a participating device is physically captured, its owner can be pressured to give access to the stored conversations. Therefore, client software should allow the user to plausibly deny the existence of meaningful data. Since biometrics can be collected without consent and server-based authentication leaks metadata, implementations typically rely on memorable passwords for local authentication.
Traditional password-based key stretching lacks a strict time guarantee due to the ease of parallelized password guessing by attackers. This paper introduces Sloth, a key stretching method leveraging the Secure Element (SE) commonly found in modern smartphones to provide a strict rate limit on password guessing. While this would be straightforward with full access to the SE, Android and iOS only provide a very limited API. Sloth utilizes the existing developer SE API and novel cryptographic constructions to build an effective rate-limit for password guessing on recent Android and iOS devices. Our approach ensures robust security even for short, randomly-generated, six-character alpha-numeric passwords against adversaries with virtually unlimited computing resources. Our solution is compatible with approximately 96% of iPhones and 45% of Android phones and Sloth seamlessly integrates without device or OS modifications, making it immediately usable by app developers today. We formally define the security of Sloth and evaluate its performance on various devices.
Finally, we present HiddenSloth, a plausibly-deniable encryption scheme leveraging Sloth. It provides multi-snapshot resistance against adversaries who can covertly capture its on-disk content multiple times.
Sanitizable and Accountable Endorsement for Dynamic Transactions in Fabric
Hyperledger Fabric, an open-source, enterprise-grade consortium platform, employs an endorsement policy wherein a set of endorsers signs transaction proposals from clients to confirm their authenticity. The signatures from endorsers constitute the core component of endorsement. However, when dealing with dynamic transactions with high timeliness and frequent updates (e.g., stock trading, real-time ad delivery, news reporting, etc.), the current endorsement process somewhat slows down the transaction execution. Meanwhile, handling these continuously updated transactions consumes significant resources from endorsers, thereby constraining overall application efficiency.
To address these issues, this paper devises a novel sanitizable and accountable endorsement scheme by proposing a sanitizable multi-signature (SMS) as the theoretical tool. Specifically, we introduce the novel concept of sanitizable multi-signature and detail its instantiation. SMS combines the advantages of multi-signature and sanitizable signature, maintaining the compactness of the signature while allowing the sanitizer to adjust the initial endorsement result to fit the updated transaction content without interacting with the endorsers, so that both the authenticity and timeliness of transactions can be ensured. Additionally, SMS incorporates an innovative accountability mechanism to trace instances of improper data updates, thereby enhancing the security and reliability of the endorsement process.
We demonstrate the security of the proposed scheme through rigorous security analysis. Performance evaluations show that SMS can significantly reduce verification overhead and transaction size compared to the default ECDSA scheme in Fabric. Specifically, when verifying multiple endorsers' endorsements, our scheme exhibits a storage space reduction by approximately 30%-40% and a verification time reduction ranging from 9.2% to nearly 26.3%.
Protecting Quantum Procrastinators with Signature Lifting: A Case Study in Cryptocurrencies
Current solutions to quantum vulnerabilities of widely used cryptographic schemes involve migrating users to post-quantum schemes before quantum attacks become feasible. This work deals with protecting quantum procrastinators: users that failed to migrate to post-quantum cryptography in time.
To address this problem in the context of digital signatures, we introduce a technique called signature lifting, that allows us to lift a deployed pre-quantum signature scheme satisfying a certain property to a post-quantum signature scheme that uses the same keys. Informally, the said property is that a post-quantum one-way function is used "somewhere along the way" to derive the public-key from the secret-key. Our constructions of signature lifting relies heavily on the post-quantum digital signature scheme Picnic (Chase et al., CCS'17).
Our main case-study is cryptocurrencies, where this property holds in two scenarios: when the public-key is generated via a key-derivation function or when the public-key hash is posted instead of the public-key itself. We propose a modification, based on signature lifting, that can be applied in many cryptocurrencies for securely spending pre-quantum coins in presence of quantum adversaries. Our construction improves upon existing constructions in two major ways: it is not limited to pre-quantum coins whose ECDSA public-key has been kept secret (and in particular, it handles all coins that are stored in addresses generated by HD wallets), and it does not require access to post-quantum coins or using side payments to pay for posting the transaction.
More Efficient Zero-Knowledge Protocols over $\mathbb{Z}_{2^k}$ via Galois Rings
A recent line of works on zero-knowledge (ZK) protocols with a vector oblivious linear function evaluation (VOLE)-based offline phase provides a new paradigm for scalable ZK protocols featuring fast proving and small prover memory.
Very recently, Baum et al. (Crypto'23) proposed the VOLE-in-the-head technique, allowing such protocols to become publicly verifiable. Many practically efficient protocols for proving circuit satisfiability over any Galois field are implemented, while protocols over rings $\mathbb{Z}_{2^k}$ are significantly lagging behind, with only a proof-of-concept pioneering work called Appenzeller to Brie (CCS'21) and a first proposal called Moz$\mathbb{Z}_{2^k}$arella (Crypto'22). The ring $\mathbb{Z}_{2^{32}}$ or $\mathbb{Z}_{2^{64}}$, though highly important (it captures computation in real-life programming and the computer architectures such as CPU words), presents non-trivial difficulties because, for example, unlike Galois fields $\mathbb{F}_{2^{k}}$, the fraction of units in $\mathbb{Z}_{2^{k}}$ is $1/2$.
In this work, we first construct ZK protocols over a high degree Galois ring extension of $\mathbb{Z}_{2^{k}}$ (fraction of units close to $1$) and then convert them to $\mathbb{Z}_{2^k}$ efficiently using amortization techniques. Our results greatly change the landscape of ZK protocols over~$\mathbb{Z}_{2^k}$.
(1) We propose a competing ZK protocol that has many advantages over the state-of-the-art Moz$\mathbb{Z}_{2^k}$arella. We remove the undesirable dependence of communication complexity on the security parameter, and achieve communication complexity {\em strictly} linear in the circuit size. Furthermore, our protocol has better concrete efficiency. For $40,80$ bits soundness on circuits over $\mathbb{Z}_{2^{32}}$ and $\mathbb{Z}_{2^{64}}$, we offer $1.15\times$--$2.9\times$ improvements in communication.
(2) Inspired by the recently proposed interactive message authentication code technique (Weng et al., CCS'22), we construct a constant round ZK protocol over $\mathbb{Z}_{2^k}$ with sublinear (in the circuit size) communication complexity, which was previously achieved only over fields.
(3) We show that the pseudorandom correlation generator approach can be adapted to efficiently implement VOLE over Galois rings, with analysis of the hardness of underlying LPN assumptions over Galois rings.
(4) We adapt the VOLE-in-the-head technique to make it work for $\mathbb{Z}_{2^k}$, yielding {\em publicly verifiable} non-interactive ZK protocols over $\mathbb{Z}_{2^k}$ which preserve most of the efficiency metrics of the VOLE-based ZK protocols.
QuickPool: Privacy-Preserving Ride-Sharing Service
Online ride-sharing services (RSS) have become very popular owing to increased awareness of environmental concerns and as a response to increased traffic congestion. To request a ride, users submit their locations and route information for ride matching to a service provider (SP), leading to possible privacy concerns caused by leakage of users' location data. We propose QuickPool, an efficient SP-aided RSS solution that can obliviously match multiple riders and drivers simultaneously, without involving any other auxiliary server. End-users, namely, riders and drivers share their route information with SP as encryptions of the ordered set of points-of-interest (PoI) of their route from their start to end locations. SP performs a zone based oblivious matching of drivers and riders, based on partial route overlap as well as proximity of start and end points. QuickPool is in the semi-honest setting, and makes use of secure multi-party computation. We provide security proof of our protocol, perform extensive testing of our implementation and show that our protocol simultaneously matches multiple drivers and riders very efficiently. We compare the performance of QuickPool with state-of-the-art works and observe a run time improvement of 1.6 - 2$\times$, and communication improvement of at least 8$\times$.
Linea Prover Documentation
Rollup technology today promises long-term solutions to the scalability of the blockchain. Among a thriving ecosystem, Consensys has launched the Linea zkEVM Rollup network for Ethereum.
At a high level, the Ethereum blockchain can be seen as a state machine and its state transition can be arithmetized carefully. Linea's prover protocol uses this arithmetization, along with transactions on layer two in order to compute a cryptographic proof that the state transition is performed correctly.
The proof is then sent over to the Ethereum layer, where the smart contract (verifier contract) on Ethereum checks the proof and accepts the state transition if the proof is valid. The interaction between layer two and Ethereum is costly, which imposes substantial limitations on the proof size. Therefore, Linea's prover aims to compress the proof via cryptographic tools such as list polynomial commitments (LPCs), polynomial interactive oracle proofs (PIOPs), and Succinct Non-Interactive Arguments of Knowledge (SNARKs).
We introduce Wizard-IOP, a cryptographic tool for handling a wide class of queries (such as range checks, scalar products, permutations checks, etc.) needed to ensure the correctness of the executions of the state machines efficiently and conveniently. Another cryptographic tool is the Arcane compiler, which outputs standard PIOPs and is employed by Wizard-IOP to make different queries homogeneous. After applying Arcane, all the queries constitute evaluation queries over the polynomials. We then apply the Unique Evaluation compiler (UniEval), which receives the output of the Arcane and provides us with a PIOP that requires only a single evaluation check.
At this point, we employ Vortex, a list polynomial commitment (LPC) scheme to convert the resulting PIOP into an argument of knowledge. The argument of knowledge is then made succinct by applying different techniques such as self-recursion, standard recursion, and proof aggregations.
Updatable Private Set Intersection from Structured Encryption
Many efficient custom protocols have been developed for two-party private set intersection (PSI), that allow the parties to learn the intersection of their private sets. However, these approaches do not yield efficient solutions in the dynamic setting when the parties’ sets evolve and the intersection has to be computed repeatedly. In this work we propose a new framework for this problem of updatable PSI — with elements being inserted and deleted — in the semihonest model based on structured encryption. The framework reduces the problem of updatable PSI to a new variant of structured encryption (StE) for an updatable set datatype, which may be of independent interest. Our final construction is a constant round protocol with worst-case communication and computation complexity that grows linearly in the size of the updates and only poly-logarithmically with the size of the accumulated sets. Our protocol is the first to support arbitrary inserts and deletes for updatable PSI.
Hyperion: Transparent End-to-End Verifiable Voting with Coercion Mitigation
We present Hyperion, an end-to-end verifiable e-voting scheme that allows the voters to identify their votes in cleartext in the final tally. In contrast to schemes like Selene or sElect, identification is not via (private) tracker numbers but via cryptographic commitment terms. After publishing the tally, the Election Authority provides each voter with an individual dual key. Voters identify their votes by raising their dual key to their secret trapdoor key and finding the matching commitment term in the tally.
The dual keys are self-certifying in that, without the voter's trapdoor key, it is intractable to forge a dual key that, when raised to the trapdoor key, will match an alternative commitment. On the other hand, a voter can use their own trapdoor key to forge a dual key to fool any would-be coercer.
Additionally, we propose a variant of Hyperion that counters the tracker collision threat present in Selene. We introduce individual verifiable views: each voter gets their own independently shuffled view of the master Bulletin Board.
We provide new improved definitions of privacy and verifiability for e-voting schemes and prove the scheme secure against these, as well as proving security with respect to earlier definitions in the literature.
Finally, we provide a prototype implementation and provide measurements which demonstrate that our scheme is practical for large scale elections.
Swoosh: Efficient Lattice-Based Non-Interactive Key Exchange
The advent of quantum computers has sparked significant interest in post-quantum cryptographic schemes, as a replacement for currently used cryptographic primitives. In this context, lattice-based cryptography has emerged as the leading paradigm to build post-quantum cryptography. However, all existing viable replacements of the classical Diffie-Hellman key exchange require additional rounds of interactions, thus failing to achieve all the benefits of this protocol. Although earlier work has shown that lattice-based Non-Interactive Key Exchange (NIKE) is theoretically possible, it has been considered too inefficient for real-life applications.
In this work, we challenge this folklore belief and provide the first evidence against it. We construct an efficient lattice-based NIKE whose security is based on the standard module learning with errors (M-LWE) problem in the quantum random oracle model.
Our scheme is obtained in two steps:
(i) A passively-secure construction that achieves a strong notion of correctness, coupled with
(ii) a generic compiler that turns any such scheme into an actively-secure one.
To substantiate our efficiency claim, we provide an optimised implementation of our passively-secure construction in Rust and Jasmin. Our implementation demonstrates the scheme's applicability to real-world scenarios, yielding public keys of approximately 220 KBs. Moreover, the computation of shared keys takes fewer than 12 million cycles on an Intel Skylake CPU, offering a post-quantum security level exceeding 120 bits.
Interactive Authentication
Authentication is the first, crucial step in securing digital assets like cryptocurrencies and online services like banking.
It relies on principals maintaining exclusive access to credentials like cryptographic signing keys, passwords, and physical devices.
But both individuals and organizations struggle to manage their credentials, resulting in loss of assets and identity theft.
In this work, we study mechanisms with back-and-forth interaction with the principals.
For example, a user receives an email notification about sending money from her bank account and is given a period of time to abort.
We define the authentication problem, where a mechanism interacts with a user and an attacker.
A mechanism's success depends on the scenario, namely, which credentials each principal knows.
The profile of a mechanism is the set of scenarios in which it succeeds.
The subset relation on profiles defines a partial order on mechanisms.
We bound the profile size and discover three types of novel mechanisms that are maximally secure.
We show the efficacy of our model by analyzing existing mechanisms and make concrete improvement proposals:
Using sticky messages for security notifications, prioritizing credentials when accessing one's bank account, and using one of our maximal mechanisms to improve a popular cryptocurrency wallet.
We demonstrate the practicality of our mechanisms by implementing the latter.
Formal Verification of Emulated Floating-Point Arithmetic in Falcon
We show that there is a discrepancy between the emulated floating-point multiplication in the submission package of the digital signature Falcon and the claimed behavior. In particular, we show that some floating-point products with absolute values the smallest normal positive floating-point number are incorrectly zeroized. However, we show that the discrepancy doesn’t affect the complex fast Fourier transform in the signature generation of Falcon by modeling the floating-point addition, subtraction, and multiplication in CryptoLine. We later implement our own floating-point multiplications in Armv7-M assembly and Jasmin and prove their equivalence with our model, demonstrating the possibility of transferring the challenging verification task (verifying highly-optimized assembly) to the presumably more readable code base (Jasmin).
Fast computation of 2-isogenies in dimension 4 and cryptographic applications
Dimension 4 isogenies have first been introduced in cryptography for the cryptanalysis of Supersingular Isogeny Diffie-Hellman (SIDH) and have been used constructively in several schemes, including SQIsignHD, a derivative of SQIsign isogeny based signature scheme. Unlike in dimensions 2 and 3, we can no longer rely on the Jacobian model and its derivatives to compute isogenies. In dimension 4 (and higher), we can only use theta-models. Previous works by Romain Cosset, David Lubicz and Damien Robert have focused on the computation of $\ell$-isogenies in theta-models of level $n$ coprime to $\ell$ (which requires to use $n^g$ coordinates in dimension $g$). For cryptographic applications, we need to compute chains of $2$-isogenies, requiring to use $\geq 3^g$ coordinates in dimension $g$ with state of the art algorithms.
In this paper, we present algorithms to compute chains of $2$-isogenies between abelian varieties of dimension $g\geq 1$ with theta-coordinates of level $n=2$, generalizing a previous work by Pierrick Dartois, Luciano Maino, Giacomo Pope and Damien Robert in dimension $g=2$. We propose an implementation of these algorithms in dimension $g=4$ to compute endomorphisms of elliptic curve products derived from Kani's lemma with applications to SQIsignHD and SIDH cryptanalysis. We are now able to run a complete key recovery attack on SIDH when the endomorphism ring of the starting curve is unknown within a few seconds on a laptop for all NIST SIKE parameters.
Inner Product Ring LWE Problem, Reduction, New Trapdoor Algorithm for Inner Product Ring LWE Problem and Ring SIS Problem
Lattice cryptography is currently a major research focus in public-key encryption, renowned for its ability to resist quantum attacks. The introduction of ideal lattices (ring lattices) has elevated the theoretical framework of lattice cryptography. Ideal lattice cryptography, compared to classical lattice cryptography, achieves more acceptable operational efficiency through fast Fourier transforms. However, to date, issues of impracticality or insecurity persist in ideal lattice problems. In order to provide a reasonable and secure trapdoor algorithm, this paper introduces the concept of "Inner Product Ring LWE" and establishes its quantum resistance and indistinguishability using knowledge of time complexity, fixed-point theory, and statistical distances. Inner product Ring LWE is easier to construct trapdoor algorithms compared to Ring LWE. Additionally, leveraging the properties of NTRU, we propose a more secure Ring SIS trapdoor algorithm.
Zero-Knowledge Proofs of Training for Deep Neural Networks
A zero-knowledge proof of training (zkPoT) enables a party to prove that they have correctly trained a committed model based on a committed dataset without revealing any additional information about the model or the dataset. An ideal zkPoT should offer provable security and privacy guarantees, succinct proof size and verifier runtime, and practical prover efficiency. In this work, we present \name, a zkPoT targeted for deep neural networks (DNNs) that achieves all these goals at once. Our construction enables a prover to iteratively train their model via (mini-batch) gradient descent, where the number of iterations need not be fixed in advance; at the end of each iteration, the prover generates a commitment to the trained model parameters attached with a succinct zkPoT, attesting to the correctness of the executed iterations. The proof size and verifier time are independent of the number of iterations.
Our construction relies on two building blocks. First, we propose an optimized GKR-style (sumcheck-based) proof system for the gradient-descent algorithm with concretely efficient prover cost; this allows the prover to generate a proof for each iteration. We then show how to recursively compose these proofs across multiple iterations to attain succinctness. As of independent interest, we propose a generic framework for efficient recursive composition of GKR-style proofs, along with aggregatable polynomial commitments.
Benchmarks indicate that \name\ can handle the training of complex models such as VGG-11 with 10~million parameters and batch size~$16$. The prover runtime is $15$~minutes per iteration, which is $\mathbf{24 \times}$ faster than generic recursive proofs, with prover memory overhead $\mathbf{27\times}$ lower. The proof size is $1.63$~megabytes, and the verifier runtime is only $130$~milliseconds, where both are independent of the number of iterations and the size of the dataset.
Towards Quantum-Safe Blockchain: Exploration of PQC and Public-key Recovery on Embedded Systems
Blockchain technology ensures accountability,
transparency, and redundancy in critical applications, includ-
ing IoT with embedded systems. However, the reliance on
public-key cryptography (PKC) makes blockchain vulnerable to
quantum computing threats. This paper addresses the urgent
need for quantum-safe blockchain solutions by integrating Post-
Quantum Cryptography (PQC) into blockchain frameworks.
Utilizing algorithms from the NIST PQC standardization pro-
cess, we aim to fortify blockchain security and resilience, partic-
ularly for IoT and embedded systems. Despite the importance
of PQC, its implementation in blockchain systems tailored for
embedded environments remains underexplored. We propose
a quantum-secure blockchain architecture, evaluating various
PQC primitives and optimizing transaction sizes through tech-
niques such as public-key recovery for Falcon, achieving up
to 17% reduction in transaction size. Our analysis identifies
Falcon-512 as the most suitable algorithm for quantum-secure
blockchains in embedded environments, with XMSS as a viable
stateful alternative. However, for embedded devices, Dilithium
demonstrates a higher transactions-per-second (TPS) rate
compared to Falcon, primarily due to Falcon’s slower sign-
ing performance on ARM CPUs. This highlights the signing
time as a critical limiting factor in the integration of PQC
within embedded blockchains. Additionally, we integrate smart
contract functionality into the quantum-secure blockchain,
assessing the impact of PQC on smart contract authentication.
Our findings demonstrate the feasibility and practicality of
deploying quantum-secure blockchain solutions in embedded
systems, paving the way for robust and future-proof IoT
applications.
Cryptanalysis of two post-quantum authenticated key agreement protocols
As the use of the internet and digital devices has grown rapidly, keeping digital communications secure has become very important. Authenticated Key Agreement (AKA) protocols play a vital role in securing digital communications. These protocols enable the communicating parties to mutually authenticate and securely establish a shared secret key. The emergence of quantum computers makes many existing AKA protocols vulnerable to their immense computational power. Consequently, designing new protocols that are resistant to quantum attacks has become essential. Extensive research in this area had led to the design of several post-quantum AKA schemes.
In this paper, we analyze two post-quantum AKA schemes proposed by Dharminder et al. [2022] and Pursharthi and Mishra. [2024] and demonstrate that these schemes are not secure against active adversaries. An adversary can impersonate an authorized user to the server. We then propose reliable solutions to prevent these attacks.
AVeCQ: Anonymous Verifiable Crowdsourcing with Worker Qualities
In crowdsourcing systems, requesters publish tasks, and interested workers provide answers to get rewards. Worker anonymity motivates participation since it protects their privacy. Anonymity with unlinkability is an enhanced version of anonymity because it makes it impossible to ``link'' workers across the tasks they participate in. Another core feature of crowdsourcing systems is worker quality which expresses a worker's trustworthiness and quantifies their historical performance. In this work, we present AVeCQ, the first crowdsourcing system that reconciles these properties, achieving enhanced anonymity and verifiable worker quality updates. AVeCQ relies on a suite of cryptographic tools, such as zero-knowledge proofs, to (i) guarantee workers' privacy, (ii) prove the correctness of worker quality scores and task answers, and (iii) commensurate payments. AVeCQ is developed modularly, where requesters and workers communicate over a platform that supports pseudonymity, information logging, and payments. To compare AVeCQ with the state-of-the-art, we prototype it over Ethereum. AVeCQ outperforms the state-of-the-art in three popular crowdsourcing tasks (image annotation, average review, and Gallup polls). E.g., for an Average Review task with 5 choices and 128 workers AVeCQ is 40% faster (including computing and verifying necessary proofs, and blockchain transaction processing overheads) with the task's requester consuming 87% fewer gas.
Grafted Trees Bear Better Fruit: An Improved Multiple-Valued Plaintext-Checking Side-Channel Attack against Kyber
As a prominent category of side-channel attacks (SCAs), plaintext-checking (PC) oracle-based SCAs offer the advantages of generality and operational simplicity on a targeted device. At TCHES 2023, Rajendran et al. and Tanaka et al. independently proposed the multiple-valued (MV) PC oracle, significantly reducing the required number of queries (a.k.a., traces) in the PC oracle. However, in practice, when dealing with environmental noise or inaccuracies in the waveform classifier, they still rely on majority voting or the other technique that usually results in three times the number of queries compared to the ideal case.
In this paper, we propose an improved method to further reduce the number of queries of the MV-PC oracle, particularly in scenarios where the oracle is imperfect. Compared to the state-of-the-art at TCHES 2023, our proposed method reduces the number of queries for a full key recovery by more than $42.5\%$. The method involves three rounds. Our key observation is that coefficients recovered in the first round can be regarded as prior information to significantly aid in retrieving coefficients in the second round. This improvement is achieved through a newly designed grafted tree. Notably, the proposed method is generic and can be applied to both the NIST key encapsulation mechanism (KEM) standard Kyber and other significant candidates, such as Saber and Frodo. We have conducted extensive software simulations against Kyber-512, Kyber-768, Kyber-1024, FireSaber, and Frodo-1344 to validate the efficiency of the proposed method. An electromagnetic attack conducted on real-world implementations, using an STM32F407G board equipped with an ARM Cortex-M4 microcontroller and Kyber implementation from the public library \textit{pqm4}, aligns well with our simulations.
Key-and-Signature Compact Multi-Signatures for Blockchain: A Compiler with Realizations
Multi-signature is a protocol where a set of signatures jointly sign a message so that the final signature is significantly shorter than concatenating individual signatures together. Recently, it finds applications in blockchain, where several users want to jointly authorize a payment through a multi-signature. However, in this setting, there is no centralized authority and it could suffer from a rogue key attack where the attacker can generate his own keys arbitrarily. Further, to minimize the storage on blockchain, it is desired that the aggregated public-key and the aggregated signature are both as short as possible. In this paper, we find a compiler that converts a kind of identification (ID) scheme (which we call a linear ID) to a multi-signature so that both the aggregated public-key and the aggregated signature have a size independent of the number of signers. Our compiler is provably secure. The advantage of our results is that we reduce a multi-party problem to a weakly secure two-party problem. We realize our compiler with two ID schemes. The first is Schnorr ID. The second is a new lattice-based ID scheme, which via our compiler gives the first regular lattice-based multi-signature scheme with key-and-signature compact without a restart during signing process.
Nova: Recursive Zero-Knowledge Arguments from Folding Schemes
We introduce a new approach to realize incrementally verifiable computation (IVC), in which the prover recursively proves the correct execution of incremental computations of the form $y=F^{(\ell)}(x)$, where $F$ is a (potentially non-deterministic) computation, $x$ is the input, $y$ is the output, and $\ell > 0$. Unlike prior approaches to realize IVC, our approach avoids succinct non-interactive arguments of knowledge (SNARKs) entirely and arguments of knowledge in general. Instead, we introduce and employ folding schemes, a weaker, simpler, and more efficiently-realizable primitive, which reduces the task of checking two instances in some relation to the task of checking a single instance. We construct a folding scheme for a characterization of $\mathsf{NP}$ and show that it implies an IVC scheme with improved efficiency characteristics: (1) the "recursion overhead" (i.e., the number of steps that the prover proves in addition to proving the execution of $F$) is a constant and it is dominated by two group scalar multiplications expressed as a circuit (this is the smallest recursion overhead in the literature), and (2) the prover's work at each step is dominated by two multiexponentiations of size $O(|F|)$, providing the fastest prover in the literature. The size of a proof is $O(|F|)$ group elements, but we show that using a variant of an existing zkSNARK, the prover can prove the knowledge of a valid proof succinctly and in zero-knowledge with $O(\log{|F|})$ group elements. Finally, our approach neither requires a trusted setup nor FFTs, so it can be instantiated efficiently with any cycles of elliptic curves where DLOG is hard.
HyperNova: Recursive arguments for customizable constraint systems
We introduce HyperNova, a new recursive argument for proving incremental computations whose steps are expressed with CCS (Setty et al. ePrint 2023/552), a customizable constraint system that simultaneously generalizes Plonkish, R1CS, and AIR without overheads. HyperNova makes four contributions, each resolving a major problem in the area of recursive arguments.
First, it provides a folding scheme for CCS where the prover’s cryptographic cost is a single multi-scalar multiplication (MSM) of size equal to the number of variables in the constraint system, which is optimal when using an MSM-based commitment scheme. The folding scheme can fold multiple instances at once, making it easier to build generalizations of IVC such as PCD. Second, when proving program executions on stateful machines (e.g., EVM, RISC-V), the cost of proving a step of a program is proportional only to the size of the circuit representing the instruction invoked by the program step ("a la carte" cost profile). Third, we show how to achieve zero-knowledge for "free" and without the need to employ zero-knowledge SNARKs: we use a folding scheme to "randomize" IVC proofs. This highlights a new application of folding schemes. Fourth, we show how to efficiently instantiate HyperNova over a cycle of elliptic curves. For this, we provide a general technique, which we refer to as CycleFold, that applies to all modern folding-scheme-based recursive arguments.
Abuse-Resistant Location Tracking: Balancing Privacy and Safety in the Offline Finding Ecosystem
Location tracking accessories (or "tracking tags") such as those sold by Apple, Samsung, and Tile, allow owners to track the location of their property via offline finding networks. The tracking protocols were designed to ensure that no entity (including the vendor) can use a tag's broadcasts to surveil its owner. These privacy guarantees, however, seem to be at odds with the phenomenon of $\textit{tracker-based stalking}$, where attackers use these very tags to monitor a target's movements. Numerous such criminal incidents have been reported, and in response, manufacturers have chosen to substantially weaken privacy guarantees in order to allow users to detect stalker tags. This compromise has been adopted in a recent IETF draft jointly proposed by Apple and Google.
We put forth the notion of $\textit{abuse-resistant offline finding protocols}$ that aim to achieve a better balance between user privacy and stalker detection. We present an efficient protocol that achieves stalker detection under realistic conditions without sacrificing honest user privacy. At the heart of our result, and of independent interest, is a new notion of $\textit{multi-dealer secret sharing}$ which strengthens standard secret sharing with novel privacy and correctness guarantees. We show that this primitive can be instantiated efficiently on edge devices using variants of Interleaved Reed-Solomon codes combined with new lattice-based decoding algorithms.
Generalized class group actions on oriented elliptic curves with level structure
We study a large family of generalized class groups of imaginary quadratic orders $O$ and prove that they act freely and (essentially) transitively on the set of primitively $O$-oriented elliptic curves over a field $k$ (assuming this set is non-empty) equipped with appropriate level structure. This extends, in several ways, a recent observation due to Galbraith, Perrin and Voloch for the ray class group. We show that this leads to a reinterpretation of the action of the class group of a suborder $O' \subseteq O$ on the set of $O'$-oriented elliptic curves, discuss several other examples, and briefly comment on the hardness of the corresponding vectorization problems.
Tight Time-Space Tradeoffs for the Decisional Diffie-Hellman Problem
In the (preprocessing) Decisional Diffie-Hellman (DDH) problem, we are given a cyclic group $G$ with a generator $g$ and a prime order $N$, and we want to prepare some advice of size $S$, such that we can efficiently distinguish $(g^{x},g^{y},g^{xy})$ from $(g^{x},g^{y},g^{z})$ in time $T$ for uniformly and independently chosen $x,y,z$ from $\mathbb{Z}_N$. This is a central cryptographic problem whose computational hardness underpins many widely deployed schemes, such as the Diffie–Hellman key exchange protocol.
We prove that any generic preprocessing DDH algorithm (operating in any cyclic group) achieves advantage at most $O(ST^2 / N)$. This bound matches the best known attack up to poly-log factors, and confirms that DDH is as secure as the (seemingly harder) discrete logarithm problem against preprocessing attacks. Our result resolves an open question by Corrigan-Gibbs and Kogan (EUROCRYPT 2018), who proved optimal bounds for many variants of discrete logarithm problems except DDH (with an $\tilde{O}(\sqrt{ST^2/N})$ bound).
We obtain our results by adopting and refining the approach by Gravin, Guo, Kwok, Lu (SODA 2021) and by Yun (EUROCRYPT 2015). Along the way, we significantly simplified and extended the above techniques which may be of independent interest.
The highlights of our techniques are as follows:
(1) We obtain a simpler reduction from decisional problems against $S$-bit advice to their $S$-wise XOR lemmas against zero-advice, recovering the reduction by Gravin, Guo, Kwok and Lu (SODA 2021).
(2) We show how to reduce generic hardness of decisional problems to their variants in the simpler hyperplane query model proposed by Yun (EUROCRYPT 2015). This is the first work analyzing a decisional problem in Yun's model, answering an open problem proposed by Auerbach, Hoffman, and Pascual-Perez (TCC 2023).
(3) We prove an $S$-wise XOR lemma of DDH in Yun's model. As a corollary, we obtain the generic hardness of the $S$-XOR DDH problem.
Shared-Custodial Password-Authenticated Deterministic Wallets
Cryptographic wallets are an essential tool in Blockchain networks to ensure the secure storage and maintenance of an user's cryptographic keys. Broadly, wallets can be divided into three categories, namely custodial, non-custodial, and shared-custodial wallets. The first two are centralized solutions, i.e., the wallet is operated by a single entity, which inherently introduces a single point of failure. Shared-custodial wallets, on the other hand, are maintained by two independent parties, e.g., the wallet user and a service provider, and hence avoid the single point of failure centralized solutions. Unfortunately, current shared-custodial wallets suffer from significant privacy issues.
In our work, we introduce password-authenticated deterministic wallets (PADW), a novel and efficient shared-custodial wallet solution, which exhibits strong security and privacy guarantees. In a nutshell, in a PADW scheme, the secret key of the user is shared between the user and the server. In order to generate a signature, the user first authenticates itself to the server by providing a password and afterwards engages in an interactive signing protocol with the server. Security is guaranteed as long as at most one of the two parties is corrupted. Privacy, on the other hand, guarantees that a corrupted server cannot link a transaction to a particular user. We formally model the notion of PADW schemes and we give an instantiation from blind Schnorr signatures. Our construction allows for deterministic key derivation, a feature that is widely used in practice by existing wallet schemes, and it does not rely on any heavy cryptographic primitives. We prove our scheme secure against adaptive adversaries in the random oracle model and under standard assumptions. That is, our security proof only relies on the assumption that the Schnorr signature scheme is unforgeable and that a public key encryption scheme is CCA-secure.
PathGES: An Efficient and Secure Graph Encryption Scheme for Shortest Path Queries
The increasing importance of graph databases and cloud storage services prompts the study of private queries on graphs. We propose PathGES, a graph encryption scheme (GES) for single-pair shortest path queries. PathGES is efficient and mitigates the state-of-the-art attack by Falzon and Paterson (2022) on the GES by Ghosh, Kamara, and Tamassia (2021), while only incurring an additional logarithmic factor in storage overhead. PathGES leverages a novel data structure that minimizes leakage and server computation.
We generalize what it means for one leakage function to leak less than another by defining a relation with respect to a family of query sequences and show that our scheme provably leaks less than the GKT scheme when all queries have been issued. We complement our security proof with a cryptanalysis that demonstrates an information-theoretic gap in the size of the query reconstruction space of our scheme as compared to the GKT scheme and provide concrete examples of the gap for several graph families. Our prototype implementation of PathGES is efficient in practice for real-world social network and geographic data sets. In comparison with the GKT scheme, PathGES has on average the same response size and up to 1.5$\times$ faster round-trip query time.
Time is not enough: Timing Leakage Analysis on Cryptographic Chips via Plaintext-Ciphertext Correlation in Non-timing Channel
In side-channel testing, the standard timing analysis works when the vendor can provide a measurement to indicate the execution time of cryptographic algorithms. In this paper, we find that there exists timing leakage in power/electromagnetic channels, which is often ignored in traditional timing analysis. Hence a new method of timing analysis is proposed to deal with the case where execution time is not available. Different execution time leads to different execution intervals, affecting the locations of plaintext and ciphertext transmission. Our method detects timing leakage by studying changes in plaintext-ciphertext correlation when traces are aligned forward and backward. Experiments are then carried out on different cryptographic devices. Furthermore, we propose an improved timing analysis framework which gives appropriate methods for different scenarios.
On the Relationship between FuncCPA and FuncCPA+
Akavia, Gentry, Halevi, and Vald introduced the security notion of function-chosen-plaintext-attack (FuncCPA security) for public-key encryption schemes.
FuncCPA is defined by adding a functional re-encryption oracle to the IND-CPA game.
This notion is crucial for secure computation applications where the server is allowed to delegate a part of the computation to the client.
Dodis, Halevi, and Wichs introduced a stronger variant called FuncCPA$^+$.
They showed FuncCPA$^+$ implies FuncCPA and conjectured that FuncCPA$^+$ is strictly stronger than FuncCPA.
They left an open problem to clarify the relationship between these variants.
Contrary to their conjecture, we show that FuncCPA is equivalent to FuncCPA$^+$.
We show it by two proofs with a trade-off between the number of queries and the number of function inputs.
Furthermore, we show these parameters determine the security levels of FuncCPA and FuncCPA$^+$.
Respire: High-Rate PIR for Databases with Small Records
Private information retrieval (PIR) is a key building block in many privacy-preserving systems, and recent works have made significant progress on reducing the concrete computational costs of single-server PIR. However, existing constructions have high communication overhead, especially for databases with small records. In this work, we introduce Respire, a lattice-based PIR scheme tailored for databases of small records. To retrieve a single record from a database with over a million 256-byte records, the Respire protocol requires just 6.1 KB of online communication; this is a 5.9x reduction compared to the best previous lattice-based scheme. Moreover, Respire naturally extends to support batch queries. Compared to previous communication-efficient batch PIR schemes, Respire achieves a 3.4-7.1x reduction in total communication while maintaining comparable throughput (200-400 MB/s). The design of Respire relies on new query compression and response packing techniques based on ring switching in homomorphic encryption.
Efficient Threshold FHE for Privacy-Preserving Applications
Threshold Fully Homomorphic Encryption (ThFHE) enables arbitrary computation over encrypted data while keeping the decryption key distributed across multiple parties at all times. ThFHE is a key enabler for threshold cryptography and, more generally, secure distributed computing. Existing ThFHE schemes relying on standard hardness assumptions, inherently require highly inefficient parameters and are unsuitable for practical deployment. In this paper, we take a novel approach towards making ThFHE practically usable by (i) proposing an efficient ThFHE scheme with a new analysis resulting in significantly improved parameters; (ii) and providing the first practical ThFHE implementation benchmark based on Torus FHE.
• We propose the first practical ThFHE scheme with a polynomial modulus-to-noise ratio that supports practically efficient parameters while retaining provable security based on standard quantum-safe assumptions. We achieve this via R ́enyi divergence-based security analysis of our proposed threshold decryption mechanism.
• We present a prototype software implementation of our proposed ThFHE scheme that builds upon the existing Torus-FHE library and supports (distributed) decryption on highly resource-constrained ARM-based handheld devices. Along the way, we implement several extensions to the Torus FHE library, including a Torus-based linear integer secret sharing subroutine to support ThFHE key sharing and distributed decryption for any threshold access structure.
We illustrate the efficacy of our proposal via an end-to-end use case involving encrypted computations over a real medical database and distributed decryptions of the computed result on resource-constrained ARM-based handheld devices.
Practical Traceable Receipt-Free Encryption
Traceable Receipt-free Encryption (TREnc) is a verifiable public-key encryption primitive introduced at Asiacrypt 2022. A TREnc allows randomizing ciphertexts in transit in order to remove any subliminal information up to a public trace that ensures the non-malleability of the underlying plaintext. A remarkable property of TREnc is the indistinguishability of the randomization of chosen ciphertexts against traceable chosen-ciphertext attacks (TCCA). This property can support applications like voting, and it was shown that receipt-free non-interactive voting, where voters are unable to convince any third party of the content of their vote, can be generically built from a TREnc.
While being a very promising primitive, the few existing TREnc mechanisms either require a secret coin CRS or are fairly demanding in time and space requirements. Their security proofs also come with a linear security degradation in the number of challenge ciphertexts.
We address these limitations and offer two efficient public coin TREnc mechanisms tailored for the two common tallying approaches in elections: homomorphic and mixnet-based. The TCCA security of our mechanisms also enjoys an almost-tight reduction to SXDH, based on a new randomizable technique of independent interest in the random oracle model.
A Rust implementation of our TREnc mechanisms demonstrates that we can verifiably encrypt 64 bits in less than a second, and full group elements in around 30 ms., which is sufficient for most real-world applications. While comparing with other solutions, we show that our approaches lead to the most efficient non-interactive receipt-free voting system to date.
On the Concrete Security of Non-interactive FRI
FRI is a cryptographic protocol widely deployed today as a building
block of many efficient SNARKs that help secure transactions of hundreds of
millions of dollars per day. The Fiat-Shamir security of FRI—vital for understanding
the security of FRI-based SNARKs—has only recently been formalized and
established by Block et al. (ASIACRYPT ’23).
In this work, we complement the result of Block et al. by providing a thorough
concrete security analysis of non-interactive FRI under various parameter settings
from protocols deploying (or soon to be deploying) FRI today. We find that these
parameters nearly achieve their desired security targets (being at most 1-bit less
secure than their targets) for non-interactive FRI with respect to a certain security
conjecture about the FRI Protocol. However, in all but one set of parameters, we
find that the provable security of non-interactive FRI under these parameters is
severely lacking, being anywhere between 21- and 63-bits less secure than the
conjectured security. The conjectured security of FRI assumes that known attacks
are optimal, the security of these systems would be severely compromised should
a better attack be discovered. In light of this, we present parameter guidelines
for achieving 100-bits of provable security for non-interactive FRI along with a
methodology for tuning these parameters to suit the needs of protocol designers.
Faster Private Decision Tree Evaluation for Batched Input from Homomorphic Encryption
Privacy-preserving decision tree evaluation (PDTE) allows a client that holds feature vectors to perform inferences against a decision tree model on the server side without revealing feature vectors to the server. Our work focuses on the non-interactive batched setting where the client sends a batch of encrypted feature vectors and then obtains classifications, without any additional interaction. This is useful in privacy-preserving credit scoring, biometric authentication, and many more applications.
In this paper, we propose two novel non-interactive batched PDTE protocols, OursRCC and OursCW, based on two batched ciphertext-plaintext comparison algorithms, our batched range cover comparison (RCC) comparator and the constant-weight (CW) piece-wise comparator, respectively. When comparing 16-bit batched encrypted values to a single plaintext value, our comparison algorithms show a speedup of up to $72\times$ compared to the state-of-the-art Level Up (CCS'23). Moreover, we introduced a new tree traversal method called adapted SumPath, to achieve $\mathcal{O}(1)$ complexity of the server's response, whereas Level Up has $\mathcal{O}(2^d)$ complexity for a depth-$d$ tree and the client needs to look up classification values in a table.
Overall, our PDTE protocols attain the optimal server-to-client communication complexity and are up to $17\times$ faster than Level Up in batch size 16384.
Last updated: 2024-07-17
Post-Quantum Access Control with Application to Secure Data Retrieval
Servan-Schreiber et al. (S&P 2023) presented a new notion called private access control lists (PACL) for function secret sharing (FSS), where the FSS evaluators can ensure that the FSS dealer is authorized to share the given function. Their construction relies on costly non-interactive secret-shared proofs and is not secure in post-quantum setting. We give a construction of PACL from publicly verifiable secret sharing (PVSS) under short integer solution (SIS). Our construction adapts the Gentry et al’s scheme (Eurocrypt 2022) for post-quantum setting based on learning with error assumption (LWE). The implementation of our PACL with different files showed that it is feasible even at different sizes, and should remain so even with large secret vectors. This construction has many applications for access control by applying FSS. We show how to apply the proposed PACL construction to secure data retrieval. We also present a scheme for secure data retrieval with access control, which might be of independent interest.
LaPSuS – A Lattice-Based Private Stream Aggregation Scheme under Scrutiny
Private Stream Aggregation (PSA) allows clients to send encryptions of their private values to an aggregator that is then able to learn the sum of these values but nothing else. It has since found many applications in practice, e.g. for smart metering or federated learning. In 2018, Becker et al. proposed the first lattice-based PSA scheme LaPS (NDSS 2018), with putative post-quantum security, which has subsequently been patented. In this paper, we describe two attacks on LaPS that break the claimed aggregator obliviousness security notion, where the second attack even allows to recover the secret keys of the clients, given enough encryptions. Moreover, we review the PSA literature for other occurrences of the responsible flawed proof steps. By explicitly tracking down and discussing these flaws, we clarify and hope to contribute to the literature on PSA schemes, in order to prevent further insecure schemes in practice. Finally, we point out that a Real-or-Random variant of the security notion that is often used as a substitute to make proofs easier, is not well-defined and potentially weaker than the standard PSA security notion. We propose a well defined variant and show that it is equivalent to the standard security notion of PSA under mild assumptions.
Fast Parallelizable Misuse-Resistant Authenticated Encryption: Low Latency (Decryption-Fast) SIV
MRAE security is an important goal for many AEAD applications where the nonce uniqueness cannot be maintained and security risks are significant. However, MRAE schemes can be quite expensive. Two of the SoTA MRAE-secure schemes; Deoxys-II and AES-GCM-SIV rely on internal parallelism and special instructions to achieve competitive performance. However, they both suffer from the same bottleneck, they have at least one call to the underlying primitive that cannot be parallelized to any other call. Romulus-M and LMDAE are two other more recent MRAE secure schemes based on TBCs that target low area hardware. However, they are unparallelizable so they are slower than their counterparts.
In this paper, we present two new AEAD modes and four instantiations based on Tweakable Block Ciphers. These new modes target equipping high-speed applications on parallel platforms with nonce misuse resistant AEAD (MRAE). The first mode, LLSIV, targets similar performance on single-core platforms to SCT-2, while eliminating the bottlenecks that make SCT-2 not fully parallelizable. The enhanced parallelism allows LLSIV to encrypt significantly more blocks on parallel platforms, compared to SCT-2, in the same amount of time. LLSIV is based on the NaT MAC, where each ciphertext block can itself be viewed as an instance of NaT when the plaintext is prepended with
. The trade-off is that LLSIV requires the inverse function of the TBC. However, the inverse function is used only once per message and we demonstrate that for parallel implementations it represents a very small overhead.
We give an instantiation of LLSIV based on the SKINNY-128-384 TBC, and a pruned scheme, dubbed pLLSIV, which targets enhanced performance compared both SCT-2 and LLSIV on all platforms, while having reduced security claims. It relies on the recently popularized prove-then-prune methodology to take full advantage of the properties of LLSIV. This leads to a significant performance improvement, making pLLSIV even faster than online TBC-based schemes that are not MRAE-secure. Last but not least, we give an instantiation that uses the primitives used in AES-GCM-SIV: the PolyVal hash function and AES. Our instantiation is faster than AES-GCM-SIV on all platforms and have better bounds. On the other hand, it relies on the ideal cipher model as it uses the ICE TBC proposed as part of the Remus AEAD design.
The second mode we describe is LLDFV. It uses ideas from LLSIV combined the Decryption-Fast SIV (DFV) framework proposed recently by Minematsu. The goal is to reduce the number of calls to the TBC by one, while making the scheme as parallelizable as LLSIV. This makes the scheme faster that DFV on all platforms.
Scalable Agreement Protocols with Optimal Optimistic Efficiency
Designing efficient distributed protocols for various agreement tasks such as Byzantine Agreement, Broadcast, and Committee Election is a fundamental problem. We are interested in $scalable$ protocols for these tasks, where each (honest) party communicates a number of bits which is sublinear in $n$, the number of parties. The first major step towards this goal is due to King et al. (SODA 2006) who showed a protocol where each party sends only $\tilde O(1)$ bits throughout $\tilde O(1)$ rounds, but guarantees only that $1-o(1)$ fraction of honest parties end up agreeing on a consistent output, assuming constant $<1/3$ fraction of static corruptions. Few years later, King et al. (ICDCN 2011) managed to get a full agreement protocol in the same model but where each party sends $\tilde O(\sqrt{n})$ bits throughout $\tilde O(1)$ rounds. Getting a full agreement protocol with $o(\sqrt{n})$ communication per party has been a major challenge ever since.
In light of this barrier, we propose a new framework for designing efficient agreement protocols. Specifically, we design $\tilde O(1)$-round protocols for all of the above tasks (assuming constant $<1/3$ fraction of static corruptions) with optimistic and pessimistic guarantees:
$\bullet$ $Optimistic$ $complexity$: In an honest execution, (honest) parties send only $\tilde O(1)$ bits.
$\bullet$ $Pessimistic$ $complexity$: In any other case, (honest) parties send $\tilde O(\sqrt{n})$ bits.
Thus, all an adversary can gain from deviating from the honest execution is that honest parties will need to work harder (i.e., transmit more bits) to reach agreement and terminate. Besides the above agreement tasks, we also use our new framework to get a scalable secure multiparty computation (MPC) protocol with optimistic and pessimistic complexities.
Technically, we identify a relaxation of Byzantine Agreement (of independent interest) that allows us to fall-back to a pessimistic execution in a coordinated way by all parties. We implement this relaxation with $\tilde O(1)$ communication bits per party and within $\tilde O(1)$ rounds.
CoGNN: Towards Secure and Efficient Collaborative Graph Learning
Collaborative graph learning represents a learning paradigm where multiple parties jointly train a graph neural network (GNN) using their own proprietary graph data. To honor the data privacy of all parties, existing solutions for collaborative graph learning are either based on federated learning (FL) or secure machine learning (SML). Although promising in terms of efficiency and scalability due to their distributed training scheme, FL-based approaches fall short in providing provable security guarantees and achieving good model performance. Conversely, SML-based solutions, while offering provable privacy guarantees, are hindered by their high computational and communication overhead, as well as poor scalability as more parties participate.
To address the above problem, we propose CoGNN, a novel framework that simultaneously reaps the benefits of both FL-based and SML-based approaches. At a high level, CoGNN is enabled by (i) a novel message passing mechanism that can obliviously and efficiently express the vertex data propagation/aggregation required in GNN training and inference and (ii) a two-stage Dispatch-Collect execution scheme to securely decompose and distribute the GNN computation workload for concurrent and scalable executions. We further instantiate the CoGNN framework, together with customized optimizations, to train Graph Convolutional Network (GCN) models. Extensive evaluations on three graph datasets demonstrate that compared with the state-of-the-art (SOTA) SML-based approach, CoGNN reduces up to $123$x running time and up to $522$x communication cost per party. Meanwhile, the GCN models trained using CoGNN have nearly identical accuracies as the plaintext global-graph training, yielding up to $11.06\%$ accuracy improvement over the GCN models trained via federated learning.
A Note on `` Provably Secure and Lightweight Authentication Key Agreement Scheme for Smart Meters''
We show that the authentication key agreement scheme
[IEEE Trans. Smart Grid, 2023, 14(5), 3816-3827] is flawed due to its inconsistent computations. We also show that the scheme fails to keep anonymity, not as claimed.
Shift-invariant functions and almost liftings
We investigate shift-invariant vectorial Boolean functions on $n$ bits that are lifted from Boolean functions on $k$ bits, for $k\leq n$. We consider vectorial functions that are not necessarily permutations, but are, in some sense, almost bijective. In this context, we define an almost lifting as a Boolean function for which there is an upper bound on the number of collisions of its lifted functions that does not depend on $n$. We show that if a Boolean function with diameter $k$ is an almost lifting, then the maximum number of collisions of its lifted functions is $2^{k-1}$ for any $n$. Moreover, we search for functions in the class of almost liftings that have good cryptographic properties and for which the non-bijectivity does not cause major security weaknesses. These functions generalize the well-known map $\chi$ used in the Keccak hash function.
A Central Limit Framework for Ring-LWE Noise Analysis
Uncategorized
Uncategorized
This paper develops Central Limit arguments for analysing the noise in ciphertexts in two homomorphic encryption schemes that are based on Ring-LWE. The first main contribution of this paper is to present and evaluate an average-case noise analysis for the BGV scheme. Our approach relies on the recent work of Costache et al. (SAC 2023) that gives the approximation of a polynomial product as a multivariate Normal distribution. We show how this result can be applied in the BGV context and evaluate its efficacy. We find this average-case approach can much more closely model the noise growth in BGV implementations than prior approaches, but in some cases it can also underestimate the practical noise growth. Our second main contribution is to develop a Central Limit framework to analyse the noise growth in the homomorphic Ring-LWE cryptosystem of Lyubashevsky, Peikert and Regev (Eurocrypt 2013, full version). Our approach is very general: apart from finite variance, no assumption on the distribution of the noise is required (in particular, the noise need not be subgaussian). We show that our approach leads to tighter bounds for the probability of decryption failure than those of prior work.
Simple Threshold (Fully Homomorphic) Encryption From LWE With Polynomial Modulus
The learning with errors (LWE) assumption is a powerful tool for building encryption schemes with useful properties, such as plausible resistance to quantum computers, or support for homomorphic computations. Despite this, essentially the only method of achieving threshold decryption in schemes based on LWE requires a modulus that is superpolynomial in the security parameter, leading to a large overhead in ciphertext sizes and computation time.
In this work, we propose a (fully homomorphic) encryption scheme that supports a simple $t$-out-of-$n$ threshold decryption protocol while allowing for a polynomial modulus. The main idea is to use the Rényi divergence (as opposed to the statistical distance as in previous works) as a measure of distribution closeness. This comes with some technical obstacles, due to the difficulty of using the Rényi divergence in decisional security notions such as standard semantic security. We overcome this by constructing a threshold scheme with a weaker notion of one-way security and then showing how to transform any one-way (fully homomorphic) threshold scheme into one guaranteeing (selective) indistinguishability-based security.
Cross Ledger Transaction Consistency for Financial Auditing
Auditing throughout a fiscal year is integral to organizations with transactional activity. Organizations transact with each other and record the details for all their economical activities so that a regulatory committee can verify the lawfulness and legitimacy of their activity. However, it is computationally infeasible for the committee to perform all necessary checks for each organization. To overcome this, auditors assist in this process: organizations give access to all their internal data to their auditors, who then produce reports regarding the consistency of the organization's data, alerting the committee to any inconsistencies. Despite this, numerous issues that result in fines annually revolve around such inconsistencies in bookkeeping across organizations. Notably, committees wishing to verify the correctness of auditor-provided reports need to redo all their calculations; a process which is computationally proportional to the number of organizations. In fact, it becomes prohibitive when considering real-world settings with thousands of organizations. In this work, we propose two protocols, CLOSC and CLOLC, whose goals are to enable auditors and a committee to verify the consistency of transactions across different ledgers. Both protocols ensure that for every transaction recorded in an organization's ledger, there exists a dual one in the ledger of another organization while safeguarding against other potential attacks. Importantly, we minimize the information leakage to auditors and other organizations and guarantee three crucial security and privacy properties that we propose: (i) transaction amount privacy, (ii) organization-auditor unlinkability, and (iii) transacting organizations unlinkability. At the core of our protocols lies a two-tier ledger architecture alongside a suite of cryptographic tools. To demonstrate the practicality and scalability of our designs, we provide extensive performance evaluation for both CLOSC and CLOLC. Our numbers are promising, i.e., all computation and verification times lie in the range of seconds, even for millions of transactions, while the on-chain storage costs for an auditing epoch are encouraging i.e. in the range of GB for millions of transactions and thousands of organizations.
Anonymous Broadcast Authentication with Logarithmic-Order Ciphertexts from DLP or LWE
We propose an anonymous broadcast authentication (ABA) scheme to simultaneously control massive numbers of devices in practical resources.
As a theoretical foundation, we find a barrier in constructing an ABA scheme that can control numerous devices: a trilemma between (i) security, (ii) ciphertext length, and (iii) freedom of target device selection.
Therefore, we propose ABAs with ciphertext sizes of $O(\log N)$, where $N$ is the number of target devices and impose a certain restriction on (iii).
We provide an ABA template and instantiate it into specific schemes from the decisional Diffie-Hellman problem or the learning with errors problem.
Further, we provide example parameters and resource consumption of space and time.
Blockchain Space Tokenization
Handling congestion in blockchain systems is a fundamental problem given that the security and decentralization objectives of such systems lead to designs that compromise on (horizontal) scalability (what sometimes is referred to as the ``blockchain trilemma''). Motivated by this, we focus on the question whether it is possible to design a transaction inclusion policy for block producers that facilitates fee and delay predictability while being incentive compatible at the same time.
Reconciling these three properties is seemingly paradoxical given that the dominant approach to transaction processing is based on first-price auctions (e.g., as in Bitcoin) or dynamic adjustment of the minimum admissible fee (e.g. as in Ethereum EIP-1559) something that breaks fee predictability. At the same time, in fixed fee mechanisms (e.g., as in Cardano), fees are trivially predictable but are subject to relatively inexpensive bribing or denial of service attacks where transactions may be delayed indefinitely by a well funded attacker, hence breaking delay predictability.
In this work, we set out to address this problem by putting forward blockchain space tokenization (BST), namely a new capability of a blockchain system to tokenize its capacity for transactions and allocate it to interested users who are willing to pay ahead of time for the ability to post transactions regularly for a period of time. We analyze our system in the face of worst-case transaction-processing attacks by introducing a security game played between the mempool mechanism and an adversary. Leveraging this framework, we prove that BST offers predictable and asymptotically optimal delays, predictable fees, and is incentive compatible, thus answering the question posed in the affirmative.
Secure Multiparty Computation of Symmetric Functions with Polylogarithmic Bottleneck Complexity and Correlated Randomness
Bottleneck complexity is an efficiency measure of secure multiparty computation (MPC) protocols introduced to achieve load-balancing in large-scale networks, which is defined as the maximum communication complexity required by any one player within the protocol execution. Towards the goal of achieving low bottleneck complexity, prior works proposed MPC protocols for computing symmetric functions in the correlated randomness model, where players are given input-independent correlated randomness in advance. However, the previous protocols with polylogarithmic bottleneck complexity in the number of players $n$ require a large amount of correlated randomness that is linear in $n$, which limits the per-party efficiency as receiving and storing correlated randomness are the bottleneck for efficiency. In this work, we present for the first time MPC protocols for symmetric functions such that bottleneck complexity and the amount of correlated randomness are both polylogarithmic in $n$, assuming collusion of size at most $n-o(n)$ players. Furthermore, one of our protocols is even computationally efficient in that each player performs only $\mathrm{polylog}(n)$ arithmetic operations while the computational complexity of the previous protocols is $O(n)$. Technically, our efficiency improvements come from novel protocols based on ramp secret sharing to realize basic functionalities with low bottleneck complexity, which we believe may be of interest beyond their applications to secure computation of symmetric functions.
A Univariate Attack against the Limited-Data Instance of Ciminion
With the increasing interest for advanced protocols for Multi Party Computation, Fully-Homomorphic Encryption or Zero Knowledge proofs, a need for cryptographic algorithms with new constraints has emerged. These algorithms, called Arithmetization-Oriented ciphers, seek to minimize the number of field multiplications in large finite fields $\mathbb{F}_{2^n}$ or $\mathbb{F}_{p}$. Among them, Ciminion is an encryption algorithm proposed by Dobraunig et al. in Eurocrypt 2021.
In this paper, we show a new univariate modelization on a variant of Ciminion proposed by the designers. This instance restricts the attacker to at most $2^{s/2}$ data, where $s$ is the security level. Because the designers chose to reduce the number of rounds in that specific attacker model, we are able to attack the cipher for large security levels. We also propose some slight modifications of Ciminion that would overcome this vulnerability.
On hermitian decomposition lattices and the module-LIP problem in rank 2
In this short note, we introduce a specific class of rank two lattices over CM fields endowed with additional symmetries, which are involved in the decomposition of algebraic integers in Hermitian squares. As an application, we show an elementary reduction from the module-LIP problem in rank 2 over a CM or totally real number field to the finding of a square basis in such lattices.
Round Efficient Byzantine Agreement from VDFs
Byzantine agreement (BA) is a fundamental primitive in distributed systems and has received huge interest as an important building block for blockchain systems. Classical byzantine agreement considers a setting where $n$ parties with fixed, known identities want to agree on an output in the presence of an adversary. Motivated by blockchain systems, the assumption of fixed identities is weakened by using a \emph{resource-based model}. In such models, parties do not have fixed known identities but instead have to invest some expensive resources to participate in the protocol. Prominent examples for such resources are computation (measured by, e.g., proofs-of-work) or money (measured by proofs-of-stake). Unlike in the classical setting where BA without trusted setup (e.g., a PKI or an unpredictable beacon) is impossible for $t \geq n/3$ corruptions, in such resource-based models, BA can be constructed for the optimal threshold of $t <n/2$. In this work, we investigate BA without a PKI in the model where parties have restricted computational resources. Concretely, we consider sequential computation modeled via computing a verifiable delay function (VDF) and establish the following results:
Positive Result: We present the first protocol for BA with expected constant round complexity and termination under adaptive corruption, honest majority and without a PKI. Earlier work achieved round complexity $O(n\kappa^2)$ (CRYPTO'15) or $O(\kappa)$ (PKC'18), where $\kappa$ is the security parameter.
Negative Result: We give the first lower bound on the communication complexity of BA in a model where parties have restricted computational resources. Concretely, we show that a multicast complexity of $O(\sqrt{n})$ is necessary even if the parties have access to a VDF oracle.
SIGMA: Secure GPT Inference with Function Secret Sharing
Secure 2-party computation (2PC) enables secure inference that offers protection for both proprietary machine learning (ML) models and sensitive inputs to them. However, the existing secure inference solutions suffer from high latency and communication overheads,
particularly for transformers. Function secret sharing (FSS) is a recent paradigm for
obtaining efficient 2PC protocols with a preprocessing phase.
We provide SIGMA, the first end-to-end system for secure transformer inference based on FSS.
By constructing new FSS-based protocols for complex machine learning functionalities, such as Softmax, GeLU and SiLU, and also accelerating their computation on GPUs, SIGMA improves the latency of secure inference of transformers by $11-19\times$ over the state-of-the-art that uses preprocessing and GPUs. We present the first secure inference of generative pre-trained transformer (GPT) models. In particular, SIGMA executes Meta's LLaMA2 (available on HuggingFace) with 13 billion parameters in 44 seconds and GPT2 in 1.6 seconds.
A Practical and Scalable Implementation of the Vernam Cipher, under Shannon Conditions, using Quantum Noise
The one-time pad cipher is renowned for its theoretical perfect security, yet its practical deployment is primarily hindered by the key-size and distribution challenge. This paper introduces a novel approach to key distribution called q-stream, designed to make symmetric-key cryptography, and the one-time pad cipher in particular, a viable option for contemporary secure communications, and specifically, post-quantum cryptography, leveraging quantum noise and combinatorics to ensure secure and efficient key-distribution between communicating parties. We demonstrate that our key-distribution mechanism has a variable, yet quantifiable hardness of at least 504 bits, established from immutable mathematical laws, rather than conjectured-intractability, and how we overcome the one-time pad key-size issue with a localised quantum-noise seeded key-generation function, having a system hardness of at least 2304 bits, while introducing sender authentication and message integrity. Whilst the proposed solution has potential applications in fields requiring very high levels of security, such as military communications and large financial transactions, we show from our research with a prototype of q-stream, that it is sufficiently practical and scaleable for use in common browser-based web-applications, without any modification to the browser (i.e. plug-ins), running above SSL/TLS at the application level, where in tests, it achieved a key-distribution rate of around 7 million keys over a 5 minute surge-window, in a single (multi-threaded) instance of q-stream.
A Note on ``Secure and Distributed IoT Data Storage in Clouds Based on Secret Sharing and Collaborative Blockchain''
We show that the data storage scheme [IEEE/ACM Trans. Netw., 2023, 31(4), 1550-1565] is flawed due to the false secret sharing protocol, which requires that some random $4\times 4$ matrixes over the finite field $F_p$ (a prime $p$) are invertible. But we find its mathematical proof for invertibility is incorrect. To fix this flaw, one needs to check the invertibility of all 35 matrixes so as to generate the proper 7 secret shares.
LR-OT: Leakage-Resilient Oblivious Transfer
Oblivious Transfer (OT) is a fundamental cryptographic primitive, becoming a crucial component of a practical secure protocol.
OT is typically implemented in software, and one way to accelerate its running time is by using hardware implementations.
However, such implementations are vulnerable to side-channel attacks (SCAs).
On the other hand, protecting interactive protocols against SCA is highly challenging because of their longer secrets (which include inputs and randomness), more complicated design, and running multiple instances.
Consequently, there are no truly practical leakage-resistant OT protocols yet.
In this paper, we introduce two tailored indistinguishability-based security definitions for leakage-resilient OT, focusing on protecting the sender's state.
Second, we propose a practical semi-honest secure OT protocol that achieves these security levels while minimizing the assumptions on the protocol's building blocks and the use of a secret state.
Finally, we extend our protocol to support sequential composition and explore efficiency-security tradeoffs.
Last updated: 2024-07-13
Predicting one class of truncated matrix congruential generators with unknown parameters
Matrix congruential generators is an important class of pseudorandom number
generators. In this paper we show how to predict a class of Matrix congruential generators matrix
congruential generators with unknown parameters. Given a few truncated digits
of high-order bits output by a matrix congruential generator, we give a method
based on lattice reduction to recover the parameters and the initial state of the
generator.
Fault-Resistant Partitioning of Secure CPUs for System Co-Verification against Faults
Fault injection attacks are a serious threat to system security, enabling attackers to bypass protection mechanisms or access sensitive information. To evaluate the robustness of CPU-based systems against these attacks, it is essential to analyze the consequences of the fault propagation resulting from the complex interplay between the software and the processor. However, current formal methodologies combining hardware and software face scalability issues due to the monolithic approach used.
To address this challenge, this work formalizes the $k$-fault resistant partitioning notion to solve the fault propagation problem when assessing redundancy-based hardware countermeasures in a first step. Proven security guarantees can then reduce the remaining hardware attack surface when introducing the software in a second step. First, we validate our approach against previous work by reproducing known results on cryptographic circuits. In particular, we outperform state-of-the-art tools for evaluating AES under a three-fault-injection attack. Then, we apply our methodology to the OpenTitan secure element and formally prove the security of its CPU's hardware countermeasure to single bit-flip injections. Besides that, we demonstrate that previously intractable problems, such as analyzing the robustness of OpenTitan running a secure boot process, can now be solved by a co-verification methodology that leverages a $k$-fault resistant partitioning. We also report a potential exploitation of the register file vulnerability in two other software use cases. Finally, we provide a security fix for the register file, prove its robustness, and integrate it into the OpenTitan project.
Prime Masking vs. Faults - Exponential Security Amplification against Selected Classes of Attacks
Fault injection attacks are a serious concern for cryptographic hardware. Adversaries may extract sensitive information from the faulty output that is produced by a cryptographic circuit after actively disturbing its computation. Alternatively, the information whether an output would have been faulty, even if it is withheld from being released, may be exploited. The former class of attacks, which requires the collection of faulty outputs, such as Differential Fault Analysis (DFA), then either exploits some knowledge about the position of the injected fault or about its value. The latter class of attacks, which can be applied without ever obtaining faulty outputs, such as Statistical Ineffective Fault Attacks (SIFA), then either exploits a dependency between the effectiveness of the fault injection and the value to be faulted (e.g., an LSB stuck-at-0 only affecting odd numbers), denoted as SIFA-1, or a conditional propagation of a faulted value based on a sensitive intermediate (e.g., multiplication of a faulted value by 0 prevents propagation), denoted as SIFA-2. The aptitude of additive masking schemes, which were designed to prevent side-channel analysis, to also thwart fault attacks is typically assumed to be limited. Common fault models, such as toggle/bit-flip, stuck-at-0 or stuck-at-1 survive the recombination of Boolean shares well enough for generic attacks to succeed. More precisely, injecting a fault into one or multiple Boolean shares often results in the same, or at least a predictable, error appearing in the sensitive variable after recombination. In this work, we show that additive masking in prime-order fields breaks such relationships, causing frequently exploited biases to decrease exponentially in the number of shares. As a result, prime masking offers surprisingly strong protection against generic statistical attacks, which require a dependency between the effectiveness of an injected fault and the secret variable that is manipulated, such as SIFA-1. Operation-dependent statistical attacks, such as SIFA-2 and Fault Template Attacks (FTA), may still be performed against certain prime-field structures, even if they are masked with many shares. Yet, we analyze the corresponding cases and are able to provide specific guidelines on how to avoid vulnerabilities either at the cipher design or implementation level by making informed decisions about the primes, non-linear mappings and masked gadgets used. Since prime-field masking appears to be one of the rare instances of affordable countermeasures that naturally provide sound protection against sidechannel analysis and certain fault injection attacks, we believe there is a strong incentive for developing new ciphers to leverage these advantages.
Permutation Superposition Oracles for Quantum Query Lower Bounds
We propose a generalization of Zhandry’s compressed oracle method to random permutations, where an algorithm can query both the permutation and its inverse. We show how to use the resulting oracle simulation to bound the success probability of an algorithm for any predicate on input-output pairs, a key feature of Zhandry’s technique that had hitherto resisted attempts at generalization to random permutations. One key technical ingredient is to use strictly monotone factorizations to represent the permutation in the oracle’s database. As an application of our framework, we show that the one-round sponge construction is unconditionally preimage resistant in the random permutation model. This proves a conjecture by Unruh.
Bake It Till You Make It: Heat-induced Power Leakage from Masked Neural Networks
Masking has become one of the most effective approaches for securing hardware designs against side-channel attacks. Regardless of the effort put into correctly implementing masking schemes on a field-programmable gate array (FPGA), leakage can be unexpectedly observed. This is due to the fact that the assumption underlying all masked designs, i.e., the leakages of different shares are independent of each other, may no longer hold in practice. In this regard, extreme temperatures have been shown to be an important factor in inducing leakage, even in correctly masked designs. This has previously been verified using an external heat generator
(i.e., a climate chamber). In this paper, we examine whether the leakage can be induced using the circuit components themselves without making any changes to the design. Specifically, we target masked neural networks (NNs) in FPGAs, one of the main building blocks of which is block random access memory (BRAM). In this respect, thanks to the inherent characteristics of NNs, our novel internal heat generators leverage solely the memories devoted to storing the user’s input, especially when frequently writing alternating patterns into BRAMs. The possibility of observing first-order leakage is evaluated by considering one of the most recent and successful first-order secure masked NNs, namely ModuloNET. ModuloNET is specifically designed for FPGAs, where BRAMs are used to store inputs and intermediate computations. Our experimental results demonstrate that undesirable first-order leakage can be observed and exploited by increasing the temperature when an alternating input is applied to the masked NN. To give a better understanding of the impact of extreme heat, we further perform a similar test on the design using an external heat generator, where a similar conclusion can be drawn.
Anonymous Outsourced Statekeeping with Reduced Server Storage
Strike-lists are a common technique for rollback and replay prevention in protocols that require that clients remain anonymous or that their current position in a state machine remain confidential. Strike-lists are heavily used in anonymous credentials, e-cash schemes, and trusted execution environments, and are widely deployed on the web in the form of Privacy Pass (PoPETS '18) and Google Private State Tokens.
In such protocols, clients submit pseudorandom tokens associated with each action (e.g., a page view in Privacy Pass) or state transition, and the token is added to a server-side list to prevent reuse.
Unfortunately, the size of a strike-list, and hence the storage required by the server, is proportional to the total number of issued tokens, $N \cdot t$, where $N$ is the number of clients and $t$ is the maximum number of tickets per client. In this work, we ask whether it is possible to realize a strike-list-like functionality, which we call the anonymous tickets functionality, with storage requirements proportional to $N \log(t)$.
For the anonymous tickets functionality we construct a secure protocol from standard assumptions that achieves server storage of $O(N)$ ciphertexts, where each ciphertext encrypts a message of length $O(\log(t))$. We also consider an extension of the strike-list functionality where the server stores an arbitrary state for each client and clients advance their state with some function $s_i\gets f(s_{i-1},\mathsf{auxinput})$, which we call the anonymous outsourced state-keeping functionality. In this setting, malicious clients are prevented from rolling back their state, while honest clients are guaranteed anonymity and confidentiality against a malicious server. We achieve analogous results in this setting for two different classes of functions.
Our results rely on a new technique to preserve client anonymity in the face of selective failure attacks by a malicious server. Specifically, our protocol guarantees that misbehavior of the server either (1) does not prevent the honest client from redeeming a ticket or (2) provides the honest client with an escape hatch that can be used to simulate a redeem in a way that is indistinguishable to the server.
Toward A Practical Multi-party Private Set Union
This paper studies a multi-party private set union (mPSU), a fundamental cryptographic problem that allows multiple parties to compute the union of their respective datasets without revealing any additional information. We propose an efficient mPSU protocol which is secure in the presence of any number of colluding semi-honest participants. Our protocol avoids computationally expensive homomorphic operations or generic multi-party computation, thus providing an efficient solution for mPSU.
The crux of our protocol lies in the utilization of new cryptographic tool, namely, Membership Oblivious Transfer (mOT) . We believe that the mOT and cOPRF may be of independent interest. We implement our mPSU protocol and evaluate their performance. Our protocol shows an improvement of up to $80.84\times$ in terms of running time and $405.73\times$ bandwidth cost compared to the existing state-of-the-art protocols.
*Note: June 2024 - Fixed a bug in the original version and simplified the protocol by removing the OPRF.
A Two-Layer Blockchain Sharding Protocol Leveraging Safety and Liveness for Enhanced Performance
Sharding is a critical technique that enhances the scalability of blockchain technology. However, existing protocols often assume adversarial nodes in a general term without considering the different types of attacks, which limits transaction throughput at runtime because attacks on liveness could be mitigated. There have been attempts to increase transaction throughput by separately handling the attacks; however, they have security vulnerabilities. This paper introduces Reticulum, a novel sharding protocol that overcomes these limitations and achieves enhanced scalability in a blockchain network without security vulnerabilities.
Reticulum employs a two-phase design that dynamically adjusts transaction throughput based on runtime adversarial attacks on either or both liveness and safety. It consists of `control' and `process' shards in two layers corresponding to the two phases. Process shards are subsets of control shards, with each process shard expected to contain at least one honest node with high confidence. Conversely, control shards are expected to have a majority of honest nodes with high confidence. Reticulum leverages unanimous voting in the first phase to involve fewer nodes in accepting/rejecting a block, allowing more parallel process shards. The control shard finalizes the decision made in the first phase and serves as a lifeline to resolve disputes when they surface.
Experiments demonstrate that the unique design of Reticulum empowers high transaction throughput and robustness in the face of different types of attacks in the network, making it superior to existing sharding protocols for blockchain networks.
Dot-Product Proofs and Their Applications
A dot-product proof (DPP) is a simple probabilistic proof system in which the input statement $\mathbf{x}$ and the proof $\boldsymbol{\pi}$ are vectors over a finite field $\mathbb{F}$, and the proof is verified by making a single dot-product query $\langle \mathbf{q},(\mathbf{x} \| \boldsymbol{\pi}) \rangle$ jointly to $\mathbf{x}$ and $\boldsymbol{\pi}$. A DPP can be viewed as a 1-query fully linear PCP. We study the feasibility and efficiency of DPPs, obtaining the following results:
- Small-field DPP. For any finite field $\mathbb{F}$ and Boolean circuit $C$ of size $S$, there is a DPP for proving that there exists $\mathbf{w}$ such that $C(\mathbf{x}, \mathbf{w})=1$ with a proof $\boldsymbol{\pi}$ of length $S\cdot\mathsf{poly}(|\mathbb{F}|)$ and soundness error $\varepsilon=O(1 / \sqrt{|\mathbb{F}|})$. We show this error to be asymptotically optimal. In particular, and in contrast to the best known PCPs, there exist strictly linear-length DPPs over constant-size fields.
- Large-field DPP. If $|\mathbb{F}|\ge\mathsf{poly}(S/\varepsilon)$, there is a similar DPP with soundness error $\varepsilon$ and proof length $O(S)$ (in field elements).
The above results do not rely on the PCP theorem and their proofs are considerably simpler. We apply our DPP constructions toward two kinds of applications.
- Hardness of approximation. We obtain a simple proof for the NP-hardness of approximating MAXLIN (with dense instances) over any finite field $\mathbb{F}$ up to some constant factor $c>1$, independent of $\mathbb{F}$. Unlike previous PCP-based proofs, our proof yields exponential-time hardness under the exponential time hypothesis (ETH).
- Succinct arguments. We improve the concrete efficiency of succinct interactive arguments in the generic group model using input-independent preprocessing. In particular, the communication is comparable to sending two group elements and the verifier's computation is dominated by a single group exponentiation. We also show how to use DPPs together with linear-only encryption to construct succinct commit-and-prove arguments.
Secret Key Recovery in a Global-Scale End-to-End Encryption System
End-to-end encrypted messaging applications ensure that an attacker cannot read a user's message history without their decryption keys. While this provides strong privacy, it creates a usability problem: if a user loses their devices and cannot access their decryption keys, they can no longer access their account. To solve this usability problem, users should be able to back up their account information with the messaging provider. For privacy, this backup should be encrypted and the provider should not have access to users' decryption keys. To solve this problem, we present Secure Value Recovery 3 (SVR3), a secret key recovery system that distributes trust across different types of hardware enclaves run by different cloud providers in order to protect users' decryption keys. SVR3 is the first deployed secret key recovery system to split trust across heterogeneous enclaves managed by different cloud providers: this design ensures that a single type of enclave does not become a central point of attack. SVR3 protects decryption keys via rollback protection and fault tolerance techniques tailored to the enclaves' security guarantees. SVR3 costs \$0.0025/user/year and takes 365ms for a user to recover their key, which is a rare operation. A part of SVR3 has been rolled out to millions of real users in a deployment with capacity for over 500 million users, demonstrating the ability to operate at scale.
Cryptanalysis of EagleSign
EagleSign is one of the 40 “Round 1 Additional Signatures” that is accepted for consideration in the supplementary round of the Post-Quantum Cryptography standardization process, organized by NIST. Its design is based on structured lattices, and it boasts greater simplicity and performance compared to the two lattice signatures already selected for standardization: Falcon and Dilithium.
In this paper, we show that those claimed advantages come at the cost of security. More precisely, we show that the distribution of EagleSign signatures leaks information about the private key, to the point that only a few hundred signatures on arbitrary known messages suffice for a full key recovery, for all proposed parameters.
A related vulnerability also affects EagleSign-V2, a subsequent version of the scheme specifically designed to thwart the initial attack. Although a larger number of signatures is required for key recovery, the idea of the attack remains largely similar. Both schemes come with proofs of security that we show are flawed.
Probabilistic Linearization: Internal Differential Collisions in up to 6 Rounds of SHA-3
The SHA-3 standard consists of four cryptographic hash functions, called SHA3-224, SHA3-256, SHA3-384 and SHA3-512, and two extendable-output functions (XOFs), called SHAKE128 and SHAKE256. In this paper, we study the collision resistance of the SHA-3 instances. By analyzing the nonlinear layer, we introduce the concept of maximum difference density subspace, and develop a new target internal difference algorithm by probabilistic linearization. We also exploit new strategies for optimizing the internal differential characteristic. Further more, we figure out the expected size of collision subsets in internal differentials, by analyzing the collision probability of the digests rather than the intermediate states input to the last nonlinear layer. These techniques enhance the analysis of internal differentials, leading to the best collision attacks on four round-reduced variants of the SHA-3 instances. In particular, the number of attacked rounds is extended to 5 from 4 for SHA3-384, and to 6 from 5 for SHAKE256.