AITIA: Efficient Secure Computation of Bivariate Causal Discovery
Researchers across various fields seek to understand causal relationships but often find controlled experiments impractical. To address this, statistical tools for causal discovery from naturally observed data have become crucial. Non-linear regression models, such as Gaussian process regression, are commonly used in causal inference but incur high costs when adapted for secure computation. Support vector regression (SVR) offers an alternative but remains costly in a multi-party computation (MPC) context due to conditional branches and support vector updates.
In this paper, we propose Aitia, the first two-party secure computation protocol for bivariate causal discovery. The protocol is based on optimized multi-party computation design choices and is secure in the semi-honest setting. At the core of our approach is BSGD-SVR, a new non-linear regression algorithm designed for MPC applications, achieving both high accuracy and low computation and communication costs. Specifically, we reduce the training complexity of the non-linear regression model from approximately $\mathcal{O}(N^3)$ to $\mathcal{O}(N^2)$, where $N$ is the number of training samples.
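The paper specifies BSGD-SVR for MPC; purely as a plaintext illustration of the budgeted kernel-SGD idea behind the $\mathcal{O}(N^2)$ claim, here is a minimal sketch. The RBF kernel and the smallest-weight eviction rule are our own illustrative choices, not details taken from the paper.
```python
import numpy as np

def rbf(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(y)) ** 2))

def budgeted_sgd_svr(X, y, budget=64, lr=0.1, lam=0.01, epochs=1):
    """Budgeted kernel SGD regression sketch: capping the support set at
    `budget` vectors makes one epoch cost O(N * budget) kernel
    evaluations, versus the superquadratic cost of exact kernel
    regression, whose support set grows with every training sample."""
    sv_x, sv_a = [], []                      # support vectors and weights
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = sum(a * rbf(s, xi) for s, a in zip(sv_x, sv_a))
            err = pred - yi
            sv_a = [a * (1 - lr * lam) for a in sv_a]   # weight decay
            sv_x.append(xi)
            sv_a.append(-lr * err)                      # new support vector
            if len(sv_x) > budget:                      # enforce the budget:
                j = min(range(len(sv_a)), key=lambda k: abs(sv_a[k]))
                sv_x.pop(j)
                sv_a.pop(j)                             # evict smallest weight
    return sv_x, sv_a
```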
We implement Aitia using CrypTen and assess its performance across various datasets. Empirical evaluations show a significant speedup of $3.6\times$ to $340\times$ compared to the baseline approach.
Asynchronous Consensus without Trusted Setup or Public-Key Cryptography
Byzantine consensus is a fundamental building block in distributed cryptographic problems. Despite decades of research, most existing asynchronous consensus protocols require a strong trusted setup and expensive public-key cryptography. In this paper, we study asynchronous Byzantine consensus protocols that do not rely on a trusted setup and do not use public-key cryptography such as digital signatures. We give an Asynchronous Common Subset (ACS) protocol whose security is only based on cryptographic hash functions modeled as a random oracle. Our protocol has $O(\kappa n^3)$ total communication and runs in expected $O(1)$ rounds. The fact that we use only cryptographic hash functions also means that our protocol is post-quantum secure. The minimal use of cryptography and the small number of rounds make our protocol practical. We implement our protocol and evaluate it in a geo-distributed setting with up to 128 machines. Our experimental evaluation shows that our protocol is more efficient than the only other setup-free consensus protocol that has been implemented to date. En route to our asynchronous consensus protocols, we also introduce new primitives called asynchronous secret key sharing and cover gather, which may be of independent interest.
ArcEDB: An Arbitrary-Precision Encrypted Database via (Amortized) Modular Homomorphic Encryption
Fully homomorphic encryption (FHE) based database outsourcing is attracting growing research interest. At its current state, there exist two primary obstacles against FHE-based encrypted databases (EDBs): i) low data precision, and ii) high computational latency. To tackle the precision-performance dilemma, we introduce ArcEDB, a novel FHE-based SQL evaluation infrastructure that simultaneously achieves high data precision and fast query evaluation. Based on a set of new plaintext encoding schemes, we are able to execute arbitrary-precision ciphertext-to-ciphertext homomorphic comparison orders of magnitude faster than existing methods. Meanwhile, we propose efficient conversion algorithms between the encoding schemes to support highly composite SQL statements, including advanced filter-aggregation and multi-column synchronized sorting. We perform comprehensive experiments to study the performance characteristics of ArcEDB. In particular, we show that ArcEDB can be up to $57\times$ faster in homomorphic filtering and up to $20\times$ faster over end-to-end SQL queries when compared to the state-of-the-art FHE-based EDB solutions. Using ArcEDB, a SQL query over a 10K-row time-series EDB with 64-bit timestamps runs in under one minute.
Cryptography in the Common Haar State Model: Feasibility Results and Separations
The common random string model is a popular model in classical cryptography. We study a quantum analogue of this model called the common Haar state (CHS) model. In this model, every party participating in the cryptographic system receives many copies of one or more i.i.d. Haar random states.
We study feasibility and limitations of cryptographic primitives in this model and its variants:
- We present a construction of pseudorandom function-like states with security against computationally unbounded adversaries, as long as the adversaries only receive an (a priori) bounded number of copies. By suitably instantiating the CHS model, we obtain a new approach to construct pseudorandom function-like states in the plain model.
- We present separations between pseudorandom function-like states (with super-logarithmic length) and quantum cryptographic primitives, such as interactive key agreement and bit commitment, with classical communication. To show these separations, we prove new results on the indistinguishability of identical versus independent Haar states against LOCC (local operations, classical communication) adversaries.
Efficient isochronous fixed-weight sampling with applications to NTRU
We present a solution to the open problem of designing a linear-time, unbiased and timing attack-resistant shuffling algorithm for fixed-weight sampling. Although it can be implemented without timing leakages of secret data in any architecture, we illustrate with ARMv7-M and ARMv8-A implementations; for the latter, we take advantage of architectural features such as NEON and conditional instructions, which are representative of features available on architectures targeting similar systems, such as Intel. Our proposed algorithm improves asymptotically upon the current approach based on constant-time sorting networks ($O(n)$ versus $O(n \log^2 n)$), and an implementation of the new algorithm applied to NTRU is also faster in practice, by a factor of up to $6.91\ (591\%)$ on ARMv8-A cores and $12.89\ (1189\%)$ on the Cortex-M4; it also requires fewer uniform random bits. This translates into performance improvements for NTRU encapsulation, compared to state-of-the-art implementations, of up to 50\% on ARMv8-A cores and 72\% on the Cortex-M4, and small improvements to key generation (up to 2.7\% on ARMv8-A cores and 6.1\% on the Cortex-M4), with negligible impact on code size and a slight improvement in RAM usage for the Cortex-M4.
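For contrast with the isochronous algorithm proposed here, a minimal sketch of the textbook approach and why it leaks; the helper below is our own illustration, not the paper's algorithm.
```python
import secrets

def fixed_weight_naive(n, w):
    """Textbook fixed-weight sampler: write w ones followed by n - w
    zeros, then Fisher-Yates shuffle. Unbiased, but NOT isochronous:
    the swap index j is secret-dependent, so memory access patterns
    and timing can leak the sampled positions. Avoiding exactly this
    leakage, without the O(n log^2 n) sorting-network detour, is the
    problem the paper solves in linear time."""
    v = [1] * w + [0] * (n - w)
    for i in range(n - 1, 0, -1):
        j = secrets.randbelow(i + 1)   # secret-dependent index
        v[i], v[j] = v[j], v[i]
    return v
```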
Compact Key Function Secret Sharing with Non-linear Decoder
We present a variant of Function Secret Sharing (FSS) schemes tailored for point, comparison, and interval functions, featuring compact key sizes at the expense of an additional comparison. While existing FSS constructions are primarily geared towards $2$-party scenarios, exceptions such as the work by Boyle et al. (Eurocrypt 2015) and Riposte (S&P 2015) have introduced FSS schemes for $p$-party scenarios ($p \geq 3$). This paper aims to achieve the most compact $p$-party FSS key size to date. We achieve a noteworthy reduction in key size, a $2^p$-factor decrease compared to state-of-the-art FSS constructions (including computationally efficient constructions using symmetric-key primitives) of distributed point function (DPF). Compared to the previous public-key-based FSS design for DPF, we also get a key size reduction equal to a $2^{n/2}$-sized row vector, where $2^n$ is the domain size of the point function. This reduction in key size comes at the cost of a required comparison operation by the decoder (hence called a non-linear decoder), a departure from prior schemes. In $p$-party scenarios, our construction outperforms existing FSS constructions in key size, remaining on par with Riposte in evaluation time and showing significant improvement over Boyle et al.
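As background for the key-size discussion, here is a degenerate two-party FSS for point functions that additively shares the whole truth table; its $2^n$-sized keys are exactly the blow-up that real DPF constructions, including this paper's compact $p$-party keys, avoid. A sketch only, with the modulus as our own choice:
```python
import secrets

P = 2**61 - 1  # illustrative prime modulus (our choice)

def naive_dpf_keys(alpha, beta, n):
    """FSS semantics for the point function f(alpha) = beta, f(x) = 0
    otherwise: the two keys are additive shares of the truth table, so
    each key alone is uniformly random, yet the shares of f(x) sum to
    the right value for every x. Key size is O(2^n), which nontrivial
    DPF constructions compress."""
    k0 = [secrets.randbelow(P) for _ in range(2**n)]
    k1 = [(-v) % P for v in k0]
    k1[alpha] = (k1[alpha] + beta) % P
    return k0, k1

def naive_dpf_eval(key, x):
    return key[x]

# (naive_dpf_eval(k0, x) + naive_dpf_eval(k1, x)) % P equals beta
# exactly when x == alpha, and 0 otherwise.
```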
In addition to constructing FSS for distributed point functions (DPF), we extend our approach to distributed comparison and interval functions, achieving the most efficient key size to date. Our distributed comparison function exhibits a key-size reduction by a factor of $q^{p-1}$, where $q$ denotes the size of the algebraic group used in the scheme's construction.
The reduced key size of the comparison function has practical implications, particularly in applications like privacy-preserving machine learning (PPML), where thousands of comparison functions are employed in each neural network layer.
To demonstrate the effectiveness of our improvements, we design and prototype-implement a scalable privacy-preserving framework for neural networks over distributed models. Specifically, we implement a distributed rectified linear unit (ReLU) activation function using our distributed comparison function, showcasing the efficacy of our proposed scheme.
Insta-Pok3r: Real-time Poker on Blockchain
We develop a distributed service for generating correlated randomness (e.g. permutations) for multiple parties, where each party’s output is private but publicly verifiable. This service provides users with a low-cost way to play online poker in real-time, without a trusted party.
Our service is backed by a committee of compute providers, who run a multi-party computation (MPC) protocol to produce an (identity-based) encrypted permutation of a deck of cards, in an offline phase well ahead of when the players’ identities are known. When the players join, in what we call the online phase, they decrypt their designated cards immediately after deriving the identity-based decryption keys, a much simpler computation. In addition, the MPC protocol also generates a publicly-verifiable proof that the output is a permutation.
In our construction, we introduce a new notion of succinctly verifiable multi-identity based encryption (SVME), which extends the existing notion of verifiable encryption to a multi-identity-based setting, but with a constant-sized proof – this may be of independent interest. We instantiate this for a permutation relation (defined over a small set) along with identity-based encryption, polynomial commitments and succinct proofs – our choices are made to enable a distributed computation when the card deck is always secret shared. Moreover, we design a new protocol to efficiently generate a secret-sharing of a random permutation of a small set, which is run prior to distributed SVME.
Running these protocols offline simplifies the online phase substantially, as parties only derive their identity-specific keys privately via secure channels with the MPC committee, and then decrypt locally to obtain their decks. We provide a rigorous UC-based formalization in a highly modularized fashion.
Finally, we demonstrate practicality with an implementation that shows that for 8 MPC parties, generating a secret publicly-verifiable permutation of 64 cards takes under 3 seconds, while accessing cards for a player takes under 0.3 seconds.
Quirky Interactive Reductions of Knowledge
Interactive proofs and arguments of knowledge can be generalized to the concept of interactive reductions of knowledge, where proving knowledge of a witness for one NP language is reduced to proving knowledge of a witness for another NP language. We take this generalization and specialize it to a class of reductions we refer to as `quirky interactive reductions of knowledge' (or QUIRKs). This name reflects our particular design choices within the broad and diverse world of interactive reduction methods. A central design choice is allowing the prover to rewind or regress to any previous reduction and repeat it as many times as desired. We prove completeness and extractability properties for QUIRKs. We also offer tools for constructing extraction algorithms along with several simple examples of usage.
Watermarking Language Models for Many Adaptive Users
We study watermarking schemes for language models with provable guarantees. As we show, prior works offer no robustness guarantees against adaptive prompting, i.e., when a user queries a language model more than once, as even benign users do. And with just a single exception (Christ and Gunn, 2024), prior works are restricted to zero-bit watermarking: machine-generated text can be detected as such, but no additional information can be extracted from the watermark. Unfortunately, merely detecting AI-generated text may not prevent future abuses.
We introduce multi-user watermarks, which allow tracing model-generated text to individual users or to groups of colluding users, even in the face of adaptive prompting. We construct multi-user watermarking schemes from undetectable, adaptively robust, zero-bit watermarking schemes (and prove that the undetectable zero-bit scheme of Christ, Gunn, and Zamir (2024) is adaptively robust). Importantly, our scheme provides both zero-bit and multi-user assurances at the same time. It detects shorter snippets just as well as the original scheme, and traces longer excerpts to individuals.
The main technical component is a construction of message-embedding watermarks from zero-bit watermarks. Ours is the first generic reduction between watermarking schemes for language models. A challenge for such reductions is the lack of a unified abstraction for robustness --- that marked text is detectable even after edits. We introduce a new unifying abstraction called AEB-robustness. AEB-robustness provides that the watermark is detectable whenever the edited text "approximates enough blocks" of model-generated output.
HEProfiler: An In-Depth Profiler of Approximate Homomorphic Encryption Libraries
Fully Homomorphic Encryption (FHE) allows computation on encrypted data. Various software libraries have implemented the approximate-arithmetic FHE scheme CKKS, which is highly useful for applications in machine learning and data analytics; each of these libraries has differing performance and features. It is useful for developers and researchers to learn details about these libraries’ performance and their differences. Some previous work has profiled FHE and CKKS implementations for this purpose, but these comparisons are limited in their fairness and completeness.
In this article, we compare four major libraries supporting the CKKS scheme. Working with the maintainers of each of the PALISADE, Microsoft SEAL, HElib, and HEAAN libraries, we devise methods for fair comparisons of these libraries, even with their widely varied development strategies and library architectures. To show the practical performance of these libraries, we present HEProfiler, a simple and extensible framework for profiling C++ FHE libraries. Our experimental evaluation is complete in both the scope of tasks tested and metrics evaluated, allowing us to draw conclusions about the behaviors of different libraries under a wide range of real-world workloads. This is the first work giving experimental comparisons of different bootstrapping-capable CKKS libraries.
On Central Primitives for Quantum Cryptography with Classical Communication
Recent work has introduced the "Quantum-Computation Classical-Communication" (QCCC) setting for cryptography (Chung et al.). There has been some evidence that One Way Puzzles (OWPuzz) are the natural central cryptographic primitive for this setting (Khurana and Tomer). For a primitive to be considered central, it should have several characteristics. It should be well behaved (which for this paper we will think of as having amplification, combiners, and universal constructions); it should be implied by a wide variety of other primitives; and it should be equivalent to some class of useful primitives. We present combiners, correctness and security amplification, and a universal construction for OWPuzz. Our proof of security amplification uses a new and cleaner construction of EFI from OWPuzz (in comparison to the result of Khurana and Tomer) that generalizes to weak OWPuzz and is the most technically involved section of the paper. It was previously known that OWPuzz are implied by other primitives of interest including commitments, symmetric key encryption, one way state generators (OWSG), and therefore pseudorandom states (PRS). However, we are able to rule out OWPuzz's equivalence to many of these primitives by showing a black box separation between general OWPuzz and a restricted class of OWPuzz (those with efficient verification, which we call EV-OWPuzz). We then show that EV-OWPuzz are also implied by most of these primitives, which separates them from OWPuzz as well. This separation also separates extending PRS from highly compressing PRS, answering an open question of Ananth et al.
Password-authenticated Key Exchange and Applications
We analyse two password-authenticated key exchange protocols: a variant of CPace, and a protocol related to the well-known SRP protocol. Our security results are tight. The first result gives us some information about trade-offs for design choices in CPace. The second result provides information about the security of SRP.
Our analysis is done in a new game-based security definition for password-authenticated key exchange. Our definition accommodates arbitrary password sampling methodologies. Our definition also supports modular security analysis, which we illustrate by giving two example applications of password-authenticated key exchange: password-authenticated secure channels and password-authenticated device authorisation, capturing popular applications of passwords.
Leakage Certification Made Simple
Side channel evaluations benefit from sound characterisations of adversarial leakage models, which are the determining factor for attack success. Two questions are of interest: can we define and estimate a quantity that captures the ideal adversary (who knows all the distributions that are involved in an attack), and can we define and estimate a quantity that captures a concrete adversary (represented by a given leakage model)?
Existing work has led to a proliferation of custom quantities to measure both types of adversaries, which can be data intensive to estimate in the ideal case, even for discrete side channels and especially when the number of dimensions in the side channel traces grows.
In this paper, we show how to define the mutual information between carefully chosen variables of interest and how to instantiate a recently suggested mutual information estimator for practical estimation. We apply our results to real-world data sets and are the first to provide a mutual information-based characterisation of ideal and concrete adversaries utilising up to 30 data points.
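The paper instantiates a specific recently suggested mutual information estimator; the generic plug-in estimator below only makes concrete what quantity is being estimated for discrete side channels. It is our own illustrative baseline, not the paper's estimator.
```python
import math
from collections import Counter

def plugin_mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples,
    using empirical frequencies for the joint and marginal
    distributions. In a side-channel setting, X might be a secret
    intermediate value and Y the observed leakage sample."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in pxy.items():
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi
```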
Shuffle Arguments Based on Subset-Checking
Zero-knowledge shuffle arguments are a useful tool for constructing mix-nets which enable anonymous communication. We propose a new shuffle argument using a novel technique that probabilistically checks that each weighted set of input elements corresponds to some weighted set of output elements, with weights from the same set as the input element weights. We achieve this using standard discrete log assumptions and the shortest integer solution (SIS) assumption. Our shuffle argument has prover and verifier complexity linear in the size of the shuffled set, and communication complexity logarithmic both in the shuffled set size and security parameter.
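The subset-checking technique itself is the paper's contribution; as a sketch of the basic randomized check that shuffle arguments build on, the following verifies multiset equality at a random challenge point, with soundness error about $n/|\mathbb{F}|$ by Schwartz-Zippel. The field choice and in-the-clear interface are ours.
```python
import secrets

P = 2**127 - 1  # Mersenne prime; stands in for the argument's field

def multisets_equal(xs, ys):
    """xs and ys (field elements) are equal as multisets iff the monic
    polynomials prod(X + x_i) and prod(X + y_i) coincide; evaluating
    both at a random point g catches any difference except with
    probability <= len(xs)/P. A shuffle argument proves this relation
    in zero knowledge over committed values instead of in the clear."""
    g = secrets.randbelow(P)
    lhs = rhs = 1
    for x in xs:
        lhs = lhs * ((x + g) % P) % P
    for y in ys:
        rhs = rhs * ((y + g) % P) % P
    return lhs == rhs
```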
Structured-Seed Local Pseudorandom Generators and their Applications
In this note, we introduce structured-seed local pseudorandom generators, a relaxation of local pseudorandom generators. We provide constructions of this primitive under the sparse-LPN assumption, and explore its implications.
Minimize the Randomness in Rasta-Like Designs: How Far Can We Go?
The Rasta design strategy allows building low-round ciphers due to its efficient prevention of statistical attacks and algebraic attacks by randomizing the cipher, which makes it especially suitable for hybrid homomorphic encryption (HHE), also known as transciphering. Such randomization is obtained by pseudorandomly sampling new invertible matrices for each round of each new cipher evaluation. However, naively sampling a random invertible matrix for each round significantly impacts the plain evaluation runtime, though it does not impact the homomorphic evaluation cost. To address this issue, Dasta was proposed at ToSC 2020 to reduce the cost of generating the random matrices.
In this work, we address this problem from a different perspective: How far can the randomness in Rasta-like designs be reduced in order to minimize the plain evaluation runtime without sacrificing the security? To answer this question, we carefully studied the main threats to Rasta-like ciphers and the role of random matrices in ensuring security. We apply our results to the recently proposed cipher $\text{PASTA}$, proposing a modified version called $\text{PASTA}_\text{v2}$ instantiated with one initial random matrix and fixed linear layers - obtained by combining two MDS matrices with the Kronecker product - for the other rounds.
Compared with $\text{PASTA}$, the state-of-the-art cipher for BGV- and BFV-style HHE, our evaluation shows that $\text{PASTA}_\text{v2}$ is up to 100% faster in plain evaluation, while having the same homomorphic runtime in the SEAL homomorphic encryption library and up to 30% faster evaluation time in HElib.
Enhancing Local Verification: Aggregate and Multi-Signature Schemes
An aggregate signature scheme is a digital signature protocol that enables the aggregation of multiple signatures. Given $n$ signatures on $n$ distinct messages from $n$ different users, it is possible to combine all these signatures into a single, concise signature. This single signature, along with the $n$ original messages, convinces the verifier that the $n$ users indeed signed their respective $n$ original messages. However, the verifier must have access to all the original messages to perform the verification, highlighting a potential limitation in terms of accessibility and efficiency. Goyal and Vaikuntanathan introduced the concept of local verification, which allows the verifier to determine if a specific message $m$ is part of the aggregated signature by only accessing the message $m$. In this paper, we extend the single-signer locally verifiable aggregate signature scheme initially proposed by Goyal and Vaikuntanathan, adapting it to a multi-signer context. Our generalization allows the verifier to validate multiple signatures simultaneously using an auxiliary value generated by the LocalOpen algorithm, thereby enhancing verification efficiency. Furthermore, we integrate this approach into the multi-signature scheme proposed by Boneh, Drijvers, and Neven, demonstrating its broader applicability and potential benefits in complex cryptographic systems.
Optimized Computation of the Jacobi Symbol
The Jacobi Symbol is an essential primitive in cryptographic applications such as primality testing, integer factorization, and various encryption schemes. By exploring the interdependencies among modular reductions within the algorithmic loop, we have developed a refined method that significantly enhances computational efficiency. Our optimized algorithm, implemented in the Rust language, achieves a performance increase of 72% over conventional textbook methods and is twice as fast as the previously fastest known Rust implementation.
This work not only provides a detailed analysis of the optimizations but also includes comprehensive benchmark comparisons to illustrate the practical advantages of our methods. Our algorithm is publicly available under an open-source license, promoting further research on foundational cryptographic optimizations.
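The optimizations are the paper's subject; for reference, the conventional textbook method it is benchmarked against is the binary Jacobi algorithm, sketched below.
```python
def jacobi(a, n):
    """Textbook Jacobi symbol (a|n) for odd n > 0, returning -1, 0,
    or 1. The modular reductions inside this loop are the part the
    paper restructures for speed."""
    assert n > 0 and n % 2 == 1
    a %= n
    t = 1
    while a != 0:
        while a % 2 == 0:              # pull out factors of two:
            a //= 2
            if n % 8 in (3, 5):        # (2|n) = -1 when n = 3, 5 mod 8
                t = -t
        a, n = n, a                    # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            t = -t
        a %= n
    return t if n == 1 else 0
```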
Auditable Attribute-Based Credentials Scheme and Its Application in Contact Tracing
During the pandemic, the limited functionality of existing privacy-preserving contact tracing systems highlights the need for new designs. Wang et al. proposed an environmental-adaptive framework (CSS '21) but failed to formalize the security. The similarity between their framework and attribute-based credentials (ABC) inspires us to reconsider contact tracing from the perspective of ABC schemes. In such schemes, users can obtain credentials on attributes from issuers and prove the credentials anonymously (i.e., hiding sensitive information of both user and issuer). This work first extends ABC schemes with auditability, which enables designated auditing authorities to revoke the anonymity of particular issuers. For this purpose, we propose an ``auditable public key (APK)'' mechanism that extends the updatable public key by Fauzi et al. (AsiaCrypt '19). We provide formal security definitions regarding auditability and build our auditable ABC scheme by adding a DDH-based APK to Connolly et al.'s ABC construction (PKC '22). Note that the APK mechanism can be used as a plug-in for other cryptographic primitives and may be of independent interest. Finally, regarding contact tracing, we refine Wang et al.'s framework and present a formal treatment that includes security definitions and protocol construction. An implementation is provided to showcase the practicality of our design.
Stochastic Secret Sharing with $1$-Bit Shares and Applications to MPC
The problem of minimizing the share size of threshold secret-sharing schemes is a basic research question that has been extensively studied. Ideally, one strives for schemes in which the share size equals the secret size. While this is achievable for large secrets (Shamir, CACM '79), no similar solutions are known for the case of binary, single-bit secrets. Current approaches often rely on so-called ramp secret sharing that achieves a constant share size at the expense of a slight gap between the privacy and the correctness thresholds. In the case of single-bit shares, this leads to a large gap which is typically unacceptable. The possibility of a meaningful notion of secret sharing scheme with 1-bit shares and almost optimal threshold has been left wide open. Of special interest is the case of threshold 0.5, which is motivated by information-theoretic honest-majority secure multiparty computation (MPC).
In this work, we present a new stochastic model for secret-sharing where each party is corrupted by the adversary with probability $p$, independently of the other parties, and correctness and privacy are required to hold with high probability over the choice of the corrupt parties. We present new secret sharing schemes with single-bit shares that tolerate any constant corruption probability $p<0.5$. Our construction is based on a novel connection between such stochastic secret-sharing schemes and error-correcting codes that achieve capacity over the binary erasure channel.
Our schemes are linear and multiplicative. We demonstrate the usefulness of the model by using our new schemes to construct MPC protocols with security against an adversary that passively corrupts an arbitrary subset of $0.499n$ of the parties, where the online communication per party consists of a single bit per AND gate and zero communication per XOR gate. Unlike competing approaches for communication-efficient MPC, our solution is applicable even in a real-time model in which the parties should compute a Boolean circuit whose gates arrive in real-time, one at a time, and are not known in advance.
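To make the stochastic corruption model concrete, a small Monte Carlo sketch (parameters are arbitrary illustrative choices): with independent corruption probability $p < 0.5$, the corrupt set stays below half the parties except with probability exponentially small in $n$, which is what lets correctness and privacy hold with high probability.
```python
import random

def fraction_of_bad_runs(n=1000, p=0.45, trials=500):
    """Estimate how often at least half of n parties end up corrupt when
    each is corrupted independently with probability p. By a Chernoff
    bound this frequency decays exponentially in n for any constant
    p < 0.5."""
    bad = 0
    for _ in range(trials):
        corrupt = sum(random.random() < p for _ in range(n))
        if corrupt >= n / 2:
            bad += 1
    return bad / trials
```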
On Sequential Functions and Fine-Grained Cryptography
A sequential function is, informally speaking, a function $f$ for which a massively parallel adversary cannot compute "substantially" faster than an honest user with limited parallel computation power. Sequential functions form the backbone of many primitives that are extensively used in blockchains such as verifiable delay functions (VDFs) and time-lock puzzles. Despite this widespread practical use, there has been little work studying the complexity or theory of sequential functions.
Our main result is a black-box oracle separation between sequential functions and one-way functions: in particular, we show the existence of an oracle $\mathcal{O}$ that implies a sequential function but not a one-way function. This seems surprising since sequential functions are typically constructed from very strong assumptions that imply one-way functions and also since time-lock puzzles are known to imply one-way functions (Bitansky et al., ITCS '16).
We continue our exploration of the theory of sequential functions. We show that, informally speaking, the decisional, worst-case variant of a certain class of sequential functions called continuous iterative sequential functions (CISFs) is PSPACE-complete. A CISF is, in a nutshell, a sequential function $f$ that can be written in the form $f \left(k, x \right) = g^{k} \left(x \right)$ for some function $g$, where $k$ is an input determining the number of "rounds" for which the function is evaluated. We then show that more general forms of sequential functions are not contained in PSPACE relative to a random oracle.
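For concreteness, the canonical candidate matching the CISF shape $f(k, x) = g^k(x)$ is iterated squaring modulo an RSA modulus, the function underlying classic time-lock puzzles; this example is ours, the paper treats CISFs abstractly.
```python
def iterated_squaring(k, x, N):
    """Candidate continuous iterative sequential function
    f(k, x) = g^k(x) with g(x) = x^2 mod N. Without the factorization
    of N, no shortcut is known and the k squarings are believed to be
    inherently sequential; with the factorization, one can jump ahead
    by reducing the exponent 2^k modulo phi(N)."""
    for _ in range(k):
        x = x * x % N
    return x
```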
Given these results, we then ask if it is possible to build any interesting cryptographic primitives from sequential functions that are not one-way. It turns out that even if we assume just the existence of a CISF that is not one-way, we can build certain "fine-grained" cryptographic primitives where security is defined similarly to traditional primitives with the exception that it is only guaranteed for some (generally polynomial) amount of time. In particular, we show how to build "fine-grained" symmetric key encryption and "fine-grained" MACs from a CISF. We also show how to build fine-grained public-key encryption from a VDF with a few extra natural properties and indistinguishability obfuscation (iO) for null circuits. We do not assume one-way functions. Finally, we define a primitive that we call a commutative sequential function - essentially a sequential function that can be computed in sequence to get the same output in two different ways - and show that it implies fine-grained key exchange.
The Sum-Check Protocol over Fields of Small Characteristic
The sum-check protocol of Lund, Fortnow, Karloff, and Nisan underlies SNARKs with the fastest known prover. In many of its applications, the prover can be implemented with a number of field operations that is linear in the number, $n$, of terms being summed.
We describe an optimized prover implementation when the protocol is applied over an extension field of a much smaller base field. The rough idea is to keep most of the prover's multiplications over the base field, at the cost of performing more $\textit{total}$ field multiplications. When the sum-check protocol is applied to a product of polynomials that all output values in the base field, our algorithm reduces the number of extension field operations by multiple orders of magnitude. In other settings, our improvements are more modest but nonetheless meaningful.
In SNARK design, the sum-check protocol is often combined with a polynomial commitment scheme, and these schemes are growing faster, especially when the values being committed are small. These improved commitment schemes are likely to render the sum-check prover the overall bottleneck, which our results help to mitigate.
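For orientation, here is a minimal honest-prover sum-check for a multilinear polynomial over a single prime field; the field choice and representation are ours, and the paper's optimization concerns scheduling these operations across a small base field and its extension, which this sketch does not attempt.
```python
import secrets

P = 2**61 - 1  # illustrative prime field

def sumcheck_multilinear(table):
    """Sum-check for the multilinear polynomial with evaluation table
    `table` on {0,1}^m (len(table) == 2**m, entries mod P), proving the
    value of sum(table). In each round the prover sends the linear
    polynomial s_i via (s_i(0), s_i(1)); the verifier checks
    s_i(0) + s_i(1) against the running claim, samples a challenge r,
    and both sides fix the current variable to r by folding the table."""
    claim = sum(table) % P
    while len(table) > 1:
        half = len(table) // 2
        s0, s1 = sum(table[:half]) % P, sum(table[half:]) % P
        assert (s0 + s1) % P == claim          # verifier's round check
        r = secrets.randbelow(P)               # verifier's challenge
        table = [((1 - r) * table[j] + r * table[half + j]) % P
                 for j in range(half)]         # fix variable to r
        claim = ((1 - r) * s0 + r * s1) % P    # new claim: s_i(r)
    assert table[0] == claim                   # final oracle check
    return claim
```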
Vision Mark-32: ZK-Friendly Hash Function Over Binary Tower Fields
Zero-knowledge proof systems are widely used in different applications on the Internet. Among zero-knowledge proof systems, SNARKs are a popular choice because of their fast verification time and small proof size. The efficiency of zero-knowledge systems is crucial for usability, resulting in the development of so-called arithmetization-oriented ciphers. In this work, we introduce Vision Mark-32, a modified instance of Vision defined over binary tower fields, with an optimized number of rounds and an efficient MDS matrix. We implement a fully-pipelined Vision Mark-32 permutation on Alveo U55C FPGA accelerator card and argue an order of magnitude better hardware efficiency compared to the popular Poseidon hash. Our fully-pipelined Vision Mark-32 implementation runs at 250 MHz and uses 398 kLUT and 104 kFF.
Lastly, we delineate how to implement each step efficiently in hardware.
Efficient Secret Sharing for Large-Scale Applications
Threshold secret sharing enables distributing a message to $n$ parties such that no subset of fewer than $t$ parties can learn the message, whereas any subset of at least $t$ parties can recover the message. Despite being a fundamental primitive, secret sharing still suffers from one significant drawback: its message reconstruction algorithm is computationally expensive for large privacy thresholds $t$. In this paper, we aim to address this drawback.
We study general $(t,c)$-ramp secret sharing schemes where the number of parties $c$ needed to reconstruct the secret may be larger than $t$. We present a ramp secret sharing scheme whose reconstruction time is $2$-$7.8\times$ faster than prior constructions secure against adversaries that adaptively corrupt parties. For $t = 2^{20}$, our new protocol has a reconstruction time of 5 seconds, whereas prior work requires nearly half a minute. We see improvements starting from as small as $t = 256$. Furthermore, we obtain a correctness threshold as small as $c \ge 1.05t$. To obtain our construction, we first improve the secret sharing frameworks from erasure codes by Cramer et al. (EUROCRYPT'15) and Applebaum et al. (CRYPTO'23). Our new framework obtains secret sharing schemes that may be used against adversaries with adaptive corruptions while requiring only weaker correctness guarantees from the underlying erasure code with a distributed generation property. Furthermore, our new framework also maintains the linear homomorphism of the prior works. Afterwards, we present a concretely efficient erasure code from random band matrices that satisfies the distributed generation property.
We show that our secret sharing scheme can improve many real-world applications. In secure aggregation protocols for federated learning, we obtain up to 22% reductions in computational cost by replacing Shamir's scheme with our construction. We extend our protocol to obtain a verifiable ramp secret sharing scheme where each party can verify the consistency of the shares. Our new verifiable ramp secret sharing has $8.2$-$25.2\times$ faster sharing and $2.7$-$23.2\times$ faster reconstruction time compared to prior works. Finally, we present an improved distributed verifiable random function that may be used for decentralized randomness beacons.
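As a baseline for the reconstruction costs discussed above, a minimal Shamir $(t,n)$ scheme: its Lagrange interpolation takes $O(t^2)$ field operations, which is the expense that motivates ramp schemes like this one. The field and interface are our illustrative choices.
```python
import secrets

P = 2**61 - 1  # illustrative prime field

def shamir_share(secret, t, n):
    """Shamir (t, n) sharing: sample a random degree-(t-1) polynomial f
    with f(0) = secret and hand party i the share (i, f(i))."""
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(t - 1)]
    def f(x):
        acc = 0
        for c in reversed(coeffs):     # Horner evaluation
            acc = (acc * x + c) % P
        return acc
    return [(i, f(i)) for i in range(1, n + 1)]

def shamir_reconstruct(shares):
    """Lagrange interpolation at 0 from t shares: O(t^2) field
    operations, the bottleneck for large thresholds."""
    secret = 0
    for xi, yi in shares:
        num = den = 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (P - xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret
```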
Arithmetisation of computation via polynomial semantics for first-order logic
We propose a compositional shallow translation from a first-order logic with equality, into polynomials; that is, we arithmetise the semantics of first-order logic. Using this, we can translate specifications of mathematically structured programming into polynomials, in a form amenable to succinct cryptographic verification. We give worked example applications, and we propose a proof-of-concept succinct verification scheme based on inner product arguments.
First-order logic is widely used, because it is simple, highly expressive, and has excellent mathematical properties. Thus a compositional shallow embedding into polynomials suggests a simple, high-level, yet potentially very practical new method for expressing verifiable computation.
Searching for differential addition chains
The literature sometimes uses slow algorithms to find minimum-length continued-fraction differential addition chains to speed up subsequent computations of multiples of points on elliptic curves. This paper introduces two faster algorithms to find these chains. The first algorithm prunes more effectively than previous algorithms. The second algorithm uses a meet-in-the-middle approach and appears to have a limiting cost exponent below 1.
Generalized Kotov-Ushakov Attack on Tropical Stickel Protocol Based on Modified Tropical Circulant Matrices
After the Kotov-Ushakov attack on the tropical implementation of Stickel protocol, various attempts have been made to create a secure variant of such implementation. Some of these attempts used a special class of commuting matrices resembling tropical circulants, and they have been proposed with claims of resilience against the Kotov-Ushakov attack, and even being potential post-quantum candidates. This paper, however, reveals that a form of the Kotov-Ushakov attack remains applicable and, moreover, there are heuristic implementations of that attack which have a polynomial time complexity and show an overwhelmingly good success rate.
Unconditionally secure MPC for Boolean circuits with constant online communication
Through tremendous efforts, the communication cost of secure multi-party computation (MPC) in the honest-majority setting has been significantly improved.
In particular, the state-of-the-art honest-majority MPC protocol by Escudero et al. (CCS'22) takes 12 field elements in total per multiplication gate for arithmetic circuits in the online phase. However, it still requires $12 \log(5n/4)$ bits of online communication per AND gate for Boolean circuits. That is, for Boolean circuits, no MPC protocol with constant online communication is known.
In this paper, we present an unconditionally secure MPC protocol for Boolean circuits in the honest-majority setting, which has constant online communication complexity and the offline communication complexity linear to the number $n$ of parties. We first describe the semi-honest MPC protocol and then show how to extend it to achieve malicious security, where the maliciously secure protocol has the same communication cost as the semi-honest protocol.
In particular, our protocol achieves the amortized communication cost $36$ bits per AND gate in the online phase, $18n+24$ bits per AND gate in the offline phase. For those circuit wires that require routing, each wire incurs a communication overhead of $15n$ bits in the offline phase.
PeaceFounder: centralised E2E verifiable evoting via pseudonym braiding and history trees
PeaceFounder is a centralised E2E verifiable e-voting system that leverages pseudonym braiding and history trees. The immutability of the bulletin board is maintained replication-free by voters’ client devices with locally stored consistency-proof chains. Meanwhile, pseudonym braiding done via an exponentiation mix before the vote allows anonymisation to be transactional with a single braider at a time. In contrast to existing E2E verifiable e-voting systems, it is much easier to deploy as the system is fully centralised, free from threshold decryption ceremonies, trusted setup phases and bulletin board replication. Furthermore, the body of a vote is signed with a braided pseudonym, enabling unlimited ballot types.
Reduction from Average-Case M-ISIS to Worst-Case CVP Over Perfect Lattices
This paper presents a novel reduction from the average-case hardness of the Module Inhomogeneous Short Integer Solution (M-ISIS) problem to the worst-case hardness of the Closest Vector Problem (CVP) by defining and leveraging “perfect” lattices for cryptographic purposes.
Perfect lattices, previously only theoretical constructs, are characterized by their highly regular structure, optimal density, and a central void, which we term the “Origin Cell.” The simplest Origin Cell is a hypercube with edge length 1 centered at the origin, guaranteed to be devoid of any valid lattice points.
By exploiting the unique properties of the Origin Cell, we recalibrate the parameters of the M-ISIS and CVP problems. Our results demonstrate that solving M-ISIS on average over perfect lattices is at least as hard as solving CVP in the worst case, thereby providing a robust hardness guarantee for M-ISIS. Additionally, perfect lattices facilitate exceptionally compact cryptographic variables, enhancing the efficiency of cryptographic schemes.
This significant finding enhances the theoretical foundation of lattice-based cryptographic problems and confirms the potential of perfect lattices in ensuring strong cryptographic security. The Appendix includes SageMath code to demonstrate the reproducibility of the reduction process from M-ISIS to CVP.
Space-Efficient and Noise-Robust Quantum Factoring
We provide two improvements to Regev's quantum factoring algorithm (arXiv:2308.06572), addressing its space efficiency and its noise-tolerance.
Our first contribution is to improve the quantum space efficiency of Regev's algorithm while keeping the circuit size the same. Our main result constructs a quantum factoring circuit using $O(n \log n)$ qubits and $O(n^{3/2} \log n)$ gates. We achieve the best of Shor and Regev (up to a logarithmic factor in the space complexity): on the one hand, Regev's circuit requires $O(n^{3/2})$ qubits and $O(n^{3/2} \log n)$ gates, while Shor's circuit requires $O(n^2 \log n)$ gates but only $O(n)$ qubits. As with Regev, to factor an $n$-bit integer $N$, we run our circuit independently $\approx \sqrt{n}$ times and apply Regev's classical postprocessing procedure.
Our optimization is achieved by implementing efficient and reversible exponentiation with Fibonacci numbers in the exponent, rather than the usual powers of 2, adapting work by Kaliski (arXiv:1711.02491) from the classical reversible setting to the quantum setting. This technique also allows us to perform quantum modular exponentiation that is efficient in both space and size without requiring significant precomputation, a result that may be useful for other quantum algorithms. A key ingredient of our exponentiation implementation is an efficient circuit for a function resembling in-place quantum-quantum modular multiplication. This implementation works with only black-box access to any quantum circuit for out-of-place modular multiplication, which we believe is yet another result of potentially broader interest.
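A classical sketch of the arithmetic pattern this exploits (ours, for intuition only): exponents drawn from the Fibonacci sequence can be reached purely by multiplying two carried values, with no squarings, and multiplying distinct registers is friendlier to reversible, in-place implementation than squaring.
```python
def fibonacci_power(a, k, N):
    """Compute a^(F_k) mod N, where F_k is the k-th Fibonacci number,
    by holding the pair (a^(F_{i-1}), a^(F_i)) and advancing with one
    multiplication per step; no squarings are ever performed."""
    x, y = 1, a % N                   # a^(F_0), a^(F_1)
    for _ in range(k - 1):
        x, y = y, x * y % N           # a^(F_{i+1}) = a^(F_{i-1}) * a^(F_i)
    return y if k >= 1 else x
```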
Our second contribution is to show that Regev's classical postprocessing procedure can be modified to tolerate a constant fraction of the quantum circuit runs being corrupted by errors. In contrast, Regev's analysis of his classical postprocessing procedure requires all $\approx \sqrt{n}$ runs to be successful. In a nutshell, we achieve this using lattice reduction techniques to detect and filter out corrupt samples.
Encryption Based Covert Channel for Large Language Models
Transformer neural networks have gained significant traction since their introduction, becoming pivotal across diverse domains. Particularly in large language models like Claude and ChatGPT, the transformer architecture has demonstrated remarkable efficacy. This paper provides a concise overview of transformer neural networks and delves into their security considerations, focusing on covert channel attacks and their implications for the safety of large language models. We present a covert channel utilizing encryption and demonstrate its efficacy in circumventing Claude.ai's security measures. Our experiment reveals that Claude.ai appears to log our queries and blocks our attack within two days of our initial successful breach. This raises two concerns within the community: (1) The extensive logging of user inputs by large language models could pose privacy risks for users. (2) It may deter academic research on the security of such models due to the lack of experiment repeatability.
Improved Rectangle Attacks on SKINNY and CRAFT
The boomerang and rectangle attacks are adaptations of differential cryptanalysis that regard the target cipher $E$ as a composition of two sub-ciphers, i.e., $E = E_{1}\circ E_{0}$, to construct a distinguisher for $E$ with probability $p^{2}q^{2}$ by concatenating two short differential trails for $E_{0}$ and $E_{1}$ with probability $p$ and $q$ respectively. According to previous research, the dependency between these two differential characteristics has a great impact on the probability of boomerang and rectangle distinguishers. Dunkelman et al. proposed the sandwich attack to formalise such dependency, regarding $E$ as three parts, i.e., $E = E_{1}\circ E_{m}\circ E_{0}$, where $E_{m}$ contains the dependency between the two differential trails and satisfies some differential propagation with probability $r$. Accordingly, the entire probability is $p^{2}q^{2}r$. Recently, Song et al. have proposed a general framework to identify the actual boundaries of $E_{m}$ and systematically evaluate the probability of $E_{m}$ with any number of rounds, and applied their method to accurately evaluate the probabilities of the best boomerang distinguishers for SKINNY.
In this paper, using a more advanced method to search for boomerang distinguishers, we show that the best previous boomerang distinguishers for SKINNY can be significantly improved in terms of probability and number of rounds. More precisely, we propose related-tweakey boomerang distinguishers for up to 19, 21, 23, and 25 rounds of SKINNY-64-128, SKINNY-128-256, SKINNY-64-192 and SKINNY-128-384 respectively, which improve the previous boomerang distinguishers of these variants of SKINNY by 1, 2, 1, and 1 round respectively. Based on the improved boomerang distinguishers for SKINNY, we provide related-tweakey rectangle attacks on 23 rounds of SKINNY-64-128, 24 rounds of SKINNY-128-256, 29 rounds of SKINNY-64-192, and 30 rounds of SKINNY-128-384. It is worth noting that our improved related-tweakey rectangle attacks on SKINNY-64-192, SKINNY-128-256 and SKINNY-128-384 can be directly applied to the same number of rounds of ForkSkinny-64-192, ForkSkinny-128-256 and ForkSkinny-128-384 respectively.
CRAFT is another SKINNY-like tweakable block cipher, for which we provide the first security analysis against rectangle attacks. As a result, we provide a 14-round boomerang distinguisher for CRAFT in the single-tweak model, based on which we propose a single-tweak rectangle attack on 18 rounds of this cipher. Moreover, following previous research on evaluating the switching effect over multiple rounds of boomerang distinguishers, we also introduce new tools called the Double Boomerang Connectivity Table $\tt{DBCT}$, $\tt{LBCT}^{\scriptsize{=}|}$, and $\tt{UBCT}^{\vDash}$ to evaluate the boomerang switch through multiple rounds more accurately.
Reading It like an Open Book: Single-trace Blind Side-channel Attacks on Garbled Circuit Frameworks
Garbled circuits (GC) are a secure multiparty computation protocol that enables two parties to jointly compute a function using their private data without revealing it to each other. While garbled circuits are proven secure at the protocol level, implementations can still be vulnerable to side-channel attacks. Recently, side-channel analysis of GC implementations has garnered significant interest from researchers.
We investigate popular open-source GC frameworks and discover that the AES encryption used in the garbling process follows a secret-dependent sequence. This vulnerability allows private inputs to be exposed through side-channel analysis. Based on this finding, we propose a side-channel attack on garbled circuits to recover the private inputs of both parties. Our attack does not require access to any plaintexts or ciphertexts in the protocol and is single-trace, adhering to the constraint that a garbled circuit can be executed only once. Furthermore, unlike existing attacks that can only target input non-XOR gates, our method applies to both input and internal non-XOR gates. Consequently, the secrets associated with every non-XOR gate are fully exposed as in an open book.
We comprehensively evaluate our attack in various scenarios. First, we perform the attack on single-platform software implementations of standard AES and interleaved AES on a 32-bit ARM processor, achieving a $100\%$ success rate in both cases. Next, we target a hardware implementation on a Xilinx Artix-7 FPGA, where the resolution of power consumption measurements and the number of samples are significantly limited. In this scenario, our attack achieves a success rate of $79.58\%$. Finally, we perform a cross-platform attack on two processors with different microarchitectures representing the two parties. The differing execution cycles and power sensors across the platforms increase the difficulty of side-channel analysis. Despite these challenges, our point-of-interest (POI) selection method allows our attack to achieve a $100\%$ success rate in this scenario as well. We also discuss effective countermeasures that can be readily applied to GC frameworks to mitigate this vulnerability.
Adaptively Secure 5 Round Threshold Signatures from MLWE/MSIS and DL with Rewinding
$T$-out-of-$N$ threshold signatures have recently seen renewed interest, with various types now available, each offering different tradeoffs. However, one property that has remained elusive is adaptive security. When we target thresholdizing existing efficient signature schemes based on the Fiat-Shamir paradigm, such as Schnorr, the elusive nature becomes clear. This class of signature schemes typically relies on the forking lemma to prove unforgeability: an adversary is rewound and run twice within the security game. Such a proof is at odds with adaptive security, as the reduction must be ready to answer a total of $2(T-1)$ secret key share queries, implying that it can reconstruct the full secret key. Indeed, prior works either assumed strong idealized models such as the algebraic group model (AGM) or modified the underlying signature scheme so as not to rely on rewinding-based proofs.
In this work, we propose a new proof technique to construct adaptively secure threshold signatures for existing rewinding-based Fiat-Shamir signatures. As a result, we obtain the following:
1. The first adaptively secure 5 round lattice-based threshold signature under the MLWE and MSIS assumptions in the ROM. The resulting signature is a standard Raccoon signature, a lattice-based signature scheme by del Pino et al. submitted to the additional NIST call for proposals.
2. The first adaptively secure 5 round threshold signature under the DL assumption in the ROM. The resulting signature is a standard Schnorr signature. To the best of our knowledge, this is the first adaptively secure threshold signature based on DL even assuming stronger models like AGM.
Our work is inspired by the recent statically secure lattice-based 3 round threshold signature by del Pino et al. (Eurocrypt 2024) based on Raccoon. While they relied on so-called one-time additive masks to solve lattice-specific issues, we notice that these masks can also be a useful tool to achieve adaptive security. At a very high level, we use these masks throughout the signing protocol to carefully control the information the adversary can learn from the signing transcripts. Intuitively, this allows the reduction to return a total of $2(T-1)$ randomly sampled secret key shares to the adversary consistently and without being detected, resolving the above paradoxical situation. Lastly, by allowing the parties to maintain a simple state, we can compress our 5 round schemes into 4 rounds.
Threshold OPRF from Threshold Additive HE
An oblivious pseudorandom function (OPRF) is a two-party protocol in which one party holds an input and the other party holds the PRF key, such that the party having the input only learns the PRF output and the party having the key does not learn the input. In a threshold oblivious pseudorandom function (TOPRF) protocol, a PRF key $K$ is initially shared among $T$ servers. A client can obtain a PRF value by interacting with $t$ ($\leq T$) servers but is unable to compute the same with up to $t-1$ servers. In this paper, we present a practically efficient homomorphic encryption (HE)-based post-quantum secure TOPRF protocol. Our proposed approach, which is based on a novel use of threshold HE, is agnostic of the underlying PRF and outperforms existing fully homomorphic encryption (FHE)-based approaches for TOPRF computation by several orders of magnitude in terms of running time. The FHE-based approaches require bootstrapping, a computationally extensive operation that is the primary bottleneck for evaluating large-depth circuits. In contrast, our approach is based on a multi-party computation (MPC) protocol that uses a threshold additive HE scheme based on Regev's cryptosystem (J. ACM 2009) as an alternative to FHE. Concretely, we show a novel replacement of the bootstrapping required in traditional FHE schemes by a threshold additive HE-based interactive protocol that performs masked decryption followed by table look-ups, jointly performed by a group of servers holding secret shares of the HE decryption key. Finally, we present a practical validation of our approach by realizing an AES-based TOPRF with an evaluation time of less than 1 second on consumer-grade server(s).
Combined Threshold Implementation
Physical security is an important aspect of devices for which an adversary can manipulate the physical execution environment. Recently, more and more attention has been directed towards a security model that combines the capabilities of passive and active physical attacks, i.e., an adversary that performs fault-injection and side-channel analysis at the same time. Implementing countermeasures against such a powerful adversary is not only costly but also requires the skillful combination of masking and redundancy to counteract all reciprocal effects.
In this work, we propose a new methodology to generate combined-secure circuits. We show how to transform TI-like constructions to resist any adversary with the capability to tamper with internal gates and probe internal wires. For the resulting protection scheme, we can prove the combined security in a well-established theoretical security model.
Since the transformation preserves the advantages of TI-like structures, the resulting circuits prove to be more efficient in the number of required bits of randomness (up to 100%), the latency in clock cycles (up to 40%), and even the area for pipelined designs (up to 40%) than the state of the art for an adversary restricted to manipulating a single gate and probing a single wire.
SACfe: Secure Access Control in Functional Encryption with Unbounded Data
Privacy is a major concern in large-scale digital applications, such as cloud computing, machine learning services, and access control. Users want to protect not only their plain data but also their associated attributes (e.g., age, location, etc.). Functional encryption (FE) is a cryptographic tool that allows fine-grained access control over encrypted data. However, existing FE schemes fall short: they are either inefficient and far from practical, or they leak sensitive user-specific information.
We propose SACfe, a novel attribute-based FE scheme that provides secure, fine-grained access control and hides both the user’s attributes and the function applied to the data, while preserving the data’s confidentiality. Moreover, it enables users to encrypt unbounded-length messages along with an arbitrary number of hidden attributes into ciphertexts. We design SACfe, a protocol for performing linear computation on encrypted data while enforcing access control based on inner product predicates. We show how SACfe can be used for online biometric authentication for privacy-preserving access control. As an additional contribution, we introduce an attribute-based linear FE for unbounded length of messages and functions where access control is realized by monotone span programs. We implement our protocols using the CiFEr cryptographic library and show its efficiency for practical settings.
Oblivious Single Access Machines: A New Model for Oblivious Computation
Oblivious RAM (ORAM) allows a client to securely outsource memory storage to an untrusted server. It has been shown that no ORAM can simultaneously achieve small bandwidth blow-up, small client storage, and a single roundtrip of latency.
We consider a weakening of the RAM model, which we call the Single Access Machine (SAM) model. In the SAM model, each memory slot can be written to at most once and read from at most once. We adapt existing tree-based ORAM to obtain an oblivious SAM (OSAM) that has $O(\log n)$ bandwidth blow-up (which we show is optimal), small client storage, and a single roundtrip.
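The SAM restriction is simple to state in code; below is a plaintext sketch of the interface (the paper's contribution is making this memory oblivious, which the sketch does not attempt).
```python
class SingleAccessMemory:
    """Single Access Machine (SAM) memory: each slot may be written at
    most once and read at most once. This write-once/read-once
    discipline is the weakening of RAM that the paper's oblivious SAM
    construction exploits."""
    def __init__(self):
        self._slots = {}

    def write(self, addr, value):
        if addr in self._slots:
            raise ValueError(f"slot {addr} already written")
        self._slots[addr] = value

    def read(self, addr):
        if addr not in self._slots:
            raise KeyError(f"slot {addr} never written or already read")
        return self._slots.pop(addr)   # reading consumes the slot
```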
OSAM unlocks improvements to oblivious data structures/algorithms. For instance, we achieve oblivious unbalanced binary trees (e.g. tries, splay trees). By leveraging splay trees, we obtain a notion of caching ORAM, where an access in the worst case incurs amortized $O(\log^2 n)$ bandwidth blow-up and $O(\log n)$ roundtrips, but in many common cases (e.g. sequential scans) incurs only amortized $O(\log n)$ bandwidth blow-up and $O(1)$ roundtrips. We also give new oblivious graph algorithms, including computing minimum spanning trees and single source shortest paths, in which the OSAM client reads/writes $O(|E| \cdot \log |E|)$ words using $O(|E|)$ roundtrips, where $|E|$ is the number of edges. This improves over prior custom solutions by a log factor.
At a higher level, OSAM provides a general model for oblivious computation. We construct a programming interface around OSAM that supports arbitrary pointer-manipulating programs such that dereferencing a pointer to an object incurs $O(\log d \log n)$ bandwidth blow-up and $O(\log d)$ roundtrips, where $d$ is the number of pointers to that object. This new interface captures a wide variety of data structures and algorithms (e.g., trees, tries, doubly-linked lists) while matching or exceeding prior best asymptotic results. It both unifies much of our understanding of oblivious computation and allows the programmer to write oblivious algorithms combining various common data structures/algorithms and beyond.
FASIL: A challenge-based framework for secure and privacy-preserving federated learning
Enhancing privacy in federated learning (FL) without considering robustness can open the door to attacks such as poisoning attacks on the FL process. Thus, addressing both the privacy and security aspects simultaneously becomes vital. Although a few solutions addressing both privacy and security have appeared in the literature in recent years, they have drawbacks such as requiring two non-colluding servers, heavy cryptographic operations, or a peer-to-peer communication topology. In this paper, we introduce a novel framework that allows the server to run analyses for the detection and mitigation of attacks on the FL process, while satisfying the confidentiality requirements for the training data against the server. We evaluate the effectiveness of the framework in terms of security and privacy by performing experiments on concrete examples. We also provide two instantiations of the framework with two different secure aggregation protocols to give a more concrete view of how the framework works, and we analyse its computation and communication overhead.
MaSTer: Maliciously Secure Truncation for Replicated Secret Sharing without Pre-Processing
Secure multi-party computation (MPC) in a three-party, honest majority scenario is currently the state-of-the-art for running machine learning algorithms in a privacy-preserving manner. For efficiency reasons, fixed-point arithmetic is widely used to approximate computation over decimal numbers. After multiplication in fixed-point arithmetic, truncation is required to keep the result's precision. In this paper, we present an efficient three-party truncation protocol secure in the presence of an active adversary without pre-processing and improve on the current state-of-the-art in MPC over rings using replicated secret sharing (RSS). By adding an efficient consistency check, we lift the efficient but only passively secure three-party truncation protocol from the ABY3 framework by Mohassel and Rindal into the malicious setting without pre-processed data. Our benchmark indicates performance improvements of an order of magnitude in the offline phase for a single batch training. Finally, we apply our protocol to a real-world application for diagnostic prediction based on publicly available ECG heartbeat data. We achieve an improvement by a factor of two in the total throughput for both LAN and WAN settings.
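As background for why truncation is needed at all, here is a plaintext Python sketch (ours, with illustrative parameters; the protocol performs this step on secret-shared values): multiplying two fixed-point values of scale $2^f$ yields a product of scale $2^{2f}$ that must be shifted back down.

```python
# Plaintext illustration of fixed-point multiplication and truncation.
# In the MPC protocol this truncation happens on secret-shared values;
# here we only show the arithmetic that makes it necessary.

F = 16                      # number of fractional bits (illustrative)
SCALE = 1 << F

def encode(x: float) -> int:
    return round(x * SCALE)

def decode(v: int) -> float:
    return v / SCALE

a, b = encode(1.5), encode(2.25)
prod = a * b                # scale is now 2^(2F), not 2^F
trunc = prod >> F           # truncate to restore scale 2^F
print(decode(trunc))        # 3.375
```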
Polynomial sharings on two secrets: Buy one, get one free
While passive side-channel attacks and active fault attacks have been studied intensively in the last few decades, strong attackers combining these attacks have only been studied relatively recently. Due to its simplicity, most countermeasures against passive attacks are based on additive sharing. Unfortunately, extending these countermeasures against faults often leads to quite a significant performance penalty, either due to the use of expensive cryptographic operations or a large number of shares due to massive duplication. Recently, Berndt, Eisenbarth, Gourjon, Faust, Orlt, and Seker therefore proposed to use polynomial sharing against combined attackers (CRYPTO 2023). While they construct gadgets secure against combined attackers using only a linear number of shares, the overhead introduced might still be too large for practical scenarios.
In this work, we show how the overhead of nearly all known constructions using polynomial sharing can be reduced by nearly half by embedding two secrets in the coefficients of one polynomial at the expense of increasing the degree of the polynomial by one. We present a very general framework that allows adapting these constructions to this new sharing scheme and prove the security of this approach against purely passive side-channel attacks, purely active fault attacks, and combined attacks. Furthermore, we present new gadgets allowing us to operate upon the different secrets in a number of useful ways.
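The packing idea can be illustrated with a toy Python sketch; the placement of the secrets in the two lowest coefficients and the field choice below are our own illustrative assumptions, and the paper's gadgets and security proofs are considerably more involved:

```python
# Toy sketch of packing two secrets into one polynomial sharing: where a
# single-secret sharing would use a degree-d polynomial, a degree-(d+1)
# polynomial stores secrets in coefficients a_0 and a_1, so one set of
# shares covers both. Illustrative parameters only; not secure as written.
import secrets as rnd

P = 2**61 - 1               # prime field (2^61 - 1 is a Mersenne prime)

def share(s0, s1, d, n):
    """n shares of a degree-(d+1) polynomial with a_0 = s0, a_1 = s1."""
    coeffs = [s0 % P, s1 % P] + [rnd.randbelow(P) for _ in range(d)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def lagrange_at_zero(points):
    """Evaluate the interpolating polynomial of the points at x = 0."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def reconstruct(shares, d):
    pts = shares[:d + 2]                   # degree-(d+1) poly: d+2 points
    a0 = lagrange_at_zero(pts)
    # q(x) = (p(x) - a0)/x has degree d and q(0) = a_1:
    qpts = [(x, (y - a0) * pow(x, P - 2, P) % P) for x, y in pts[:d + 1]]
    return a0, lagrange_at_zero(qpts)

sh = share(111, 222, d=2, n=6)
assert reconstruct(sh, d=2) == (111, 222)
```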
FABESA: Fast (and Anonymous) Attribute-Based Encryption under Standard Assumption
Attribute-Based Encryption (ABE) provides fine-grained access control to encrypted data and finds applications in various domains. The practicality of ABE schemes hinges on the balance between security and efficiency. The state-of-the-art adaptively secure ABE scheme, proven secure under standard assumptions (FAME, CCS'17), is less efficient than the fastest one (FABEO, CCS'22), which is only proven secure under the Generic Group Model (GGM). These traditional ABE schemes focus solely on message privacy. To address scenarios where attribute value information is also sensitive, Anonymous ABE (${\rm A}^{\rm 2}$BE) ensures the privacy of both the message and the attributes. However, most ${\rm A}^{\rm 2}$BE schemes suffer from intricate designs with low efficiency, and the security of the fastest key-policy ${\rm A}^{\rm 2}$BE (proposed in FEASE, USENIX'24) relies on the GGM.
In this paper, we propose novel fast key-policy and ciphertext-policy ABE schemes that (1) support both AND and OR gates for access policies, (2) have no restriction on the size and type of policies or attributes, (3) achieve adaptive security under the standard DLIN assumption, and (4) only need 4 pairings for decryption. As our ABE constructions automatically provide ciphertext anonymity, we easily transform our ABE schemes to ${\rm A}^{\rm 2}$BE schemes while maintaining the same features and high-level efficiency.
The implementation results show that all our schemes achieve the best efficiency compared to other schemes with adaptive security proven under standard assumptions. Specifically, our ABE schemes perform better than FAME and are close to FABEO. Our key-policy ${\rm A}^{\rm 2}$BE scheme performs close to the one in FEASE, and our ciphertext-policy ${\rm A}^{\rm 2}$BE outperforms the state-of-the-art (Cui et al., ProvSec'16).
History-Free Sequential Aggregation of Hash-and-Sign Signatures
A sequential aggregate signature (SAS) scheme allows multiple users to sequentially combine their respective signatures in order to reduce communication costs. Historically, early proposals required the use of trapdoor permutations (e.g., RSA). In recent years, a number of attempts have been made to extend SAS schemes to post-quantum assumptions. Many post-quantum signatures have been proposed in the hash-and-sign paradigm, which requires the use of trapdoor functions and appears to be an ideal candidate for sequential aggregation attempts. However, the hardness of achieving post-quantum one-way permutations makes it difficult to obtain similarly general constructions. Direct attempts at generalizing permutation-based schemes have been proposed, but they either lack formal security or require additional properties of the trapdoor function, which are typically not available for multivariate or code-based functions. In this paper, we propose a (partial-signature) history-free SAS within the probabilistic hash-and-sign with retry paradigm, generalizing existing techniques to generic trapdoor functions. We prove the security of our scheme in the random oracle model, and we instantiate our construction with three post-quantum schemes, comparing their compression capabilities. Finally, we discuss how direct extensions of permutation-based SAS schemes are not possible without additional properties, showing the lack of security of two existing multivariate schemes.
Constant-Size Unbounded Multi-Hop Fully Homomorphic Proxy Re-Encryption from Lattices
Proxy re-encryption is a cryptosystem that achieves efficient encrypted data sharing by allowing a proxy to transform a ciphertext encrypted under one key into another ciphertext under a different key. Homomorphic proxy re-encryption (HPRE) extends this concept by integrating homomorphic encryption, allowing not only the sharing of encrypted data but also homomorphic computations on such data. Existing HPRE schemes, however, are limited to a single or a bounded number of hops of ciphertext re-encryptions. To address this limitation, this paper introduces a novel lattice-based, unbounded multi-hop fully homomorphic proxy re-encryption (FHPRE) scheme with constant-size ciphertexts. Our FHPRE scheme supports an unbounded number of re-encryption operations and enables arbitrary homomorphic computations over original, re-encrypted, and evaluated ciphertexts. Additionally, we propose a potential application of our FHPRE scheme in the form of a non-interactive, constant-size multi-user computation system for cloud computing environments.
Gap MCSP is not (Levin) NP-complete in Obfustopia
We demonstrate that under believable cryptographic hardness assumptions, Gap versions of standard meta-complexity problems, such as the Minimum Circuit Size Problem (MCSP) and the Minimum Time-Bounded Kolmogorov Complexity Problem (MKTP), are not NP-complete w.r.t. Levin (i.e., witness-preserving many-to-one) reductions.
In more detail:
- Assuming the existence of indistinguishability obfuscation, and subexponentially-secure one-way functions, an appropriate Gap version of MCSP is not NP-complete under randomized Levin-reductions.
- Assuming the existence of subexponentially-secure indistinguishability obfuscation, subexponentially-secure one-way functions and injective PRGs, an appropriate Gap version of MKTP is not NP-complete under randomized Levin-reductions.
Nakamoto Consensus under Bounded Processing Capacity
For Nakamoto's longest-chain consensus protocol, whose proof-of-work (PoW) and proof-of-stake (PoS) variants power major blockchains such as Bitcoin and Cardano, we revisit the classic problem of the security--performance tradeoff: Given a network of nodes with finite communication and computation resources, against what fraction of adversary power is Nakamoto consensus (NC) secure for a given block production rate? State-of-the-art analyses of NC fail to answer this question, because their bounded-delay model does not capture the rate limits to nodes' processing of blocks, which cause congestion when blocks are released in quick succession. We develop a new analysis technique to prove a refined security--performance tradeoff for PoW NC in a bounded-capacity model. In this model, we show that, in contrast to the classic bounded-delay model, Nakamoto's private attack is no longer the worst attack, and a new attack we call the teasing strategy, which exploits congestion, is strictly worse. In PoS, equivocating blocks can exacerbate congestion, making traditional PoS NC insecure except at very low block production rates. To counter such equivocation spamming, we present a variant of PoS NC we call Blanking NC (BlaNC), which achieves the same resilience as PoW NC.
Security-Performance Tradeoff in DAG-based Proof-of-Work Blockchain Protocols
Proof-of-work (PoW) blockchain protocols based on directed acyclic graphs (DAGs) have demonstrated superior transaction confirmation performance compared to their chain-based predecessors. However, it is uncertain whether their security deteriorates in high-throughput settings as their predecessors' does, because their acceptance of simultaneous blocks and complex block dependencies present challenges for rigorous security analysis.
We address these challenges by analyzing DAG-based protocols via a congestible blockchain model (CBM), a general model that allows case-by-case upper bounds on the block propagation delay, rather than a uniform upper bound as in most previous analyses. CBM allows us to capture two key phenomena of high-throughput settings: (1) simultaneous blocks increase each other's propagation delay, and (2) a block can be processed only after receiving all the blocks it refers to. We further devise a reasonable adversarial block propagation strategy in CBM, called the late-predecessor attack, which exploits block dependencies to delay the processing of honest blocks. We then evaluate the security and performance of Prism and OHIE, two DAG-based protocols that aim to break the security-performance tradeoff, in the presence of an attacker capable of launching the late-predecessor attack. Our results show that these protocols suffer from reduced security and extended latency in high-throughput settings, similarly to their chain-based predecessors.
chainBoost: A Secure Performance Booster for Blockchain-based Resource Markets
Cryptocurrencies and blockchain technology provide an innovative model for reshaping digital services. Driven by the movement toward Web 3.0, recent systems have started to provide distributed services, such as computation outsourcing or file storage, on top of the currency exchange medium. By allowing anyone to join and collect cryptocurrency payments for serving others, these systems create decentralized markets for trading digital resources. Yet, there is still a big gap between the promise of these markets and their practical viability. Existing initiatives are still early-stage and have already encountered security and efficiency obstacles. At the same time, existing work around promising ideas, specifically sidechains, falls short of exploiting their full potential in addressing these problems.
To bridge this gap, we propose chainBoost, a secure performance booster for decentralized resource markets. It expedites service-related operations, reduces the blockchain size, and supports flexible service-payment exchange modalities at low overhead. At its core, chainBoost employs a sidechain that has a (security and semantic) mutual dependence with the mainchain, to which the system offloads heavy/frequent operations. To enable it, we develop a novel sidechain architecture composed of temporary and permanent blocks, a block suppression mechanism to prune the sidechain, a syncing protocol to permit arbitrary data exchange between the two chains, and an autorecovery protocol to support robustness and resilience. We analyze the security of chainBoost, and implement a proof-of-concept prototype for a distributed file storage market as a use case. For a market handling around 2000 transactions per round, our experiments show up to 11x improvement in throughput and 94% reduction in confirmation time. They also show that chainBoost can reduce the main blockchain size by about 90%, and that it outperforms comparable optimistic rollup solutions by reducing transaction finality by 99.7%.
Exploiting Clock-Slew Dependent Variability in CMOS Digital Circuits Towards Power and EM SCA Resilience
Mathematically secured cryptographic implementations leak critical information in terms of power, EM emanations, etc. Several circuit-level countermeasures have been proposed to hinder side-channel leakage at the source. Circuit-level countermeasures (e.g., IVR, STELLAR, WDDL, etc.) are often preferred as they are generic and have low overhead. They either dither the voltage randomly or attenuate the meaningful signature at the $V_{DD}$ port. Although any digital implementation has two generic ports, namely clock and $V_{DD}$, circuit-level countermeasures primarily focus on the $V_{DD}$ port, and countermeasures using the clock remain largely unexplored. System-level clock randomization is ineffective due to post-processing techniques. This work, for the first time, presents a clock-based countermeasure: a controlled slew that exploits the inherent variability of digital circuits in terms of power consumption and transforms power/EM emanations into a complex function of data and slew. Due to this, the minimum traces-to-disclosure (MTD) improves by 100$\times$ with respect to the unprotected implementation.
Moreover, the slewed clock reduces the leaky frequency, and the clock randomization countermeasure becomes more effective because post-processing in the frequency domain becomes more difficult. Clock slew and randomization together have a cumulative effect ($1800\times$) greater than the product of the individual techniques ($100\times$ and $5\times$, respectively). In brief, this paper presents a clock-level, generic, synthesizable countermeasure technique that improves the minimum-traces-to-disclosure (MTD) by 1800$\times$ while incurring only 11% area overhead, $<3\%$ power overhead (measured), and $<6\%$ performance overhead (measured). Moreover, it can easily be combined with other power-port-based mitigation techniques for enhanced security.
Sparsity-Aware Protocol for ZK-friendly ML Models: Shedding Lights on Practical ZKML
As deep learning is being widely adopted across various domains, ensuring the integrity of models has become increasingly crucial. Despite recent advances in Zero-Knowledge Machine Learning (ZKML) techniques, proving the inference over large ML models is still prohibitive. To enable practical ZKML, model simplification techniques like pruning and quantization should be applied without hesitation. Contrary to conventional belief, recent developments in the ML space have demonstrated that these simplification techniques not only condense complex models into forms with sparse, low-bit weight matrices, but also maintain exceptionally high model accuracy that matches that of their unsimplified counterparts.
While such transformed models seem inherently ZK-friendly, directly applying existing ZK proof frameworks still leads to suboptimal inference proving performance. To make ZKML truly practical, a quantization-and-pruning-aware ZKML framework is needed. In this paper, we propose SpaGKR, a novel sparsity-aware ZKML framework that is proven to surpass the capabilities of existing ZKML methods. SpaGKR is a general framework that is widely applicable to any computation structure where sparsity arises. It is designed to be modular: all existing GKR-based ZKML frameworks can be seamlessly integrated with it to obtain remarkable compounding performance enhancements. We tailor SpaGKR specifically to the most commonly used neural network structure, the linear layer, and propose the SpaGKR-LS protocol, which achieves asymptotically optimal prover time. Notably, when applying SpaGKR-LS to a special series of simplified models, ternary networks, it achieves further efficiency gains by additionally leveraging the low-bit nature of the model parameters.
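As rough intuition for why sparsity helps (our illustration; SpaGKR's GKR-based prover is far more subtle), a pruned linear layer stored in compressed sparse row (CSR) form performs work proportional to the number of nonzeros rather than to the full matrix dimensions:

```python
# Rough intuition for sparsity-aware cost: a pruned linear layer in CSR
# form does work proportional to the number of nonzeros (nnz), not to
# rows*cols; an nnz-proportional cost profile is what a sparsity-aware
# prover aims for. Illustrative plaintext computation only.
def csr_matvec(values, col_idx, row_ptr, x):
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]   # one multiply per nonzero
        y.append(acc)
    return y

# 3x4 matrix with only 4 nonzeros:
# [[2, 0, 0, 1],
#  [0, 0, 3, 0],
#  [0, 4, 0, 0]]
values, col_idx, row_ptr = [2, 1, 3, 4], [0, 3, 2, 1], [0, 2, 3, 4]
print(csr_matvec(values, col_idx, row_ptr, [1, 1, 1, 1]))  # [3, 3, 4]
```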
Chaghri --- an FHE-friendly Block Cipher
Recent progress in practical applications of secure computation protocols has also attracted attention to the symmetric-key primitives underlying them. Whereas traditional ciphers have evolved to be efficient with respect to certain performance metrics, advanced cryptographic protocols call for a different focus: the so-called arithmetic complexity, viewed through the number and layout of non-linear operations in the circuit implemented by the protocol. Symmetric-key algorithms optimized with respect to this metric are said to be algebraic ciphers. Previous work targeting ZK and MPC protocols delivered great improvements in the performance of these applications, both in the lab and in practical use. Interestingly, despite its apparent benefits to privacy-aware cloud computing, algebraic cipher design targeting FHE has not attracted similar attention.
In this paper we present Chaghri, an FHE-friendly block cipher enabling efficient transciphering in BGV-like schemes. A complete Chaghri circuit can be implemented using only 16 multiplications, 32 Frobenius automorphisms and 32 rotations, all arranged in a depth-32 circuit. Our HElib implementation achieves a throughput of 0.26 seconds per bit, which is 65% faster than AES in the same setting.
Accelerating pairings on BW10 and BW14 Curves
Since the advent of pairing-based cryptography, many researchers have developed techniques and variants of pairings to optimize the speed of pairing computations. The selection of the elliptic curve for a given pairing-based protocol is crucial for operations in the first and second pairing groups of points of the elliptic curve and for many cryptographic schemes. A new variant of the superoptimal pairing, the x-superoptimal pairing, was proposed in 2023 on curves with odd prime embedding degrees BW13-310 and BW19-286. This paper extends the definition of the x-superoptimal pairing to elliptic curves with even embedding degrees BW10-511 and BW14-351 at the 128-bit security level. We provide a suitable formula for the x-superoptimal pairing on BW10-511 and BW14-351, where the Miller loop is about $13.5\%$ and $21.6\%$ faster than the optimal ate pairing on BW10-511 and BW14-351, respectively. The correctness and bilinearity of the x-superoptimal pairing on BW10-511 and BW14-351 have been verified with Magma code.
Expediting Homomorphic Computation via Multiplicative Complexity-aware Multiplicative Depth Minimization
Fully homomorphic encryption (FHE) enables secure data processing without compromising data access, but its computational cost and slower execution compared to plaintext operations pose challenges. The growing interest in FHE-based secure computation necessitates the acceleration of homomorphic computations. While existing research primarily targets the reduction of the multiplicative depth (MD) of homomorphic circuits, this paper addresses the trade-off between MD reduction and the increase in multiplicative complexity (MC), a critical gap often overlooked during circuit optimization and potentially resulting in suboptimal outcomes. Three contributions are presented: (a) an exact synthesis paradigm for optimal homomorphic circuit implementations, (b) an efficient heuristic algorithm named MC-aware MD minimization, and (c) a homomorphic circuit optimization flow combining MC-aware MD minimization with existing MD reduction techniques. Experimental results demonstrate a 21.32% average reduction in homomorphic computation time and showcase significantly improved efficiency in circuit optimization.
Fast Blind Rotation for Bootstrapping FHEs
Blind rotation is one of the key techniques used to construct fully homomorphic encryption schemes whose best known bootstrapping algorithms run in less than one second. Currently, the two main approaches for realizing blind rotation, namely AP and GINX, were first introduced by Alperin-Sheriff and Peikert (CRYPTO 2014) and by Gama, Izabachene, Nguyen and Xie (EUROCRYPT 2016), respectively.
In this paper, we propose a new blind rotation algorithm based on a GSW-like encryption from the NTRU assumption. Our algorithm has performance asymptotically independent of the key distribution, and outperforms AP and GINX in both evaluation key size and computational efficiency (especially for large key distributions). Using our blind rotation algorithm as a building block, we present new bootstrapping algorithms for both LWE and RLWE ciphertexts.
We implement our bootstrapping algorithm for LWE ciphertexts and compare its actual performance with two bootstrapping algorithms implemented in the OpenFHE library, namely FHEW/AP by Ducas and Micciancio (EUROCRYPT 2015) and TFHE/GINX by Chillotti, Gama, Georgieva and Izabach\`ene (Journal of Cryptology 2020). For parameters with a ternary key distribution at 128-bit security, our bootstrapping only needs to store an evaluation key of size 18.65MB for blind rotation, which is about 89.8 times smaller than FHEW/AP and 2.9 times smaller than TFHE/GINX. Moreover, our bootstrapping can be done in 112ms on a laptop, which is about 3.2 times faster than FHEW/AP and 2.1 times faster than TFHE/GINX. Further improvements are available for large key distributions such as Gaussian distributions.
Dishonest Majority Multi-Verifier Zero-Knowledge Proofs for Any Constant Fraction of Corrupted Verifiers
In this work we study the efficiency of Zero-Knowledge (ZK) arguments of knowledge, particularly exploring Multi-Verifier ZK (MVZK) protocols as a midway point between Non-Interactive ZK and Designated-Verifier ZK, offering versatile applications across various domains. We introduce a new MVZK protocol designed for the preprocessing model, allowing any constant fraction of verifiers to be corrupted, potentially colluding with the prover. Our contributions include the first MVZK over rings. Unlike recent prior works on fields in the dishonest majority case, our protocol demonstrates communication complexity independent of the number of verifiers, contrasting the linear complexity of previous approaches. This key advancement ensures improved scalability and efficiency. We provide an end-to-end implementation of our protocol. The benchmark shows that it achieves a throughput of 1.47 million gates per second for 64 verifiers with $50\%$ corruption, and 0.88 million gates per second with $75\%$ corruption.
Tempora-Fusion: Time-Lock Puzzle with Efficient Verifiable Homomorphic Linear Combination
To securely transmit sensitive information into the future, Time-Lock Puzzles (TLPs) have been developed. Their applications include scheduled payments, timed commitments, e-voting, and sealed-bid auctions. Homomorphic TLP is a key variant of TLP that enables computation on puzzles from different clients. This allows a solver/server to tackle only a single puzzle encoding the computation's result. However, existing homomorphic TLPs lack support for verifying the correctness of the computation results. We address this limitation by introducing Tempora-Fusion, a TLP that allows a server to perform homomorphic linear combinations of puzzles from different clients while ensuring verification of computation correctness. This scheme avoids asymmetric-key cryptography for verification, thus paving the way for efficient implementations. We discuss our scheme's application in various domains, such as federated learning, scheduled payments in online banking, and e-voting.
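For readers new to the primitive, here is a sketch of the classic RSW time-lock puzzle that TLPs build on, with toy parameters of our choosing; Tempora-Fusion adds homomorphic combination of puzzles and verification on top of this kind of construction:

```python
# Sketch of the classic RSW time-lock puzzle: the creator uses the
# factorization trapdoor to shortcut a^(2^T) mod n, while the solver must
# perform T sequential squarings. Toy parameters only; real deployments
# use 1024-bit-plus moduli.

def setup(T, p, q):                     # puzzle creator knows p and q
    n, phi = p * q, (p - 1) * (q - 1)
    a = 2
    e = pow(2, T, phi)                  # fast: reduce the exponent mod phi(n)
    return n, a, pow(a, e, n)           # a^(2^T) mod n via the trapdoor

def solve(n, a, T):                     # solver: T sequential squarings
    v = a
    for _ in range(T):
        v = v * v % n
    return v

p, q, T = 104729, 1299709, 10000        # small known primes, toy delay
n, a, key = setup(T, p, q)
assert solve(n, a, T) == key            # slow path matches trapdoor path
```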
PT-symmetric mapping of three states and its implementation on a cloud quantum processor
Recently, PT-symmetric systems have garnered significant attention due to their unconventional properties. Despite the growing interest, there remains an ongoing debate about whether these systems can outperform their Hermitian counterparts in practical applications and, if so, by what metrics this performance should be measured. To address this, we developed a novel PT-symmetric approach for mapping N = 3 pure qubit states, implemented it using the dilation method, and demonstrated it on a superconducting quantum processor from the IBM Quantum Experience. For the first time, we derived exact expressions for the population of the post-selected PT-symmetric subspace for both N = 2 and N = 3 states. When applied to the discrimination of N = 2 pure states, our algorithm provides a result equivalent to conventional unambiguous quantum state discrimination. For N = 3 states, our approach introduces novel capabilities not available in traditional Hermitian systems, enabling the transformation of an arbitrary set of three quantum states into another, at the cost of introducing an inconclusive outcome. Our algorithm has the same error rate for the attack on the three-state QKD protocol as the conventional minimum error, maximum confidence, and maximum mutual information strategies. For post-selection quantum metrology, our results provide precise conditions where PT-symmetric quantum sensors outperform their Hermitian counterparts in terms of information-cost rate. Combined with punctuated unstructured quantum database search, our method significantly reduces the qubit readout requirements at the cost of adding an ancilla, while maintaining the same average number of oracle calls as the original punctuated Grover's algorithm. This provides significant advantages for NISQ-era computers. Our work opens new pathways for applying PT symmetry in quantum communications, computing, and cryptography.
A Theoretical Take on a Practical Consensus Protocol
The Asynchronous Common Subset (ACS) problem is a fundamental problem in distributed computing. Very recently, Das et al. (2024) developed a new ACS protocol with several desirable properties: (i) it provides optimal resilience, tolerating up to $t < n/3$ corrupt parties out of $n$ parties in total, (ii) it does not rely on a trusted setup, (iii) it utilizes only "lightweight" cryptography, which can be instantiated using just a hash function, and (iv) it has expected round complexity $O(1)$ and expected communication complexity $O(\kappa n^3)$, where $\kappa$ is the output length of the hash function. The purpose of this paper is to give a detailed, self-contained exposition and analysis of this protocol from the point of view of modern theoretical cryptography, fleshing out a number of details of the definitions and proofs, providing a complete security analysis based on concrete security assumptions on the hash function (i.e., without relying on random oracles), and developing all of the underlying theory in the universal composability framework.
Cross-chain bridges via backwards-compatible SNARKs
In recent years, SNARKs have shown great promise as a tool for building trustless bridges to connect the heterogeneous ecosystem of blockchains. Unfortunately, the parameters hardwired for many of the widely used blockchains are incongruous with the conventional SNARKs, which results in unsatisfactory performance. This bottleneck necessitates new proof systems tailored for efficiency in these environments.
The primary focus of this paper is on succinct bridges from Cosmos to Ethereum, which largely boils down to efficient proofs of multiple Ed25519 signatures. However, these techniques can be ported to settings that require succinct proofs of multiple secp256k1 or BLS12-381 signatures.
We describe our succinct validity bridging scheme Overfly, which uses a field-agnostic SNARK to circumvent the huge overhead of non-native field arithmetic arising from Ed25519 scalar multiplications in the circuit. We also explore the schemes deVirgo and zkTree, which exploit the parallelization of proof generation and the subsequent aggregation of proofs.
Our benchmarks indicate that it is crucial to sidestep non-native arithmetic to the extent that it is possible. We also found that two or more proof systems need to be securely amalgamated to optimize a succinct validity bridging scheme.
On vectorial functions mapping strict affine subspaces of their domain into strict affine subspaces of their co-domain, and the strong D-property
Given three positive integers $n<N$ and $M$, we study those vectorial Boolean $(N,M)$-functions $\mathcal{F}$ which map an $n$-dimensional affine space $A$ into an $m$-dimensional affine space where $m<M$ and possibly $m=n$. This provides $(n,m)$-functions $\mathcal{F}_A$ as restrictions of $\mathcal{F}$. We show that the nonlinearity of $\mathcal{F}$ must not be too large for this to be possible, and we observe that if it is zero, then it is always possible. In this case, we show that the nonlinearity of the restriction may be large.
We then focus on the case $M=N$ and $\mathcal{F}$ of the form $\psi(\mathcal{G}(x))$ where $\mathcal{G}$ is almost perfect nonlinear (APN) and $\psi$ is a linear function with a kernel of dimension $1.$ We observe that the problem of determining the D-property of APN $(N-1,N)$-functions $\mathcal{G}_A$, where $A$ is a hyperplane, is related to the problem of constructing APN $(N-1,N-1)$-functions $\mathcal{F}_A$. For this reason, we introduce the strong D-property defined for $(N,N)$-functions $\mathcal{G}$. We give a characterization of this property for crooked functions and their compositional inverse (if it exists) by means of their ortho-derivatives, and we prove that the Gold APN function in odd dimension $N$ big enough has the strong D-property. We also prove, in a simpler way than Taniguchi in 2023, that the strong D-property of the Gold APN function holds for even $N$ big enough. Then we give a partial result on the Dobbertin APN power function, and on the basis of this result, we conjecture that it has the strong D-property as well.
We then move our focus to two known infinite families of differentially 4-uniform $(N-1,N-1)$-permutations constructed as the restrictions of $(N,N)$-functions $\mathcal{F}(x)=\psi(\mathcal{G}(x))$ or $\mathcal{F}(x)=\psi(\mathcal{G}(x))+x$ where $\psi$ is linear with a kernel of dimension $1$ and $\mathcal{G}$ is an APN permutation. After a deeper investigation of these classes, we provide proofs (which were missing) that they are not APN in even dimension $n=N-1$. Then we present our own construction by relaxing some hypotheses on $\psi$ and $\mathcal{G}$.
An algebraic attack for forging signatures of MPPK/DS
We give an algebraic attack to forge the signature of a scheme called MPPK/DS, which can be achieved by solving a linear system in 5 variables with coefficients in $\mathbb{Z}/2^x q \mathbb{Z}$ for some odd prime $q$ and $x \geq 1$.
Delegated-Query Oblivious Transfer and its Practical Applications
Databases play a pivotal role in the contemporary World Wide Web and the world of cloud computing. Unfortunately, numerous privacy violations have recently garnered attention in the news. To enhance database privacy, we consider Oblivious Transfer (OT), an elegant cryptographic technology. Our observation reveals that existing research in this domain primarily concentrates on theoretical cryptographic applications, overlooking various practical aspects:
- OTs assume parties have direct access to databases. Our "1-out-of-2 Delegated-Query OT" enables parties to privately query a database, without direct access.
- With the rise of cloud computing, physically separated databases may no longer remain so. Our "1-out-of-2 Delegated-Query Multi-Receiver OT" protects privacy in such evolving scenarios.
- Research often ignores the limitations of thin clients, e.g., Internet of Things devices. To address this, we propose a compiler that transforms any 1-out-of-n OT into a thin client version.
zkVoting : Zero-knowledge proof based coercion-resistant and E2E verifiable e-voting system
We introduce ${zkVoting}$, a coercion-resistant e-voting system that utilizes a fake keys approach based on a novel nullifiable commitment scheme. This scheme allows voters to receive both real and fake commitment keys from a registrar. Each ballot includes this commitment, but only the tallier can efficiently discern the fake ballots, simplifying the tally process to $\mathcal{O}(n)$ and ensuring coercion resistance. ${zkVoting}$ also preserves voter anonymity by ensuring each ballot conceals the voter's identity. Additionally, by integrating zero-knowledge proofs, ${zkVoting}$ achieves end-to-end (E2E) verifiability. We formally prove its security and demonstrate its practicality for real-world applications, with a ballot casting time of 2.3 seconds and a tally time of 3.9 milliseconds per ballot.
Guidance for Efficient Selection of Secure Parameters for Fully Homomorphic Encryption
The field of Fully Homomorphic Encryption (FHE) has seen many theoretical and computational advances in recent years, bringing the technology closer to practicality than ever before. For this reason, practitioners from neighbouring fields such as machine learning have sought to understand FHE to provide privacy to their work. Unfortunately, selecting secure and efficient parameters in FHE is a daunting task due to the many interdependencies between the parameters involved. In this work, we solve this problem by moving away from the standard parameter selection procedure, introducing formulas which provide secure and optimal parameters for any lattice-based scheme. We build our formulas from a strong theoretical foundation based on cryptanalysis against LWE.
Examining the Practical Side Channel Resilience of ARX-boxes
Implementations of ARX ciphers are hoped to have some intrinsic side-channel resilience owing to the specific choice of cipher components: modular addition (A), rotation (R), and exclusive-or (X). Previous work has contributed to this understanding by developing theory regarding the side-channel resilience of components (pioneered by the early works of Prouff), as well as through some more recent practical investigations by Biryukov et al. that focused on lightweight cipher constructions. We add to this work by studying ARX-boxes both mathematically and practically. Our results show that previous works' reliance on the simplistic assumption that intermediates independently leak (their Hamming weight) has led to the incorrect conclusion that the modular addition is necessarily the best target and that ARX constructions are therefore harder to attack in practice: we show that on an ARM M0, the best practical target is the exclusive-or, and attacks succeed with only tens of traces.
In addition, we also provide results suggesting that the modular addition may also be a vulnerable target when partition based distinguishers are applied in side channel attacks.
Plan your defense: A comparative analysis of leakage detection methods on RISC-V cores
Hardening microprocessors against side-channel attacks is a critical aspect of ensuring their security. A key step in this process is identifying and mitigating "leaky" hardware modules, which leak information during the execution of cryptographic algorithms.
In this paper, we explore how different leakage detection methods, the Side-channel Vulnerability Factor (SVF) and the Test Vector Leakage Assessment (TVLA), contribute to hardening of microprocessors. We conduct experiments on two RISC-V cores, SHAKTI and Ibex, using two cryptographic algorithms, SHA-3 and AES. Our findings suggest that SVF and TVLA can provide valuable insights into identifying leaky modules. However, the effectiveness of these methods can vary depending on the specific core and cryptographic algorithm in use.
We conclude that the choice of leakage detection method should be based not only on computational cost but also on the specific requirements of the system, the implementation of the algorithm examined and the nature of the potential threats.
Optimizing and Implementing Fischlin's Transform for UC-Secure Zero-Knowledge
Fischlin's transform (CRYPTO 2005) is an alternative to the Fiat-Shamir transform that enables straight-line extraction when proving knowledge. In this work we focus on the problem of using the Fischlin transform to construct UC-secure zero-knowledge from Sigma protocols, since UC security -- which guarantees security under general concurrent composition -- requires straight-line (non-rewinding) simulators. We provide a slightly simplified transform that is much easier to understand, and present algorithmic and implementation optimizations that significantly improve the running time. It appears that the main obstacles to the use of Fischlin in practice are its computational cost and implementation complexity (with multiple parameters that need to be chosen). We provide clear guidelines and a simple methodology for choosing parameters, and show that with our optimizations the running time is far lower than expected. For just one example, on a 2023 MacBook, the cost of proving knowledge of a discrete log with Fischlin is only 0.41ms (on a single core). This is 15 times slower than plain Fiat-Shamir on the same machine, a significant multiple but objectively insignificant in many applications. We also extend the transform so that it can be applied to batch proofs, and show how this can be much more efficient than individually proving each statement. We hope that this paper will both encourage and help practitioners implement the Fischlin transform where relevant.
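To give a flavor of the transform, here is a heavily simplified Python sketch of Fischlin applied to the Schnorr Sigma protocol for discrete-log knowledge. The toy group, the parameter choices, and the resampling of all repetitions when no challenge works (in place of Fischlin's max-sum condition) are our own illustrative simplifications, not the paper's optimized construction:

```python
# Simplified Fischlin-style transform over Schnorr: for each repetition,
# the prover searches for a challenge whose hash has B zero bits, giving
# straight-line extraction via a proof-of-work-like condition.
# Toy, insecure parameters for illustration only.
import hashlib, secrets

p, q, g = 227, 113, 4        # toy group: g has prime order q in Z_p*
R, T, B = 8, 6, 3            # repetitions, challenge bits, zero bits

def H(*vals):
    data = b"|".join(str(v).encode() for v in vals)
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def prove(x, pk):
    while True:                          # retry everything if a slot fails
        ks = [secrets.randbelow(q) for _ in range(R)]
        As = tuple(pow(g, k, p) for k in ks)   # all commitments, hashed together
        proof = []
        for i, k in enumerate(ks):
            hit = next((e for e in range(1 << T)
                        if H(pk, As, i, e, (k + e * x) % q) % (1 << B) == 0),
                       None)
            if hit is None:
                break                    # rare: no challenge works, resample
            proof.append((As[i], hit, (k + hit * x) % q))
        if len(proof) == R:
            return proof

def verify(pk, proof):
    As = tuple(a for a, _, _ in proof)
    return len(proof) == R and all(
        pow(g, z, p) == a * pow(pk, e, p) % p and   # Schnorr equation
        H(pk, As, i, e, z) % (1 << B) == 0          # proof-of-work condition
        for i, (a, e, z) in enumerate(proof))

x = secrets.randbelow(q)
pk = pow(g, x, p)
assert verify(pk, prove(x, pk))
```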
Pairing-Free Blind Signatures from CDH Assumptions
We present the first concurrently-secure blind signatures making black-box use of a pairing-free group for which unforgeability, in the random oracle model, can be proved {\em without} relying on the algebraic group model (AGM), thus resolving a long-standing open question. Prior pairing-free blind signatures without AGM proofs have only been proved secure for bounded concurrency, relied on computationally expensive non-black-box use of NIZKs, or had complexity growing with the number of signing sessions due to the use of boosting techniques.
Our most efficient constructions rely on the chosen-target CDH assumption and can be seen as blind versions of signatures by Goh and Jarecki (EUROCRYPT '03) and Chevallier-Mames (CRYPTO '05). We also give a less efficient scheme with security based on (plain) CDH. The underlying signing protocols consist of four (in order to achieve regular unforgeability) or five moves (for strong unforgeability). All schemes are proved statistically blind in the random oracle model.
Measuring Conditional Anonymity - A Global Study
The realm of digital health is experiencing a global surge, with mobile applications extending their reach into various facets of daily life. From tracking daily eating habits and vital functions to monitoring sleep patterns and even the menstrual cycle, these apps have become ubiquitous in their pursuit of comprehensive health insights.
Many of these apps collect sensitive data and promise users to protect their privacy - often through pseudonymization. We analyze the real anonymity that users can expect from this approach and report on our findings. More concretely:
1. We introduce the notion of conditional anonymity sets derived from statistical properties of the population.
2. We measure anonymity sets for two real-world applications and present overarching findings from 39 countries.
3. We develop a graphical tool for people to explore their own anonymity set.
One of our case studies is a popular app for tracking the menstrual cycle. Our findings for this app show that, despite its promise to protect privacy, the collected data can be used to identify users up to groups of 5 people in 97% of all US counties, allowing the de-anonymization of individuals. Given that the US Supreme Court recently overturned abortion rights, the possibility of determining individuals is a calamity.
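As a minimal sketch of the measurement idea (ours; the paper's statistical methodology is richer), a user's conditional anonymity set can be estimated by counting how many people in a population share the attribute values an app collects:

```python
# Minimal sketch of estimating a conditional anonymity set: count how many
# people in a population share the attribute values an app discloses.
# The records and attribute names below are made up for illustration.
from collections import Counter

population = [
    {"county": "A", "age_band": "20-29", "cycle_len": 28},
    {"county": "A", "age_band": "20-29", "cycle_len": 28},
    {"county": "A", "age_band": "30-39", "cycle_len": 30},
    {"county": "B", "age_band": "20-29", "cycle_len": 28},
]

def anonymity_set_size(person, disclosed):
    counts = Counter(tuple(r[a] for a in disclosed) for r in population)
    return counts[tuple(person[a] for a in disclosed)]

user = population[0]
# Disclosing more attributes shrinks the set the user hides in:
print(anonymity_set_size(user, ["county"]))                          # 3
print(anonymity_set_size(user, ["county", "age_band", "cycle_len"])) # 2
```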
PIR with compressed queries and amortized query processing
Private information retrieval (PIR) is a key building block in many privacy-preserving systems. Unfortunately, existing constructions remain very expensive. This paper introduces two techniques that make the computational variant of PIR (CPIR) more efficient in practice. The first technique targets a recent class of CPU-efficient CPIR protocols where the query sent by the client contains a number of ciphertexts proportional to the size of the database. We show how to compress this query, achieving size reductions of up to 274×.
The second technique is a new data encoding called probabilistic batch codes (PBCs). We use PBCs to build a multi-query PIR scheme that allows the server to amortize its computational cost when processing a batch of requests from the same client. This technique achieves up to 40× speedup over processing queries one at a time, and is significantly more efficient than related encodings. We apply our techniques to the Pung private communication system, which relies on a custom multi-query CPIR protocol for its privacy guarantees. By porting our techniques to Pung, we find that we can simultaneously reduce network costs by 36× and increase throughput by 3×.
Proxying is Enough: Security of Proxying in TLS Oracles and AEAD Context Unforgeability
TLS oracles allow a TLS client to offer selective data provenance to an external (oracle) node such that the oracle node is ensured that the data is indeed coming from a pre-defined TLS server. Typically, the client/user supplies their credentials to the server and reveals selective data using zero-knowledge proofs to demonstrate certain server-offered information to oracles while ensuring the secrecy of the rest of the TLS transcript. Conceptually, this is a standard three-party secure computation between the TLS server, the TLS client (prover), and the oracle (verifier) node; however, the key practical requirement for TLS oracles is that the data provenance process remain transparent to the TLS server. Recent TLS oracle protocols such as DECO enforce the communication pattern of server-client-verifier and utilize a novel three-party handshake process during TLS to ensure data integrity against potential tampering by the client. However, this approach introduces a significant performance penalty on the client/prover and the verifier. This raises the question of whether it is possible to reduce the overhead by putting the verifier (as a proxy) between the server and the client such that the correct TLS transcript is available to the verifier.
This work offers both positive and negative answers to this oracle proxy question: We first formalize the oracle proxy notion that allows the verifier to directly proxy client-server TLS communication, without entering a three-party handshake or interfering with the connection in any way. We then show that for common TLS-based higher-level protocols such as HTTPS, data integrity to the verifier proxy is ensured by the variable padding built into the HTTP protocol semantics. On the other hand, if a TLS-based protocol comes without variable padding, we demonstrate that data integrity cannot be guaranteed. In this context, we then study the case where the TLS response is pre-determined and cannot be tampered with during the connection. We propose the concept of context unforgeability and show that it allows overcoming this impossibility. We further show that, in the standard model, ChaCha20-Poly1305 satisfies the concept while AES-GCM does not.
Sustained Space and Cumulative Complexity Trade-offs for Data-Dependent Memory-Hard Functions
Memory-hard functions (MHFs) are a useful cryptographic primitive which can be used to design egalitarian proof-of-work puzzles and to protect low-entropy secrets like passwords against brute-force attackers. Intuitively, a memory-hard function is a function whose evaluation costs are dominated by memory costs even if the attacker uses specialized hardware (FPGAs/ASICs), and several cost metrics have been proposed to quantify this intuition. For example, space-time cost looks at the product of running time and the maximum space usage over the entire execution of an algorithm. Alwen and Serbinenko (STOC 2015) observed that the space-time cost of evaluating a function multiple times may not scale linearly in the number of instances being evaluated and introduced the stricter requirement that a memory-hard function has high cumulative memory complexity (CMC) to ensure that an attacker's amortized space-time costs remain large even if the attacker evaluates the function on multiple different inputs in parallel.
Alwen et al. (EUROCRYPT 2018) observed that the notion of CMC still gives the attacker undesirable flexibility in selecting space-time tradeoffs, e.g., while the MHF scrypt has maximal CMC $\Omega(N^2)$, an attacker could evaluate the function with constant $O(1)$ memory in time $O(N^2)$. Alwen et al. introduced an even stricter notion of sustained space complexity and designed an MHF with sustained space $s=\Omega(N/\log N)$ for $t=\Omega(N)$ steps, i.e., any algorithm evaluating the function in the parallel random oracle model must have at least $t=\Omega(N)$ steps during which the memory usage is at least $\Omega(N/\log N)$.
In this work, we use dynamic pebbling games and dynamic graphs to explore tradeoffs between sustained space complexity and cumulative memory complexity for data-dependent memory-hard functions such as Argon2id and scrypt. We design our own dynamic graph (dMHF) with the property that {\em any} dynamic pebbling strategy either (1) has $\Omega(N)$ rounds with $\Omega(N)$ space, or (2) has CMC $\Omega(N^{3-\epsilon})$ --- substantially larger than $N^2$. For Argon2id we show that {\em any} dynamic pebbling strategy either (1) has $\Omega(N)$ rounds with $\Omega(N^{1-\epsilon})$ space, or (2) has CMC $\omega(N^2)$. We also present a dynamic version of DRSample (Alwen et al. 2017) for which {\em any} dynamic pebbling strategy either (1) has $\Omega(N)$ rounds with $\Omega(N/\log N)$ space, or (2) has CMC $\Omega(N^3/\log N)$.
Limits on the Power of Prime-Order Groups: Separating Q-Type from Static Assumptions
Subgroup decision techniques on cryptographic groups and pairings have been critical for numerous applications. Originally conceived in the composite-order setting, there is a large body of work showing how to instantiate subgroup decision techniques in the prime-order setting as well. In this work, we demonstrate the first barrier to this research program, by exhibiting an important setting where composite-order techniques cannot be replicated in the prime-order setting.
In particular, we focus on the case of $q$-type assumptions, which are ubiquitous in group- and pairing-based cryptography, but unfortunately are less desirable than the more well-understood static assumptions. Subgroup decision techniques have had great success in removing $q$-type assumptions, even allowing $q$-type assumptions to be generically based on static assumptions in composite-order groups. Our main result shows that the same likely does not hold in the prime-order setting. Namely, we show that a large class of $q$-type assumptions, including the security definitions of a number of cryptosystems, cannot be proven secure in a black-box way from any static assumption.
Securing Lightning Channels against Rational Miners
Payment channel networks (e.g., the Lightning Network in Bitcoin) constitute one of the most popular scalability solutions for blockchains. Their safety relies on parties being online to detect fraud attempts on-chain and being able to timely react by publishing certain transactions on-chain. However, a cheating party may bribe miners in order to censor those transactions, resulting in loss of funds for the cheated party: these attacks are known in the literature as timelock bribing attacks. In this work, we present the first channel construction that does not require parties to be online and, at the same time, is resistant to timelock bribing attacks.
We start by proving for the first time that Lightning channels are secure against timelock bribing attacks in the presence of rational channel parties under the assumption that these parties constantly monitor the mempool and never deplete the channel in one direction. The latter underscores the importance of keeping a coin reserve in each channel as implemented in the Lightning Network, albeit for different reasons. We show, however, that the security of the Lightning Network against Byzantine channel parties does not carry over to a setting in which miners are rational and accept timelock bribes.
Next, we introduce CRAB, the first Lightning-compatible channel construction that provides security against Byzantine channel parties and rational miners. CRAB leverages miners' incentives to safeguard the channel, thereby also forgoing the unrealistic assumption of channel parties constantly monitoring the mempool.
Finally, we show how our construction can be refined to eliminate the major assumption behind payment channels, i.e., the need for online participation. To that end, we present Sleepy CRAB, the first provably secure channel construction under rational miners that enables participants to go offline indefinitely. We also provide a proof-of-concept implementation of Sleepy CRAB and evaluate its cost in Bitcoin, thereby demonstrating its practicality.
A Survey of Polynomial Multiplications for Lattice-Based Cryptosystems
We survey various mathematical tools used in software implementations of polynomial multiplication in
\[
\frac{\mathbb{Z}_q[x]}{\left\langle {x^n - \alpha x - \beta} \right\rangle}.
\]
In particular, we survey implementation works targeting polynomial multiplications in lattice-based cryptosystems Dilithium, Kyber, NTRU, NTRU Prime, and Saber with instruction set architectures/extensions Armv7-M, Armv7E-M, Armv8-A, and AVX2.
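As a baseline for what the surveyed techniques accelerate, here is a schoolbook Python sketch (ours) of multiplication in this quotient ring, using the reduction rule $x^n \equiv \alpha x + \beta$:

```python
# Schoolbook multiplication in Z_q[x]/(x^n - alpha*x - beta): multiply,
# then eliminate degrees >= n via x^d = alpha*x^(d-n+1) + beta*x^(d-n).
# The surveyed FFT/Karatsuba/Toom techniques replace the O(n^2) product.
def polymul_mod(f, g, n, q, alpha, beta):
    h = [0] * (2 * n - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] = (h[i + j] + fi * gj) % q
    for d in range(2 * n - 2, n - 1, -1):   # fold high degrees down
        c, h[d] = h[d], 0
        h[d - n + 1] = (h[d - n + 1] + c * alpha) % q
        h[d - n] = (h[d - n] + c * beta) % q
    return h[:n]

# x^2 * x^2 = x^4 in Z_17[x]/(x^3 - x - 1): x^4 = x*x^3 = x^2 + x
print(polymul_mod([0, 0, 1], [0, 0, 1], 3, 17, 1, 1))  # [0, 1, 1]
```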
There are three emphases in this paper: (i) modular arithmetic, (ii) homomorphisms, and (iii) vectorization. For modular arithmetic, we survey Montgomery, Barrett, and Plantard multiplications. For homomorphisms, we survey (a) various homomorphisms such as Cooley--Tukey FFT, Good--Thomas FFT, Bruun's FFT, Rader's FFT, Karatsuba, and Toom--Cook; (b) various algebraic techniques for adjoining nice properties to the coefficient rings, including localization, Schönhage's FFT, Nussbaumer's FFT, and coefficient ring switching; and (c) various algebraic techniques related to the polynomial moduli, including twisting, composed multiplication, evaluation at $\infty$, truncation, incomplete transformation, striding, and Toeplitz matrix-vector product. For vectorization, we survey the relations between homomorphisms and vector arithmetic.
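To illustrate the modular-arithmetic emphasis, the following is a textbook Montgomery reduction sketch in Python (generic form; the surveyed implementations specialize it to each scheme's modulus and word size):

```python
# Textbook Montgomery reduction: REDC(T) returns T * R^{-1} mod q for
# 0 <= T < q*R, trading division by q for shifts and masks by R = 2^k.
def montgomery_setup(q, k):
    R = 1 << k
    q_inv = pow(-q, -1, R)          # -q^{-1} mod R (q must be odd)
    return R, q_inv

def redc(T, q, k, R, q_inv):
    m = (T * q_inv) & (R - 1)       # m = T * (-q^{-1}) mod R
    t = (T + m * q) >> k            # T + m*q is divisible by R
    return t - q if t >= q else t

q, k = 3329, 16                     # e.g., the Kyber modulus with R = 2^16
R, q_inv = montgomery_setup(q, k)
a, b = 1234, 2345
a_mont, b_mont = a * R % q, b * R % q        # map into Montgomery form
prod_mont = redc(a_mont * b_mont, q, k, R, q_inv)   # = a*b*R mod q
assert redc(prod_mont, q, k, R, q_inv) == a * b % q  # map back out
```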
We then go through several case studies: We compare the implementations of modular multiplications used in Dilithium and Kyber, explain how the matrix-to-vector structure was exploited in Saber, and review the design choices of transformations for NTRU and NTRU Prime with vectorization. Finally, we outline several interesting implementation projects.
A Formal Treatment of End-to-End Encrypted Cloud Storage
Users increasingly store their data in the cloud, thereby benefiting from easy access, sharing, and redundancy. To additionally guarantee security of the outsourced data even against a server compromise, some service providers have started to offer end-to-end encrypted (E2EE) cloud storage. With this cryptographic protection, only legitimate owners can read or modify the data. However, recent attacks on the largest E2EE providers have highlighted the lack of solid foundations for this emerging type of service.
In this paper, we address this shortcoming by initiating the formal study of E2EE cloud storage. We give a formal syntax to capture the core functionality of a cloud storage system, capturing the real-world complexity of such a system's constituent interactive protocols. We then define game-based security notions for confidentiality and integrity of a cloud storage system against a fully malicious server. We treat both selective and fully adaptive client compromises. Our notions are informed by recent attacks on E2EE cloud storage providers. In particular we show that our syntax is rich enough to capture the core functionality of MEGA and that recent attacks on it arise as violations of our security notions. Finally, we present an E2EE cloud storage system that provides all core functionalities and that is both efficient and provably secure with respect to our selective security notions. Along the way, we discuss challenges on the path towards bringing the security of cloud storage up to par with other end-to-end primitives, such as secure messaging and TLS.
Generic Construction of Dual-Server Public Key Authenticated Encryption with Keyword Search
Chen et al. (IEEE Transactions on Cloud Computing 2022) introduced dual-server public key authenticated encryption with keyword search (DS-PAEKS) and proposed a DS-PAEKS scheme under the decisional Diffie-Hellman assumption. In this paper, we propose a generic construction of DS-PAEKS from PAEKS, public key encryption, and signatures. By providing a concrete attack, we show that the DS-PAEKS scheme of Chen et al. is vulnerable; thus, our generic construction yields the first secure DS-PAEKS schemes. Our attack, with a slight modification, also works against the Chen et al. dual-server public key encryption with keyword search (DS-PEKS) scheme (IEEE Transactions on Information Forensics and Security 2016). Moreover, we demonstrate that the Tso et al. generic construction of DS-PEKS from public key encryption (IEEE Access 2020) is also vulnerable. We also analyze other pairing-free PAEKS schemes (Du et al., Wireless Communications and Mobile Computing 2022, and Lu and Li, IEEE Transactions on Mobile Computing 2022). Though we did not find any attacks against these schemes, we show that at least their security proofs are wrong.
Helium: Scalable MPC among Lightweight Participants and under Churn
We introduce Helium, a novel framework that supports scalable secure multiparty computation (MPC) for lightweight participants and tolerates churn. Helium relies on multiparty homomorphic encryption (MHE) as its core building block. While MHE schemes have been well studied in theory, prior works fall short of addressing critical considerations paramount for adoption such as supporting resource-constrained and unstably connected participants. In this work, we systematize the requirements of MHE-based MPC protocols from a practical lens, and we propose a novel execution mechanism that addresses those considerations. We implement this execution mechanism in Helium, which makes it the first implemented framework to support MPC under network churn based solely on cryptographic assumptions. We show that a Helium network of $30$ parties connected with $100$Mbits/s links and experiencing a system-wide churn rate of $40$ failures per minute can compute the product between a fixed $512\times512$ secret matrix (e.g., a collectively-trained private model) and a fresh secret vector (e.g., a feature vector) $8.3$ times per second. This is $\sim\!\!1500$ times faster than a state-of-the-art MPC framework operating under no churn.
Quantum Lattice Enumeration in Limited Depth
In 2018, Aono et al. (ASIACRYPT 2018) proposed to use quantum backtracking algorithms (Montanaro, TOC 2018; Ambainis and Kokainis, STOC 2017) to speed up lattice point enumeration. Quantum lattice sieving algorithms had already been proposed (Laarhoven et al., PQCRYPTO 2013) and shown to provide an asymptotic speedup over classical counterparts, but also to lose competitiveness at dimensions relevant to cryptography once practical considerations on quantum computer architecture are taken into account (Albrecht et al., ASIACRYPT 2020). Aono et al. argued that quantum walk speedups can be applied to lattice enumeration, achieving at least a quadratic asymptotic speedup à la Grover search while not requiring exponential amounts of quantum-accessible classical memory, as is the case for sieving. In this work, we explore how to lower bound the cost of applying Aono et al.'s techniques to lattice enumeration with extreme cylinder pruning, assuming a limit on the maximum depth that a quantum computation can reach without decohering, with the goal of better understanding the practical applicability of quantum backtracking in lattice cryptanalysis.
DualRing-PRF: Post-Quantum (Linkable) Ring Signatures from Legendre and Power Residue PRFs
Ring signatures are one of the crucial cryptographic primitives used in the design of privacy-preserving systems. Such a signature scheme allows a signer to anonymously sign a message on behalf of a spontaneously formed group. It not only ensures the authenticity of the message but also conceals the true signer within the group. An important extension of ring signatures is linkable ring signatures, which prevent a signer from signing twice without being detected (under some constraints). Linkable ring signatures offer advantages in applications where full anonymity might jeopardise the intended purpose, such as privacy-oriented cryptocurrencies like Monero.
In this work, we introduce post-quantum ring signature (DualRing-PRF) and linkable ring signature (DualRingL-PRF) schemes whose security relies solely on symmetric-key primitives (namely, the Legendre PRF and power residue PRF). Our construction of the ring signature departs from previous approaches with similar security assumptions, offering the most competitive signature sizes for small and medium-sized rings. In particular, for a ring size of 16, DualRing-PRF has a communication overhead 1.4 times smaller than the state-of-the-art scheme proposed by Goel et al. (PETS’21). Furthermore, we demonstrate how to extend DualRing-PRF to incorporate linkability and non-slanderability. Compared to the existing one-time traceable ring signature (a variant of linkable ring signature) by Scafuro and Zhang (ESORICS’21), our construction supports many-time signing and achieves significantly smaller signature sizes when the ring size exceeds 16. This advantage becomes more pronounced as the ring size increases.
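To make the symmetric-key building block concrete: the Legendre PRF keyed by $K$ maps an input $x$ to a bit derived from the Legendre symbol of $K + x$ modulo a prime $p$. The following minimal Python sketch uses toy parameters (the prime and key are illustrative choices of ours, not the paper's instantiation):

```python
# Minimal sketch of the Legendre PRF (toy parameters, illustrative only).
p = 2**61 - 1                        # a Mersenne prime; not the paper's choice
K = 123456789                        # secret key (hypothetical)

def legendre_prf_bit(x: int) -> int:
    """Keyed Legendre PRF bit: 0 if K + x is a quadratic residue mod p."""
    a = (K + x) % p
    if a == 0:
        return 0                     # convention for the zero symbol
    return 0 if pow(a, (p - 1) // 2, p) == 1 else 1  # Euler's criterion

keystream = [legendre_prf_bit(x) for x in range(16)]
print(keystream)
```

A power residue PRF generalizes this idea from quadratic residues to $r$-th power residues, outputting a symbol in $\mathbb{Z}_r$ rather than a single bit.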
SNOW-SCA: ML-assisted Side-Channel Attack on SNOW-V
This paper presents SNOW-SCA, the first power side-channel analysis (SCA) attack on SNOW-V, a 5G mobile communication security standard candidate, running on a 32-bit ARM Cortex-M4 microcontroller. First, we perform a generic known-key correlation (KKC) analysis to identify the leakage points. Next, a correlation power analysis (CPA) attack is performed, which reduces the attack complexity to two key guesses for each key byte. The correct secret key is then uniquely identified using linear discriminant analysis (LDA). The profiled SCA attack with LDA achieves 100% accuracy after training with fewer than 200 traces, after which the attack succeeds with just a single trace. Overall, using the combined CPA and LDA attack model, the correct secret key byte is recovered with fewer than 50 traces collected using the ChipWhisperer platform. The entire 256-bit secret key of SNOW-V can be recovered incrementally using the proposed SCA attack. Finally, we suggest low-overhead countermeasures that can be used to prevent these SCA attacks.
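For readers unfamiliar with the CPA step, the following generic sketch (our illustration, not the paper's attack code; the traces and the intermediate-value function are placeholders) ranks the 256 guesses for one key byte by correlating a Hamming-weight leakage hypothesis against each sample point of the measured traces:

```python
import numpy as np

# Generic CPA sketch: Hamming-weight leakage model + Pearson correlation.
HW = np.array([bin(v).count("1") for v in range(256)])

def cpa_rank_key_byte(traces, inputs, intermediate):
    """traces: (n_traces, n_samples) array; inputs: (n_traces,) known bytes;
    intermediate(inputs, guess) -> hypothetical intermediate bytes
    (placeholder, e.g. lambda pt, g: pt ^ g for a toy XOR target)."""
    tr_c = traces - traces.mean(axis=0)
    tr_norm = np.linalg.norm(tr_c, axis=0)
    scores = np.zeros(256)
    for guess in range(256):
        hyp = HW[intermediate(inputs, guess)]
        hyp_c = hyp - hyp.mean()
        # Pearson correlation of the hypothesis with every sample point
        corr = hyp_c @ tr_c / (np.linalg.norm(hyp_c) * tr_norm + 1e-12)
        scores[guess] = np.abs(corr).max()     # best correlation over samples
    return np.argsort(scores)[::-1]            # most likely guesses first
```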
FHEDA: Efficient Circuit Synthesis with Reduced Bootstrapping for Torus FHE
Fully Homomorphic Encryption (FHE) schemes are widely used cryptographic primitives for performing arbitrary computations on encrypted data. However, FHE incorporates a computationally intensive mechanism called bootstrapping, which resets the noise in a ciphertext to a lower level, allowing the evaluation of circuits of arbitrary depth. This process can take significant time, ranging from several minutes to hours. To address this issue, we propose FHEDA, an Electronic Design Automation (EDA) framework that generates efficient Boolean representations of circuits compatible with the Torus-FHE (ASIACRYPT 2020) scheme. To the best of our knowledge, this is the first work in the EDA domain of FHE. We integrate logic synthesis and gate optimization techniques into our FHEDA framework to reduce the total number of bootstrapping operations in a Boolean circuit, which leads to a significant (up to 50%) reduction in homomorphic computation time. FHEDA is built upon the observation that in Torus-FHE, under appropriate parameter choices, two consecutive Boolean gate evaluations over fresh encryptions require only one bootstrapping instead of two. By integrating this observation with logic replacement techniques into FHEDA, we reduce the total number of bootstrapping operations along with the circuit depth, which in turn reduces the homomorphic evaluation time of Boolean circuits. To verify the efficacy of our approach, we assess the performance of the proposed EDA flow on a diverse set of representative benchmarks, including privacy-preserving machine learning and different symmetric-key block ciphers.
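The core observation can be pictured with a toy noise-level accounting (our abstraction of the stated property, not the FHEDA algorithm itself): a gate whose inputs are all fresh or freshly bootstrapped can defer its bootstrap, so two consecutive gates pay for one.

```python
# Toy accounting of the "two gates per bootstrap" observation (our
# abstraction, not FHEDA). Wires at level 0 are fresh or just bootstrapped;
# a gate over two level-0 wires defers its bootstrap (level 1), while any
# gate touching a level-1 wire bootstraps as part of its evaluation.
def count_bootstraps(gates, fresh_wires):
    level = {w: 0 for w in fresh_wires}
    bootstraps = 0
    for out, a, b in gates:               # each gate: (output, input, input)
        if level[a] == 0 and level[b] == 0:
            level[out] = 1                # defer: no bootstrap yet
        else:
            bootstraps += 1               # gate evaluation with bootstrapping
            level[out] = 0
    return bootstraps

circuit = [("u", "x1", "x2"), ("v", "x3", "x4"), ("w", "u", "v")]
print(count_bootstraps(circuit, ["x1", "x2", "x3", "x4"]))  # 1 here, vs. 3
# with the naive rule of one bootstrap per gate
```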
SoK: Programmable Privacy in Distributed Systems
This Systematization of Knowledge surveys contemporary distributed blockchain protocols with the aim of identifying cryptographic and design techniques that practically enable both expressive programmability and user data confidentiality. To support the comparison of concretely very different protocols, we define an epoch-based computational model in the form of a flexible UC-style ideal functionality that divides the operation of privacy-preserving networks into three phases: Independent, Mediated, and Global computation. Our analysis focuses in particular on the Mediated computation phase, which provides the facility to execute non-trivial program logic on private inputs from multiple users. Specifically, we compare implementations of private limit order auctions across protocols; we find this to be a representative application that is common and relatively simple, yet exhibits adversarial dynamics demonstrating the capabilities of a non-trivial Mediated computation mechanism. In our analysis, we identify four protocols representative of the different high-level approaches used to implement Mediated computations. We compare protocols according to the degree and flexibility of programmability, the privacy properties achieved, and the security assumptions required for correct operation. We conclude by offering recommendations and best practices for future programmable privacy designs.
Malicious Security for PIR (almost) for Free
Private Information Retrieval (PIR) enables a client to retrieve a database element from a semi-honest server while hiding the element being queried from the server. Maliciously-secure PIR (mPIR) [Colombo et al., USENIX Security '23] strengthens the guarantees of plain (i.e., semi-honest) PIR by ensuring that even a misbehaving server (a) cannot compromise client privacy via selective-failure attacks, and (b) must answer every query *consistently* (i.e., with respect to the same database). These additional security properties are crucial for many real-world applications.
In this work we present a generic compiler that transforms any PIR scheme into an mPIR scheme in a black-box manner, with minimal overhead and without requiring additional cryptographic assumptions. Since mPIR trivially implies PIR, our compiler establishes the equivalence of mPIR and PIR. By instantiating our compiler with existing PIR schemes, we immediately obtain mPIR schemes with $O(N^\epsilon)$ communication cost. In fact, by applying our compiler to a recent doubly-efficient PIR [Lin et al., STOC '23], we are able to construct a *doubly-efficient* mPIR scheme that requires only $\text{polylog}(N)$ communication and server and client computation. In comparison, all prior works incur an $\Omega(\sqrt{N})$ cost in these metrics.
Our compiler makes use of smooth locally-decodable codes (LDCs) that have a robust decoding procedure. We term these codes "subcode"-LDCs because they are LDCs in which the query responses are themselves codewords of an error-correcting code. This property is shared by Reed--Muller codes (whose query responses are Reed--Solomon codewords) and, more generally, lifted codes.
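The subcode property can be seen concretely in a toy example (parameters ours): restricting a total-degree-$d$ polynomial to a line yields a univariate polynomial of degree at most $d$, so honest responses along a line must lie on a Reed--Solomon codeword, which a decoder can check for consistency.

```python
# Toy check of the subcode property (parameters ours): a total-degree-2
# polynomial restricted to a line becomes a univariate polynomial of degree
# at most 2, i.e., honest responses along the line form a Reed--Solomon
# codeword that three points fully determine.
p = 101                                        # toy field size (assumption)

def f(x, y):                                   # a fixed degree-2 polynomial
    return (3 * x * x + 5 * x * y + 7 * y + 2) % p

a, b = (4, 9), (11, 6)                         # the line L(t) = a + t * b
samples = [f((a[0] + t * b[0]) % p, (a[1] + t * b[1]) % p) for t in range(p)]

def lagrange_eval(points, t):
    """Evaluate the polynomial interpolating `points` at t, modulo p."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (t - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

first_three = [(t, samples[t]) for t in range(3)]
assert all(lagrange_eval(first_three, t) == samples[t] for t in range(p))
```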
Applying our compiler requires us to consider decoding in the face of *non-signaling adversaries*, for reasons analogous to the need for non-signaling PCPs in the succinct-argument literature. We show how to construct such decoders for Reed--Muller codes, and more generally for smooth locally-decodable codes that have a robust decoding procedure.
Pseudorandom Error-Correcting Codes
We construct pseudorandom error-correcting codes (or simply pseudorandom codes), which are error-correcting codes with the property that any polynomial number of codewords are pseudorandom to any computationally-bounded adversary. Efficient decoding of corrupted codewords is possible with the help of a decoding key.
We build pseudorandom codes that are robust to substitution and deletion errors, where pseudorandomness rests on standard cryptographic assumptions. Specifically, pseudorandomness is based on either $2^{O(\sqrt{n})}$-hardness of LPN, or polynomial hardness of LPN and the planted XOR problem at low density.
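To convey the flavor of the primitive (and only the flavor: the actual construction rests on LPN, not on a PRF mask), here is a toy secret-key code whose codewords look random without the key and tolerate a constant rate of substitutions via repetition; names and parameters are ours:

```python
import hashlib, os

# Toy pseudorandom code (our illustration, far from the paper's LPN-based
# construction): a codeword is a fresh nonce plus a repetition-encoded
# message masked by a PRF stream. Without the key, codewords look random;
# with it, decoding is a majority vote, surviving some substitutions in the
# payload (this toy ignores errors hitting the nonce).
REP = 15                                     # repetition factor

def prf_stream(key, nonce, nbits):
    out, ctr = [], 0
    while len(out) < nbits:
        block = hashlib.sha256(key + nonce + ctr.to_bytes(4, "big")).digest()
        out += [(byte >> i) & 1 for byte in block for i in range(8)]
        ctr += 1
    return out[:nbits]

def encode(key, msg_bits):
    nonce = os.urandom(16)
    payload = [b for b in msg_bits for _ in range(REP)]
    mask = prf_stream(key, nonce, len(payload))
    return nonce, [p ^ m for p, m in zip(payload, mask)]

def decode(key, nonce, payload):
    mask = prf_stream(key, nonce, len(payload))
    bits = [p ^ m for p, m in zip(payload, mask)]
    return [int(sum(bits[i*REP:(i+1)*REP]) > REP // 2)
            for i in range(len(bits) // REP)]

key = os.urandom(16)
nonce, cw = encode(key, [1, 0, 1, 1])
cw[0] ^= 1; cw[20] ^= 1                      # a few substitutions
assert decode(key, nonce, cw) == [1, 0, 1, 1]
```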
As our primary application of pseudorandom codes, we present an undetectable watermarking scheme for outputs of language models that is robust to cropping and a constant rate of random substitutions and deletions. The watermark is undetectable in the sense that any number of samples of watermarked text are computationally indistinguishable from text output by the original model. This is the first undetectable watermarking scheme that can tolerate a constant rate of errors.
Our second application is to steganography, where a secret message is hidden in innocent-looking content. We present a constant-rate stateless steganography scheme with robustness to a constant rate of substitutions. Ours is the first stateless steganography scheme with provable steganographic security and any robustness to errors.
Improved Boomerang Attacks on 6-Round AES
The boomerang attack is a cryptanalytic technique which allows combining two short high-probability differentials into a distinguisher for a large number of rounds. Since its introduction by Wagner in 1999, it has been applied to many ciphers. One of the best-studied targets is a 6-round variant of AES, on which the boomerang attack is outperformed only by the dedicated Square attack. Recently, two new variants of the boomerang attack were presented: retracing boomerang (Eurocrypt'20) and truncated boomerang (Eurocrypt'23). These variants seem incompatible: the former achieves lower memory complexity by throwing away most of the data in order to force dependencies, while the latter achieves lower time complexity by using large structures, which inevitably leads to a large memory complexity.
In this paper we show that elements of the two techniques can be combined to get `the best of both worlds' – the practical memory complexity of the retracing attack and the lower time complexity of the truncated attack. We obtain an attack with data complexity of $2^{57}$ (compared to $2^{59}$ and $2^{55}$ for the truncated and retracing boomerang, respectively), memory complexity of $2^{33}$ (compared to $2^{59}$ and $2^{31}$), and time complexity of $2^{61}$ (compared to $2^{61}$ and $2^{80}$). This is the second-best attack on 6-round AES, after the Square attack.
Throwing Boomerangs into Feistel Structures: Application to CLEFIA, WARP, LBlock, LBlock-s and TWINE
Automatic tools to search for boomerang distinguishers have seen significant advances over the past few years. However, most previous work has focused on ciphers based on a Substitution Permutation Network (SPN), while analyzing the Feistel structure is of great significance. Boukerrou et al. recently provided a theoretical framework to formulate the boomerang switch over multiple Feistel rounds, but they did not provide an automatic tool to find distinguishers. In this paper, by enhancing the recently proposed method by Hadipour et al., we provide an automatic tool to search for boomerang distinguishers and apply it to block ciphers following the Generalized Feistel Structure (GFS). Applying our tool to a wide range of GFS ciphers, we show that it significantly improves the best previous results on boomerang analysis. In particular, we improve the best previous boomerang distinguishers for 20 and 21 rounds of WARP by a factor of $2^{38.28}$ and $2^{36.56}$, respectively. Thanks to the effectiveness of our method, we can extend the boomerang distinguishers of WARP by two rounds and distinguish 23 rounds of this cipher from a random permutation. Applying our method to the internationally-standardized cipher CLEFIA, we achieve a 9-round boomerang distinguisher which improves the best previous boomerang distinguisher by one round. Based on this distinguisher, we build a key-recovery attack on 11 rounds of CLEFIA, which improves the best previous sandwich attack on this cipher by one round. We also apply our method to LBlock, LBlock-s, and TWINE and improve the best previous boomerang distinguisher of these ciphers.
ZLR: a fast online authenticated encryption scheme achieving full security
Online authenticated encryption is of practical relevance in lightweight environments due to its low latency and constant memory usage. In this paper, we propose a new tweakable block cipher-based online authenticated encryption scheme, dubbed ZLR, and its domain separation variant, dubbed DS-ZLR. ZLR and DS-ZLR follow the Encrypt-Mix-Encrypt paradigm. However, in contrast to existing schemes using the same paradigm, such as ELmE and CoLM, ZLR and DS-ZLR enjoy $n$-bit security by using larger internal states together with an efficient ZHash-like hashing algorithm. In this way, $2n$-bit blocks are processed with only a single primitive call for hashing and two primitive calls for encryption and decryption, when the schemes are based on an $n$-bit tweakable block cipher using $n$-bit (resp. $2n$-bit) tweaks for ZLR (resp. DS-ZLR). Furthermore, they support pipelined computation as well as online nonce-misuse resistance. To the best of our knowledge, ZLR and DS-ZLR are the first pipelineable tweakable block cipher-based online authenticated encryption schemes of rate 2/3 that provide $n$-bit security with online nonce-misuse resistance.
Phoenix: Hash-and-Sign with Aborts from Lattice Gadgets
Preimage sampling is a fundamental tool in lattice-based cryptography, and its performance directly impacts that of the cryptographic mechanisms relying on it. In 2012, Micciancio and Peikert proposed a new way of generating trapdoors (and an associated preimage sampling procedure) with very interesting features. Unfortunately, in some applications such as digital signatures, the resulting performance may not be as competitive as that of other approaches like Fiat-Shamir with Aborts. In an effort to improve preimage sampling for Micciancio-Peikert (MP) trapdoors, Lyubashevsky and Wichs (LW) introduced a new sampler that leverages rejection sampling but suffers from strong parameter requirements that hamper performance. As a consequence, it seemed restricted to theoretical applications and has not, to our knowledge, been considered for real-world applications.
Our first contribution is to revisit the LW sampler with an improved analysis that yields much more compact parameters. This leads to preimage-size gains of about 60% over the original LW sampler, and of up to 25% compared to the original MP sampling technique. It thus sheds new light on the LW sampler, opening promising perspectives for the efficiency of advanced lattice-based constructions relying on such mechanisms. To obtain further improvements, we show that it combines perfectly with the approximate trapdoors approach of Chen, Genise and Mukherjee, while achieving a smaller preimage error.
Building upon those results, we introduce a hash-and-sign signature scheme called Phoenix. The scheme is based on the M-LWE and M-SIS assumptions and features attractive public key and signature sizes which are even smaller than those of the most recent gadget-based construction Eagle of Yu, Jia and Wang (Crypto’23). Moreover, Phoenix is designed to be implementation-friendly, avoiding in particular complex Gaussian samplers that are often hard to protect.
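The rejection-sampling idea at the heart of the LW sampler can be sketched in one dimension (illustrative only: the real sampler works over lattices with gadget trapdoors and carefully derived parameters). The goal is to output a shifted sample whose distribution is independent of the secret-dependent shift:

```python
import math, random

# Toy 1-D rejection sampling à la Lyubashevsky: output z = y + v distributed
# like a centered Gaussian, hiding the secret shift v.
SIGMA = 20.0

def rho(x, center=0.0):
    return math.exp(-((x - center) ** 2) / (2 * SIGMA * SIGMA))

def sample_centered(center=0.0):
    # naive sampler for a discrete Gaussian on a bounded support
    lo, hi = int(center - 10 * SIGMA), int(center + 10 * SIGMA)
    while True:
        x = random.randint(lo, hi)
        if random.random() < rho(x, center):
            return x

def rejection_sample(v, M=3.0):
    while True:
        y = sample_centered()          # mask sampled around 0
        z = y + v                      # distributed around v, so it leaks v...
        # ...unless we accept with probability rho(z) / (M * rho(z - v)),
        # which reshapes the output to the centered distribution; this is
        # valid when rho(z) <= M * rho(z - v) on (almost) all the support,
        # i.e., for |v| small relative to SIGMA.
        if random.random() < min(1.0, rho(z) / (M * rho(z, center=v))):
            return z
```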
Towards Optimal Parallel Broadcast under a Dishonest Majority
The parallel broadcast (PBC) problem generalises the classic Byzantine broadcast problem to the setting where all $n$ nodes broadcast a message and deliver $O(n)$ messages. PBC arises naturally in many settings including multi-party computation. Recently, Tsimos, Loss, and Papamanthou (CRYPTO 2022) showed PBC protocols with improved communication, against an adaptive adversary who can corrupt all but a constant fraction $\epsilon$ of nodes (i.e., $f < (1 - \epsilon)n$). However, their study is limited to single-bit messages, and their protocols have large polynomial overhead in the security parameter $\kappa$: their TrustedPBC protocol achieves $\tilde{O}(n^2 \kappa^4)$ communication and $O(\kappa\log n)$ rounds. Since these factors of $\kappa$ are in practice often close (or at least polynomially related) to $n$, they add a significant overhead. In this work, we propose three parallel broadcast protocols for $L$-bit messages, for any size $L$, that significantly improve the communication efficiency of the state-of-the-art.
We first propose a new extension protocol that uses a $\kappa$-bit PBC as a black box and achieves i) communication complexity of $O(L n^2 + \mathcal{P}(\kappa))$, where $\mathcal{P}(\kappa)$ is the communication complexity of the $\kappa$-bit PBC, and ii) the same round complexity as the $\kappa$-bit PBC. By comparison, the state-of-the-art extension protocol for regular broadcast (Nayak et al., DISC 2020) incurs $O(n)$ additional rounds of communication. Next, we propose a protocol that is secure against a static adversary, for $\kappa$-bit messages, with $O(n^2 \kappa^{1+K} + n\kappa^3 + \kappa^4)$ communication and $O(\kappa)$ round complexity, where $K$ is an arbitrarily small constant with $0<K<1$. Finally, we propose an adaptively-secure protocol for $\kappa$-bit messages with $\tilde{O}(n^2\kappa^2 + n\kappa^3)$ communication overhead and $O(\kappa \log{n})$ round complexity, obtained by modifying and improving the next-best protocol, TrustedPBC, in several key ways. Notably, our latter two protocols are $\tilde{O}(\kappa^{2 - K})$ and $O(\kappa^2)$ times more communication-efficient, respectively, than the state-of-the-art protocols while achieving the same round complexity.
ICICLE v2: Polynomial API for Coding ZK Provers to Run on Specialized Hardware
Polynomials play a central role in cryptography. In the context of Zero Knowledge Proofs (ZKPs), protocols can be expressed entirely in terms of polynomials, making polynomials a powerful abstraction tool, as most ZK research papers demonstrate. Our first contribution is a high-level framework that enables practitioners to implement ZKPs in this more natural way, based solely on polynomial primitives.
ZK provers are considered computationally intensive algorithms with a high degree of parallelization. These algorithms benefit significantly from hardware acceleration, and deployed ZK systems typically include specialized hardware to optimize the performance of the prover code. Our second contribution is leveraging our polynomial API to abstract away low-level hardware primitives and automate their memory management. This device-agnostic design allows ZK engineers to prototype and build solutions while taking advantage of the performance gains offered by specialized hardware, such as GPUs and FPGAs, without needing to understand the hardware implementation details.
Finally, our polynomial API is integrated into version 2 of the ICICLE library and is running in production. This paper also serves as comprehensive documentation for the ICICLE v2 polynomial API.
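The paper itself documents the real API; purely as a flavor of what a device-agnostic polynomial abstraction looks like, consider the following hypothetical sketch (class and method names are ours, not ICICLE's):

```python
# Hypothetical sketch of a device-agnostic polynomial API (illustrative names
# only; consult the ICICLE v2 documentation for the actual interface).
class Polynomial:
    def __init__(self, coeffs, backend="cpu"):
        self.coeffs = list(coeffs)   # a real backend would keep these on-device
        self.backend = backend

    def __add__(self, other):
        n = max(len(self.coeffs), len(other.coeffs))
        a = self.coeffs + [0] * (n - len(self.coeffs))
        b = other.coeffs + [0] * (n - len(other.coeffs))
        return Polynomial([x + y for x, y in zip(a, b)], self.backend)

    def __mul__(self, other):
        # schoolbook here; a GPU/FPGA backend would dispatch to an NTT kernel
        out = [0] * (len(self.coeffs) + len(other.coeffs) - 1)
        for i, x in enumerate(self.coeffs):
            for j, y in enumerate(other.coeffs):
                out[i + j] += x * y
        return Polynomial(out, self.backend)

    def evaluate(self, x):
        acc = 0
        for c in reversed(self.coeffs):   # Horner's rule
            acc = acc * x + c
        return acc

# the prover writes math, not memory management:
t = Polynomial([1, 2]) * Polynomial([3, 0, 1]) + Polynomial([5])
print(t.coeffs, t.evaluate(2))            # [8, 6, 1, 2] 40
```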
Efficient Secure Communication Over Dynamic Incomplete Networks With Minimal Connectivity
We study the problem of implementing unconditionally secure reliable and private communication (and hence secure computation) in dynamic incomplete networks.
Our model assumes that the network is always $k$-connected, for some $k$, but the concrete connection graph is adversarially chosen in each round of interaction.
We show that, with $n$ players and $t$ malicious corruptions, perfectly secure communication is possible if and only if $k > 2t$. This disproves a conjecture from earlier work that $k > 3t$ is necessary. Our new protocols are also much more efficient than previous work; in particular, we improve the round and communication complexity by an exponential factor (in $n$) in both the semi-honest and the malicious corruption setting, leading to protocols with polynomial complexity.
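The intuition for why $k > 2t$ suffices for the reliability half of the problem is elementary (our illustration, not the paper's protocol, which must also achieve privacy, e.g., by secret sharing across paths): send the message over $k$ vertex-disjoint paths; at most $t$ are corrupted, so the majority value is correct.

```python
from collections import Counter
import random

# Toy illustration of reliable transmission over k vertex-disjoint paths
# with at most t corrupted paths: the majority value is correct iff k > 2t.
def transmit(msg, k, t):
    corrupted = set(random.sample(range(k), t))   # adversary picks t paths
    received = ["garbage" if path in corrupted else msg for path in range(k)]
    value, count = Counter(received).most_common(1)[0]
    return value if count > k // 2 else None

assert transmit("hello", k=5, t=2) == "hello"     # k > 2t: majority survives
```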
Unclonable Secret Sharing
Unclonable cryptography utilizes the principles of quantum mechanics to address cryptographic tasks that are impossible classically. We introduce a novel unclonable primitive in the context of secret sharing, called unclonable secret sharing (USS). In a USS scheme, there are $n$ shareholders, each holding a share of a classical secret represented as a quantum state. They can recover the secret once all parties (or at least $t$ parties) come together with their shares. Importantly, it should be infeasible for shareholders to copy their own shares and send the copies to two non-communicating parties, enabling both of them to recover the secret.
Our work initiates a formal investigation into the realm of unclonable secret sharing, shedding light on its implications, constructions, and inherent limitations.
- Connections: We explore the connections between USS and other quantum cryptographic primitives, such as unclonable encryption and position verification, showing the difficulty of achieving USS in different scenarios.
- Limited entanglement: In the case where the adversarial shareholders share no or only limited entanglement, we demonstrate information-theoretic constructions for USS.
- Large entanglement: If we allow the adversarial shareholders unbounded entanglement resources (and unbounded computation), we prove that unclonable secret sharing is impossible. On the other hand, in the quantum random oracle model, where the adversary can only make a bounded polynomial number of queries, we show a construction that is secure even with unbounded entanglement.
Furthermore, even when these adversaries possess only a polynomial amount of entanglement resources, we establish that any unclonable secret sharing scheme whose reconstruction function can be implemented using Cliffords and logarithmically many T-gates is also unattainable.
A Note on (2, 2)-isogenies via Theta Coordinates
In this paper, we revisit the algorithm for computing chains of $(2, 2)$-isogenies between products of elliptic curves via theta coordinates proposed by Dartois et al. For each fundamental block of this algorithm, we provide an explicit inversion-free version. In addition, we exploit a novel $x$-only ladder technique to speed up the computation of the gluing isogeny. Finally, we present a mixed optimal strategy, which combines the inversion-elimination tool with the original methods to execute a chain of $(2, 2)$-isogenies.
We give a cost analysis and a concrete comparison between our methods and the previously known methods for inversion elimination. Furthermore, we implement the mixed optimal strategy for benchmarking. The results show that when computing $(2, 2)$-isogeny chains of lengths 126, 208 and 632, our techniques reduce the number of multiplications over the base field $\mathbb{F}_p$ by $30.8\%$, $20.3\%$ and $9.9\%$, respectively, compared to the original implementation of Dartois, Maino, Pope and Robert. Even against their updated version, which employs their inversion-free methods, our techniques retain a slight advantage.
Cryptanalysis of QARMAv2
QARMAv2 is a general-purpose and hardware-oriented family of lightweight tweakable block ciphers (TBCs) introduced in ToSC 2023. QARMAv2, a redesign of QARMAv1 with a longer tweak and tighter security margins, is also designed to be suitable for cryptographic memory protection and control flow integrity. The designers of QARMAv2 provided a relatively comprehensive security analysis in the design specification, e.g., some bounds on the number of attacked rounds in differential and boomerang analysis, together with some concrete impossible-differential, zero-correlation, and integral distinguishers. In one of the first third-party cryptanalyses of QARMAv2, Hadipour et al. significantly improved the integral distinguishers of QARMAv2 and provided the longest concrete distinguishers for this cipher to date. However, they provided no key recovery attack based on their distinguishers.
This paper delves into the cryptanalysis of QARMAv2 to enhance our understanding of its security. Given that the integral distinguishers of QARMAv2 are the longest concrete distinguishers for this cipher so far, we focus on integral attacks. To this end, we first further improve the automatic tool introduced by Hadipour et al. for finding integral distinguishers of TBCs following the TWEAKEY framework. This new tool exploits the MixColumns property of QARMAv2 to find integral distinguishers better suited to key recovery attacks. We then combine several techniques for integral key recovery, e.g., the meet-in-the-middle and partial-sum techniques, to build a fine-grained integral key recovery attack on QARMAv2. Notably, we demonstrate how to leverage the low data complexity of the integral distinguishers of QARMAv2 to reduce the memory complexity of the meet-in-the-middle technique. As a result, we present the first concrete key recovery attacks on reduced-round versions of QARMAv2, attacking 13 rounds of QARMAv2-64-128 with a single tweak block ($\mathscr{T} = 1$), 14 rounds of QARMAv2-64-128 with two independent tweak blocks ($\mathscr{T} = 2$), and 16 rounds of QARMAv2-128-256 with two independent tweak blocks ($\mathscr{T} = 2$), all in an unbalanced setting. Our attacks do not compromise the claimed security of QARMAv2, but they shed more light on the cryptanalysis of this cipher.
Finding the Impossible: Automated Search for Full Impossible-Differential, Zero-Correlation, and Integral Attacks
Impossible differential (ID), zero-correlation (ZC), and integral attacks are a family of important attacks on block ciphers. For example, the impossible differential attack was the first cryptanalytic attack on 7 rounds of AES. Evaluating the security of block ciphers against these attacks is very important but also challenging: Finding these attacks usually implies a combinatorial optimization problem involving many parameters and constraints that is very hard to solve using manual approaches. Automated solvers, such as Constraint Programming (CP) solvers, can help the cryptanalyst to find suitable attacks. However, previous CP-based methods focus on finding only the ID, ZC, and integral distinguishers, often only in a limited search space. Notably, none can be extended to a unified optimization problem for finding full attacks, including efficient key-recovery steps.
In this paper, we present a new CP-based method to search for ID, ZC, and integral distinguishers and extend it to a unified constraint optimization problem for finding full ID, ZC, and integral attacks. To show the effectiveness and usefulness of our method, we applied it to several block ciphers, including SKINNY, CRAFT, SKINNYe-v2, and SKINNYee. For the ISO standard block cipher SKINNY, we significantly improve all existing ID, ZC, and integral attacks. In particular, we improve the integral attacks on SKINNY-$n$-$3n$ and SKINNY-$n$-$2n$ by 3 and 2 rounds, respectively, obtaining the best cryptanalytic results on these variants in the single-key setting. We improve the ZC attack on SKINNY-$n$-$n$ (SKINNY-$n$-$2n$) by 2 (resp. 1) rounds. We also improve the ID attacks on all variants of SKINNY. Particularly, we improve the time complexity of the best previous single-tweakey (related-tweakey) ID attack on SKINNY-$128$-$256$ (resp. SKINNY-$128$-$384$) by a factor of $2^{22.57}$ (resp. $2^{15.39}$). On CRAFT, we propose a 21-round (20-round) ID (resp. ZC) attack, which improves the best previous single-tweakey attack by 2 (resp. 1) rounds. Using our new model, we also provide several practical integral distinguishers for reduced-round SKINNY, CRAFT, and Deoxys-BC. Our method is generic and applicable to other strongly aligned block ciphers.
Autoguess: A Tool for Finding Guess-and-Determine Attacks and Key Bridges
The guess-and-determine technique is one of the most widely used techniques in cryptanalysis to recover unknown variables in a given system of relations. In such attacks, a subset of the unknown variables is guessed such that the remaining unknowns can be deduced using the information from the guessed variables and the given relations. This idea can be applied in various areas of cryptanalysis such as finding the internal state of stream ciphers when a sufficient amount of output data is available, or recovering the internal state and the secret key of a block cipher from very few known plaintexts. Another important application is the key-bridging technique in key-recovery attacks on block ciphers, where the attacker aims to find the minimum number of required sub-key guesses to deduce all involved sub-keys via the key schedule. Since the complexity of the guess-and-determine technique directly depends on the number of guessed variables, it is essential to find the smallest possible guess basis, i.e., the subset of guessed variables from which the remaining variables can be deduced.
In this paper, we present Autoguess, an easy-to-use general tool to search for a minimal guess basis. We propose several new modeling techniques to harness SAT/SMT, MILP, and Gröbner basis solvers. We demonstrate their usefulness in guess-and-determine attacks on stream ciphers and block ciphers, as well as in finding key bridges in key recovery attacks on block ciphers. Moreover, by integrating our CP models for the key-bridging technique into previous CP-based frameworks for finding distinguishers, we propose a unified and general CP model to search for key-recovery-friendly distinguishers, which supports both linear and nonlinear key schedules.
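The underlying problem is easy to state in code. The following toy brute force (our illustration; Autoguess delegates this search to SAT/SMT, MILP, and Gröbner basis solvers on far larger systems) finds a smallest set of guessed variables whose deductive closure under the given relations covers everything:

```python
from itertools import combinations

# Toy minimal-guess-basis search. A deduction rule (srcs, dst) means:
# knowing all variables in srcs determines dst.
rules = [({"a", "b"}, "c"), ({"c", "d"}, "e"), ({"e"}, "a"), ({"a", "e"}, "d")]
variables = {"a", "b", "c", "d", "e"}

def closure(known):
    known = set(known)
    changed = True
    while changed:
        changed = False
        for srcs, dst in rules:
            if dst not in known and srcs <= known:
                known.add(dst)
                changed = True
    return known

for size in range(len(variables) + 1):
    basis = next((set(c) for c in combinations(sorted(variables), size)
                  if closure(c) == variables), None)
    if basis is not None:
        print("minimal guess basis:", basis)   # {'b', 'e'} for these toy rules
        break
```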
Cryptography at the Crossroads: Ethical Responsibility, the Cypherpunk Movement and Institutions
This paper explores the intersection of cryptographic work with ethical responsibility and political activism, inspired by the Cypherpunk Manifesto and Phillip Rogaway's analysis of the moral character of cryptography. The discussion encompasses the historical context of cryptographic development, the philosophical underpinnings of the cypherpunk ideology, and contemporary challenges posed by mass surveillance and privacy concerns. By examining these facets, the paper calls for a renewed commitment to developing cryptographic solutions that prioritize human rights and societal good.
Dumbo-MVBA: Optimal Multi-valued Validated Asynchronous Byzantine Agreement, Revisited
Multi-valued validated asynchronous Byzantine agreement (MVBA), proposed in the elegant work of Cachin et al. (CRYPTO '01), is fundamental for critical fault-tolerant services such as atomic broadcast in asynchronous networks. It was left as an open problem to asymptotically reduce the $O(\ell n^2 + \lambda n^2 + n^3)$ communication (where $n$ is the number of parties, $\ell$ is the input length, and $\lambda$ is the security parameter). Recently, Abraham et al. (PODC '19) removed the $n^3$ term to partially answer the question when the input is small. However, in other typical cases, e.g., building atomic broadcast through MVBA, the input length is $\ell \geq n\lambda$, so the communication is dominated by the $\ell n^2$ term and the problem raised by Cachin et al. remains open.
We fill this gap and answer the remaining part of the open problem. In particular, we present two MVBA protocols with $O(\ell n + \lambda n^2)$ communicated bits, which is optimal when $\ell \geq n\lambda$. We also maintain other benefits, including optimal resilience to tolerate up to $n/3$ adaptive Byzantine corruptions, optimal expected constant running time, and optimal $O(n^2)$ messages.
At the core of our design, we propose asynchronous provable dispersal broadcast (APDB) in which each input can be split and dispersed to every party and later recovered in an efficient way. Leveraging APDB and asynchronous binary agreement, we design an optimal MVBA protocol, Dumbo-MVBA; we also present a general self-bootstrap framework Dumbo-MVBA* to reduce the communication of any existing MVBA protocols.
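The coding idea behind such dispersal can be sketched with plain polynomial interpolation (our illustration; APDB additionally makes the dispersal *provable* so parties can verify fragments, and its thresholds are more involved than this toy's):

```python
# Toy dispersal/recovery via Reed-Solomon-style interpolation: split the
# input into t+1 field elements, interpolate the polynomial through them,
# and hand each of the n parties one extra evaluation; any t+1 correct
# fragments recover the input.
P = 2**31 - 1                                   # toy prime field (assumption)

def lagrange_at(points, x):
    """Evaluate the unique polynomial through `points` at x, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def disperse(chunks, n):
    data_points = list(enumerate(chunks))       # f(0), ..., f(t) = the data
    return [(x, lagrange_at(data_points, x))    # party i holds f(t + 1 + i)
            for x in range(len(chunks), len(chunks) + n)]

def recover(fragments, num_chunks):
    return [lagrange_at(fragments, j) for j in range(num_chunks)]

fragments = disperse([11, 22, 33], n=7)         # t = 2: any 3 fragments suffice
assert recover(fragments[2:5], 3) == [11, 22, 33]
```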
Finally, we demonstrate an enticing application of our $\mathsf{MVBA}$ protocols to construct an expected constant-round asynchronous common subset ($\mathsf{ACS}$) protocol with only $\mathcal{O}(\ell n^2 +\lambda n^2)$ expected communication complexity, where $\ell$ is the input length of $\mathsf{ACS}$ in bits. The resulting $\mathsf{ACS}$ has asymptotically optimal communication cost when $\ell\geq\lambda$ and may have a broad array of applications, such as asynchronous atomic broadcast or asynchronous multi-party computation.