All papers in 2025 (78 results)
Triple Ratchet: A Bandwidth Efficient Hybrid-Secure Signal Protocol
Secure Messaging apps have seen growing adoption and are used by billions of people daily. However, due to the imminent threat of a "Harvest Now, Decrypt Later" attack, secure messaging providers must react now in order to make their protocols $\textit{hybrid-secure}$: at least as secure as before, but now also post-quantum (PQ) secure. Since many of these apps are internally based on Signal's famous Double-Ratchet (DR) protocol, making Signal hybrid-secure is of great importance.
In fact, Signal and Apple have already put into production Signal-based variants with certain levels of hybrid security: PQXDH (only on the initial handshake), and PQ3 (on the entire protocol), which adds a $\textit{PQ-ratchet}$ to the DR protocol. Unfortunately, due to the large communication overhead of the $\mathsf{Kyber}$ scheme used by PQ3, real-world PQ3 performs this PQ-ratchet only approximately once every 50 messages. As we observe, the effectiveness of this amortization, while reasonable in the best-case communication scenario, quickly deteriorates in other still-realistic scenarios, causing $\textit{many consecutive}$ (rather than $1$ in $50$) re-transmissions of the same $\mathsf{Kyber}$ public keys and ciphertexts (of combined size 2272 bytes!).
In this work we design a new Signal-based, hybrid-secure secure-messaging protocol which significantly reduces the communication complexity of PQ3. We call our protocol the "$\textit{Triple Ratchet}$" (TR) protocol. First, TR uses $\textit{erasure codes}$ to make the communication inside the PQ-ratchet provably balanced. This results in much better $\textit{worst-case}$ communication guarantees for TR, as compared to PQ3. Second, we design a novel "variant" of $\mathsf{Kyber}$, called $\mathsf{Katana}$, with a significantly smaller combined length of ciphertext and public key (which is the relevant efficiency measure for "PQ-secure ratchets"). For 192 bits of security, $\mathsf{Katana}$ improves this key efficiency measure by over 37%: from 2272 to 1416 bytes. In doing so, we identify a critical security flaw in prior suggestions for optimizing the communication complexity of lattice-based PQ-ratchets, and fix this flaw with a novel proof relying on the recently introduced hint-MLWE assumption.
During the development of this work we have been in discussion with the Signal team, and they are actively evaluating bringing a variant of it into production in a future iteration of the Signal protocol.
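To make the balancing idea concrete, here is a minimal Python sketch (not the TR protocol itself) of how a large PQ-ratchet value can be spread across per-message chunks, with a single XOR parity chunk acting as a toy erasure code; the chunk count, the 1416-byte stand-in payload, and the "one lost piece" scenario are all illustrative assumptions, and a real deployment would use a proper code such as Reed-Solomon.

```python
from functools import reduce

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encode_chunks(blob, k):
    """Split blob into k data chunks plus one XOR-parity chunk."""
    size = -(-len(blob) // k)                      # ceil division
    padded = blob.ljust(k * size, b"\x00")
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    return chunks + [reduce(xor_bytes, chunks)]    # index k holds the parity

def recover(received, k):
    """received: dict index -> chunk, holding any k of the k+1 pieces."""
    missing = [i for i in range(k) if i not in received]
    if missing:                                    # rebuild the one lost chunk
        others = [received[i] for i in range(k + 1) if i != missing[0]]
        received[missing[0]] = reduce(xor_bytes, others)
    return b"".join(received[i] for i in range(k))

# amortize a hypothetical 1416-byte Katana pk+ct over 8 message slots
blob = (bytes(range(256)) * 6)[:1416]              # stand-in payload
pieces = encode_chunks(blob, 8)
arrived = {i: c for i, c in enumerate(pieces) if i != 3}   # piece 3 lost
assert recover(arrived, 8) == blob
```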
On Multi-Key FuncCPA Secure Encryption Schemes
The notion of funcCPA security for homomorphic encryption schemes was introduced by Akavia \textit{et~al.}\ (TCC 2022). While the notion aims to capture the bootstrapping technique in homomorphic encryption schemes, Dodis \textit{et~al.}\ (TCC 2023) pointed out that funcCPA security can also be applied to non-homomorphic public-key encryption schemes (PKE). As an example, they presented a use case for privacy-preserving outsourced computation without homomorphic computation. Notably, prior work on funcCPA security, including the use case presented by Dodis \textit{et~al.}, considered only the single-key setting. In recent years, however, multi-party collaboration in outsourced computation has garnered significant attention, making it desirable for funcCPA security to support the multi-key setting. In this work, we therefore introduce a new security notion called Multi-Key funcCPA (MKfuncCPA) to address this need, and show that if a PKE scheme is KDM-secure, then it is also MKfuncCPA secure. Furthermore, we show that a similar argument applies to symmetric-key encryption.
Decompose and conquer: ZVP attacks on GLV curves
While many side-channel attacks on elliptic curve cryptography can be avoided by coordinate randomization, this is not the case for the zero-value point (ZVP) attack. This attack can recover a prefix of a static ECDH key, but requires solving an instance of the dependent coordinates problem (DCP), which is open in general. We design a new method for solving the DCP on GLV curves, including the Bitcoin secp256k1 curve, outperforming previous approaches. This leads to a new type of ZVP attack on multiscalar multiplication, recovering twice as many bits compared to the classical ZVP attack. We demonstrate a $63\%$ recovery of the private key for the interleaving algorithm for multiscalar multiplication. Finally, we analyze the largest database of curves and addition formulas, with over 14,000 combinations, and provide the first classification of their resistance against the ZVP attack.
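For readers unfamiliar with the attack surface, the following toy Python sketch shows the interleaving (Straus-Shamir) algorithm for multiscalar multiplication that the $63\%$ key-recovery figure targets; the tiny curve $y^2 = x^3 + 7 \bmod 10007$ is an illustrative stand-in for secp256k1, and the code is deliberately not side-channel hardened — the intermediate values it computes are exactly what a ZVP attacker probes.

```python
P_MOD, A, B = 10007, 0, 7                 # toy curve y^2 = x^3 + 7 mod p

def add(p1, p2):
    """Affine point addition; None is the point at infinity."""
    if p1 is None: return p2
    if p2 is None: return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def mul(k, pt):                           # plain double-and-add, for checking
    r = None
    while k:
        if k & 1: r = add(r, pt)
        pt = add(pt, pt); k >>= 1
    return r

def interleave(u1, P, u2, Q):
    """Straus-Shamir: one shared doubling per scalar bit."""
    R = None
    for i in range(max(u1.bit_length(), u2.bit_length()) - 1, -1, -1):
        R = add(R, R)                     # shared doubling step
        if (u1 >> i) & 1: R = add(R, P)   # conditional add of P
        if (u2 >> i) & 1: R = add(R, Q)   # conditional add of Q
    return R

# pick any point: 10007 % 4 == 3, so square roots are one exponentiation
def lift(x):
    rhs = (x ** 3 + A * x + B) % P_MOD
    y = pow(rhs, (P_MOD + 1) // 4, P_MOD)
    return (x, y) if y * y % P_MOD == rhs else None

P = next(pt for pt in map(lift, range(1, P_MOD)) if pt)
Q = mul(3, P)
assert interleave(5, P, 7, Q) == add(mul(5, P), mul(7, Q))
```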
Further Improvements in AES Execution over TFHE: Towards Breaking the 1 sec Barrier
Making the most of TFHE's advanced capabilities, such as programmable or circuit bootstrapping and their generalizations, for manipulating data larger than the native plaintext domain of the scheme is a very active line of research. In this context, AES is a particularly interesting benchmark, both as an example of a nontrivial algorithm that has eluded ``practical'' FHE execution performance for years, and because it will most likely be selected by NIST as a flagship reference in its upcoming call on threshold (homomorphic) cryptography. Since 2023, the algorithm has thus been the subject of renewed attention from the FHE community and has served as a playground to test advanced operators following the LUT-based approach, $p$-encodings, or several variants of circuit bootstrapping, each time leading to further timing improvements. Still, AES is also interesting as a benchmark because of the tension between boolean- and byte-oriented operations within the algorithm. In this paper, we resolve this tension by proposing a new approach, coined ``\hippo'', which consistently combines the (byte-oriented) LUT-based approach with a generalization of the (boolean-oriented) $p$-encodings one to get the best of both worlds. In doing so, we obtain the best timings so far, bringing a single-core execution of the algorithm over TFHE from $46$ down to $32$ seconds and approaching the $1$ second barrier with only a mild amount of parallelism. We also stress that all the timings reported in this paper are consistently obtained on the same machine, which is often not the case in previous studies. Lastly, we emphasize that the techniques we develop are applicable beyond just AES, since the boolean-byte tension is a recurrent issue when running algorithms over TFHE.
XBOOT: Free-XOR Gates for CKKS with Applications to Transciphering
The CKKS scheme is traditionally recognized for approximate homomorphic encryption of real numbers, but BLEACH (Drucker et al., JoC 2024) extends its capabilities to handle exact computations on binary or small integer numbers.
Despite this advancement, BLEACH's approach of simulating XOR gates via $(a-b)^2$ incurs one multiplication per gate, which is computationally expensive in homomorphic encryption. To address this, we introduce XBOOT, a new framework built upon BLEACH's blueprint that allows for almost-free evaluation of XOR gates. The core idea of XBOOT is lazy reduction: XOR operations are simulated with the less costly addition operation, $a+b$, leaving the management of potential overflows to later stages. We carefully handle the modulus chain and scale factors to ensure that the overflows are conveniently rounded away during the CKKS bootstrapping phase at no extra cost. We use AES-CKKS transciphering as a benchmark to test the capability of XBOOT, and achieve a throughput exceeding one kilobyte per second, a $2.5\times$ improvement over the state of the art (Aharoni et al., HES 2023). Moreover, XBOOT enables the practical execution of tasks with extensive XOR operations that were previously challenging for CKKS. For example, we can do Rasta-CKKS transciphering at over two kilobytes per second, more than $10\times$ faster than the baseline without XBOOT.
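The cost difference between the two XOR encodings can be seen already on plaintext integers. The following sketch (ordinary Python arithmetic, no actual CKKS) contrasts BLEACH's one-multiplication-per-gate $(a-b)^2$ simulation with XBOOT-style lazy addition, where a single mod-2 reduction at the end stands in for the rounding absorbed by bootstrapping:

```python
bits = [1, 0, 1, 1, 0, 1, 1]               # plaintext stand-ins for encrypted bits

# BLEACH-style chain: one squaring (a homomorphic multiplication) per gate
acc, mults = bits[0], 0
for b in bits[1:]:
    acc = (acc - b) ** 2                   # XOR(a, b) = (a - b)^2 on bits
    mults += 1
print("product-style XOR:", acc, "using", mults, "multiplications")

# XBOOT-style lazy reduction: only additions, one cleanup at the end
lazy = sum(bits)                           # cheap homomorphic additions
print("lazy XOR:", lazy % 2, "using 0 multiplications")  # reduced at bootstrap
assert acc == lazy % 2
```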
Conditional Constant Function Problem and Its Quantum Solutions: Attacking Feistel Ciphers
In this paper, we define the conditional constant function problem (CCFP) and, for a special case of CCFP, propose a quantum algorithm that solves it efficiently. This algorithm enables new evaluations of the quantum security of Feistel block ciphers in the setting where a quantum adversary can only make online queries classically, which is relatively realistic. Specifically, we significantly improve the chosen-plaintext key recovery attacks on two Feistel variants known as Feistel-KF and Feistel-FK. For Feistel-KF, we construct a 3-round distinguisher based on the special case of CCFP and propose key recovery attacks for $r>3$ rounds. For Feistel-FK, our CCFP-based distinguisher covers 4 rounds and the key recovery attacks apply for $r>4$ rounds. Utilizing our CCFP-solving algorithm, we reduce the classical memory complexity of our key recovery attacks from the previous exponential $O(2^{cn})$ to $O(1)$.
The query complexity of our key recovery attacks on Feistel-KF is likewise reduced from $O(2^{cn})$ to $O(1)$, where the $c$'s are constants.
Our key recovery results achieve the currently best known complexities.
They also indicate that quantum algorithms solving CCFP could be more promising than those solving the period finding problem.
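For orientation, the sketch below shows the structural difference between the two analyzed variants: the round key is XORed into the branch before the public round function $F$ (Feistel-KF) or into $F$'s output (Feistel-FK). The 16-bit branch width and the function `F` are illustrative stand-ins, not the ciphers' actual components.

```python
N = 16                                     # toy branch width in bits
MASK = (1 << N) - 1

def F(x):
    """Public keyless round function; an arbitrary stand-in."""
    return ((x * 0x9E37) ^ (x >> 7)) & MASK

def feistel(l, r, round_keys, variant):
    for k in round_keys:
        out = F(r ^ k) if variant == "KF" else (F(r) ^ k)   # KF vs FK
        l, r = r, l ^ out                  # standard Feistel swap
    return l, r

print(feistel(0x1234, 0xABCD, [7, 11, 13, 17], "KF"))
print(feistel(0x1234, 0xABCD, [7, 11, 13, 17], "FK"))
```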
PSMT: Private Segmented Membership Test for Distributed Record Linkage
In various real-world situations, a client may need to verify whether specific data elements they possess are part of a set segmented among numerous data holders.
To maintain user privacy, it is essential that both the client's data elements and the data holders' sets remain encrypted throughout the process.
Existing approaches like Private Set Intersection (PSI), Multi-Party PSI (MPSI), Private Segmented Membership Test (PSMT), and Oblivious RAM (ORAM) face challenges in these contexts.
They either require data holders to access the sets in plaintext, result in high latency when aggregating data from multiple holders, risk exposing the identity of the party with the matching element, cause a large communication overhead for multiple-element queries, or lead to high false positives.
This work introduces the primitive of a Private Segmented Membership Test (PSMT) for clients with multiple query elements.
We present a basic protocol for solving PSMT using a threshold variant of approximate-arithmetic homomorphic encryption, addressing the challenges of avoiding information leakage about the party holding the intersection element, minimizing communication overhead for multiple query elements, and preventing false positives for a large number of data holders, all while ensuring IND-CPA$^{\text{D}}$ security.
Our novel approach surpasses current state-of-the-art methods in scalability, supporting significantly more data holders.
This is achieved through a novel summation-based homomorphic membership check rather than a product-based one, as well as various novel ideas addressing technical challenges.
Our new PSMT protocol supports a large number of parties and query elements (up to 4096 parties and 512 queries in experiments) compared to previous methods.
Our experimental evaluation shows that our method can aggregate results from 1024 data holders, each with a set of size $2^{15}$, in 71.2 seconds, and requires only an additional 1.2 seconds per query when processing multiple queries.
We also compare our PSMT protocol to other state-of-the-art PSI and MPSI protocols, as well as to our previous work, and discuss our improvements in usability, with a better privacy model and a larger number of parties and queries.
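The following plaintext analogue (ordinary Python, no homomorphic encryption) illustrates why the summation-based check scales better: holder contributions combine with additions whose depth is independent of the set size, whereas the product-based check multiplies one factor per set element. The segments and query value are made up for the example.

```python
from math import prod

holders = [{3, 9, 27}, {5, 25}, {8, 64, 512}]   # segmented set, one per holder
query = 25

# product-based: zero iff the query is in the union; multiplicative
# depth grows with the total set size
product = prod(query - s for seg in holders for s in seg)

# summation-based: each holder adds a 0/1 indicator; aggregation is a
# shallow chain of additions, independent of set size
total = sum(int(query in seg) for seg in holders)

print(product == 0, total >= 1)                 # both True: 25 is held
```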
The HHE Land: Exploring the Landscape of Hybrid Homomorphic Encryption
Hybrid Homomorphic Encryption (HHE) is considered a promising solution to key challenges that arise when adopting Homomorphic Encryption (HE), such as the communication and computation overhead for clients and the storage overhead for servers, which it addresses by combining symmetric cryptography with HE schemes. However, despite a decade of advancements, enhancing HHE usability, performance, and security for practical applications remains a significant challenge.
This work contributes to the field by presenting a comprehensive analysis of prominent HHE schemes, focusing on their performance and security. We implemented three leading schemes--PASTA, HERA, and Rubato--in the Go programming language and evaluated their performance in a client-server setting. To promote open science and reproducibility, we have made our implementation publicly available on GitHub.
Furthermore, we conducted an extensive study of applicable attacks on HHE schemes, categorizing them under algebraic-based, differential-based, linear-based, and LWE-based attacks. Our security analysis revealed that while most existing schemes meet theoretical security requirements, they remain vulnerable to practical attacks. These findings emphasize the need for improvements in practical security measures, such as defining standardized parameter sets and adopting techniques like noise addition to counter these attacks.
This survey provides insights and guidance for researchers and practitioners to design and develop secure and efficient HHE systems, paving the way for broader real-world adoption.
Beyond Optimal Fault-Tolerance
One of the most basic properties of a consensus protocol is its fault-tolerance--the maximum fraction of faulty participants that the protocol can tolerate without losing fundamental guarantees such as safety and liveness. Because of its importance, the optimal fault-tolerance achievable by any protocol has been characterized in a wide range of settings. For example, for state machine replication (SMR) protocols operating in the partially synchronous setting, it is possible to simultaneously guarantee consistency against $\alpha$-bounded adversaries (i.e., adversaries that control less than an $\alpha$ fraction of the participants) and liveness against $\beta$-bounded adversaries if and only if $\alpha + 2\beta \leq 1$.
This paper characterizes to what extent ``better-than-optimal'' fault-tolerance guarantees are possible for SMR protocols when the standard consistency requirement is relaxed to allow a bounded number $r$ of consistency violations, each potentially leading to the rollback of recently finalized transactions. We prove that bounding rollback is impossible without additional timing assumptions and investigate protocols that tolerate and recover from consistency violations whenever message delays around the time of an attack are bounded by a parameter $\Delta^*$ (which may be arbitrarily larger than the parameter $\Delta$ that bounds post-GST message delays in the partially synchronous model). Here, a protocol's fault-tolerance can be a non-constant function of $r$, and we prove, for each $r$, matching upper and lower bounds on the optimal ``recoverable fault-tolerance'' achievable by any SMR protocol. For example, for protocols that guarantee liveness against 1/3-bounded adversaries in the partially synchronous setting, a 5/9-bounded adversary can always cause one consistency violation but not two, and a 2/3-bounded adversary can always cause two consistency violations but not three. Our positive results are achieved through a generic ``recovery procedure'' that can be grafted onto any accountable SMR protocol and restores consistency following a violation while rolling back only transactions that were finalized in the previous $2\Delta^*$ timesteps.
On Composing Generic Voting Schemes for Improved Privacy
Hybrid encryption provides a way for schemes to distribute trust among many computational assumptions, for instance by composing existing schemes. This is increasingly relevant as quantum computing advances, because it lets us get the best of both worlds: the privacy of post-quantum schemes and the maturity of more battle-tested classical schemes.
We show how to compose members of a very general class of voting schemes and prove that this preserves correctness and integrity and improves privacy compared to its constituent parts.
We also show an example composition using a lattice-based decryption mixnet, where the improvement in privacy can indirectly lead to an improvement in integrity.
Shielded CSV: Private and Efficient Client-Side Validation
Cryptocurrencies allow mutually distrusting users to transact monetary value over the internet without relying on a trusted third party.
Bitcoin, the first cryptocurrency, achieved this through a novel protocol used to establish consensus about an ordered transaction history.
This requires every transaction to be broadcasted and verified by the network, incurring communication and computational costs.
Furthermore, transactions are visible to all nodes of the network, eroding privacy, and are recorded permanently, contributing to increasing storage requirements over time.
To limit resource usage of the network, Bitcoin currently supports an average of 11 transactions per second.
Most cryptocurrencies today still operate in a substantially similar manner.
Private cryptocurrencies like Zcash and Monero address the privacy issue by replacing transactions with proofs of transaction validity.
However, this enhanced privacy comes at the cost of increased communication, storage, and computational requirements.
Client-Side Validation (CSV) is a paradigm that addresses these issues by removing transaction validation from the blockchain consensus rules.
This approach allows sending the coin along with a validity proof directly to its recipient, reducing communication, computation and storage cost.
CSV protocols deployed on Bitcoin today do not fully leverage the paradigm's potential, as they still necessitate the overhead of publishing ordinary Bitcoin transactions.
Moreover, the size of their coin proofs is proportional to the coin's transaction history, and they provide limited privacy.
A recent improvement is the Intmax2 CSV protocol, which writes significantly less data to the blockchain compared to a blockchain transaction and has succinct coin proofs.
In this work, we introduce Shielded CSV, which improves upon state-of-the-art CSV protocols by providing the first construction that offers truly private transactions.
It addresses the issues of traditional private cryptocurrency designs by requiring only 64 bytes of data per transaction, called a nullifier, to be written to the blockchain.
Moreover, for each nullifier in the blockchain, Shielded CSV users only need to perform a single Schnorr signature verification, while non-users can simply ignore this data.
The size and verification cost of coin proofs for Shielded CSV receivers is independent of the transaction history.
Thus, one application of Shielded CSV is adding privacy to Bitcoin at a rate of 100 transactions per second, provided there is an adequate bridging mechanism to the blockchain.
We specify Shielded CSV using the Proof Carrying Data (PCD) abstraction.
We then discuss two implementation strategies that we believe to be practical, based on Folding Schemes and Recursive STARKs, respectively.
Finally, we propose future extensions, demonstrating the power of the PCD abstraction and the extensibility of Shielded CSV.
This highlights the significant potential for further improvements to the Shielded CSV framework and protocols built upon it.
Constant latency and finality for dynamically available DAG
Directed Acyclic Graph (DAG) based protocols have shown great promise for improving the performance of blockchains. The CAP theorem shows that it is impossible for a single system to achieve both liveness (known as dynamic availability) and safety under network partition. This paper explores two types of DAG-based protocols prioritizing liveness or safety, named structured dissemination and Graded Common Prefix (GCP), respectively.
For the former, we introduce the first DAG-based protocol with constant expected latency, providing high-throughput dynamic availability under the sleepy model. Its expected latency is $3\Delta$ and its throughput scales linearly with participation. We validate these expected performance improvements over existing constant-latency sleepy-model BFT protocols by running prototypes of each protocol across multiple machines.
The latter, GCP, is a primitive that provides safety under network partition while being weaker than standard consensus. As a result, we obtain a construction that runs in only $2$ communication steps, as opposed to the $4$ steps of existing low-latency partially synchronous BFT protocols. In addition, GCP can easily avoid relying on a single leader's proposals, making it more resilient to crashes. We also validate these theoretical benefits of GCP experimentally.
We leverage our findings to extend the Ebb-and-Flow framework, in which two BFT sub-protocols allow different types of clients in the same system to prioritize either liveness or safety. Our extension integrates our two types of DAG-based protocols. This yields a hybrid DAG-based protocol with high throughput, dynamic availability, and finality under network partitions, without running a standard consensus protocol twice as required in existing work.
Efficient Homomorphic Integer Computer from CKKS
As Fully Homomorphic Encryption (FHE) enables computation over encrypted data, it is natural to ask how efficiently it handles standard integer computations like $64$-bit arithmetic. It has long been believed that the CGGI/DM family or the BGV/BFV family are the best options, depending on the degree of parallelism. The Cheon-Kim-Kim-Song (CKKS) scheme, although widely used in many applications like machine learning, was not considered a good option, as it is focused on computing over real numbers rather than integers.
Recently, Drucker et al. [J. Cryptol.] suggested using CKKS for discrete computations by separating the error/noise from the discrete message. Since then, there have been several breakthroughs in the discrete variant of CKKS, including the CKKS-style functional bootstrapping of Bae et al. [Asiacrypt'24]. Notably, CKKS-style functional bootstrapping can be regarded as a parallelization of CGGI/DM functional bootstrapping, and it is several orders of magnitude faster in terms of throughput. Building on CKKS-style functional bootstrapping, Kim and Noh [ePrint, 2024/1638] designed an efficient homomorphic modular reduction for CKKS, enabling arithmetic modulo small integers.
Although CKKS is known to be efficient for handling small integers, like $4$ or $8$ bits, it has been unclear whether this efficiency extends to larger integers, like $32$ or $64$ bits. In this paper, we propose a novel method for homomorphic unsigned integer computations. We represent a large integer (e.g., $64$-bit) as a vector of smaller chunks (e.g., $4$-bit) and construct arithmetic operations relying on CKKS-style functional bootstrapping. The proposed scheme supports many of the operations supported in TFHE-rs while outperforming it in terms of amortized running time. Notably, our homomorphic 64-bit multiplication takes $17.9$ms per slot, which is more than three orders of magnitude faster than TFHE-rs.
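The chunked representation is easy to see on plaintext integers. The sketch below performs 64-bit addition on sixteen 4-bit limbs; the digit/carry extraction comments mark the steps that CKKS-style functional bootstrapping would evaluate homomorphically (all values here are in the clear, and the parameters are illustrative).

```python
BASE, LIMBS = 16, 16                      # 4-bit chunks, 16 * 4 = 64 bits

def to_limbs(x):
    return [(x >> (4 * i)) & 0xF for i in range(LIMBS)]

def from_limbs(ls):
    return sum(l << (4 * i) for i, l in enumerate(ls)) & (2**64 - 1)

def add_limbs(a, b):
    out, carry = [], 0
    for x, y in zip(a, b):
        t = x + y + carry
        out.append(t % BASE)              # digit extraction: the step a
        carry = t // BASE                 # functional bootstrap evaluates
    return out                            # final carry dropped: mod 2^64

x, y = 2**63 + 12345, 2**63 + 67890
assert from_limbs(add_limbs(to_limbs(x), to_limbs(y))) == (x + y) % 2**64
```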
Morgana: a laconic circuit builder
We construct a novel SNARK proof system, Morgana. The main property of our system is its small circuit keys, which are proportional in size to the description of the circuit, rather than to the number of constraints.
Previously, a common approach to this problem was to first construct a universal circuit (colloquially known as a zk-VM), and then simulate an application circuit within it. However, this approach introduces significant overhead.
Our system, on the other hand, results in a direct speedup compared to Spartan, the state-of-the-art SNARK for R1CS.
Additionally, small circuit keys enable the construction of zk-VMs from our system through a novel approach: first, outputting a commitment to the circuit key, and second, executing our circuit argument for this circuit key.
SoK: Trusted setups for powers-of-tau strings
Many cryptographic protocols rely upon an initial \emph{trusted setup} to generate public parameters. While the concept is decades old, trusted setups have gained prominence with the advent of blockchain applications utilizing zero-knowledge succinct non-interactive arguments of knowledge (zk-SNARKs), many of which rely on a ``powers-of-tau'' setup. Because such setups feature a dangerous trapdoor which undermines security if leaked, multiparty protocols are used to prevent the trapdoor from being known by any one party. Practical setups utilize an elaborate public ceremony to build confidence that the setup was not subverted. In this paper, we aim to systematize existing knowledge on trusted setups, drawing the distinction between setup \emph{protocols} and \emph{ceremonies}, and shed light on the different features of various approaches. We establish a taxonomy of protocols and evaluate real-world ceremonies based on their design principles, strengths, and weaknesses.
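As background, the following toy Python sketch shows the sequential update structure at the heart of a powers-of-tau setup protocol, with exponentiation in $\mathbb{Z}_p^*$ standing in for elliptic-curve scalar multiplication; real ceremonies additionally require each contributor to publish a proof of correct update, which is omitted here, and the group parameters are toy-sized.

```python
import secrets

p = 2**127 - 1                    # toy group modulus (a Mersenne prime)
q = p - 1                         # order of the exponent group
g = 3

def update(srs, s):
    """A contributor folds its secret s into tau without learning tau:
    the i-th element g^(tau^i) is raised to s^i, giving g^((tau*s)^i)."""
    return [pow(elt, pow(s, i, q), p) for i, elt in enumerate(srs)]

n = 8
srs = [g] * n                     # powers of tau for the initial tau = 1
contributions = []
for _ in range(3):                # three sequential ceremony participants
    s = secrets.randbelow(q - 2) + 2
    contributions.append(s)       # kept here only to check the invariant
    srs = update(srs, s)          # real contributors also publish a proof

# invariant: srs[i] == g^(tau^i), tau the product of all contributions;
# the trapdoor tau stays hidden if any one contributor destroys its s
tau = 1
for s in contributions:
    tau = tau * s % q
assert all(srs[i] == pow(g, pow(tau, i, q), p) for i in range(n))
```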
PunSearch: Enabling Puncturable Encrypted Search over Lattice for Cloud Storage Systems
Searchable encryption (SE) has been widely studied for cloud storage systems, allowing encrypted data to be searched and retrieved. However, existing SE schemes cannot support fine-grained searchability revocation, making them impractical for real applications. Puncturable encryption (PE) [Oakland'15] can revoke the decryption ability of a data receiver for a specific message, which can potentially alleviate this issue. Moreover, the threat of quantum computing remains an important and realistic concern, potentially leading to data privacy leakage in cloud storage systems. Consequently, a post-quantum puncturable encrypted search scheme has remained out of reach. In this paper, we propose PunSearch, the first puncturable encrypted search scheme over lattices for privacy-preserving outsourced data in cloud storage systems. PunSearch provides fine-grained searchability revocation while enjoying quantum safety. Different from existing PE schemes, we construct a novel trapdoor generation mechanism through evaluation algorithms and a lattice pre-image sampling technique. We then design a search permission verification method to revoke the searchability of specific keywords. Furthermore, we formalize a new IND-Pun-CKA security model and use it to analyze the security of PunSearch. A comprehensive performance evaluation indicates that the computational overheads of the Encrypt, Trapdoor, Search, and Puncture algorithms in PunSearch are only 0.06$\times$, 0.005$\times$, 0.05$\times$, and 0.31$\times$ those of prior art in the best cases. These results demonstrate that PunSearch is effective and secure for cloud storage systems.
Treating dishonest ciphertexts in post-quantum KEMs -- explicit vs. implicit rejection in the FO transform
We revisit a basic building block in the endeavor to migrate to post-quantum secure cryptography: Key Encapsulation Mechanisms (KEMs). KEMs enable the establishment of a shared secret key using only public communication. When targeting chosen-ciphertext security against quantum attackers, the go-to method is to design a Public-Key Encryption (PKE) scheme and then apply a variant of the PKE-to-KEM conversion known as the Fujisaki-Okamoto (FO) transform, which we revisit in this work. Intuitively, FO ensures chosen-ciphertext security by rejecting dishonest ciphertexts. This comes in two flavors -- the KEM can reject by returning an 'explicit' failure symbol $\bot$, or instead by returning a pseudo-random key ('implicit' rejection). During the NIST post-quantum standardization process, designers chose implicit rejection, likely due to the availability of security proofs against quantum attackers. On the other hand, implicit rejection introduces complexity and can easily deteriorate into explicit rejection in practice. While it was proven that implicit rejection is no less secure than explicit rejection, the other direction was less clear. This is relevant because the available security proofs against quantum attackers still leave things to be desired. When envisioning future improvements due to, e.g., advancements in quantum proof techniques, having to treat both variants separately creates unnecessary overhead.
In this work, we thus re-evaluate the relationship between the two design approaches and address the so-far unexplored direction: in the classical random oracle model, we show that explicit rejection is not less secure than implicit rejection, up to a rare edge case. This, however, uses the observability of random oracle queries. To lift the proof into the quantum world, we make use of the extractable QROM (eQROM). As an alternative that works without the eQROM, we give an indirect proof that involves a new necessity statement on the involved PKE scheme.
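The two flavors differ only in the final branch of decapsulation, as the following schematic Python sketch shows; `pke_decrypt`/`pke_encrypt` are placeholder hooks rather than a real scheme, and the hash domain-separation labels are illustrative.

```python
import hashlib

def H(*parts):
    h = hashlib.sha256()
    for part in parts:
        h.update(part)
    return h.digest()

def decaps(sk, seed, ct, *, implicit, pke_decrypt, pke_encrypt):
    m = pke_decrypt(sk, ct)                        # may be None on garbage
    # FO re-encryption check: honest ciphertexts use derandomized coins
    if m is not None and pke_encrypt(m, coins=H(b"coins", m)) == ct:
        return H(b"key", m, ct)                    # shared key, honest case
    if implicit:
        return H(b"reject", seed, ct)              # pseudorandom reject key
    return None                                    # explicit reject: symbol ⊥
```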
CAPSS: A Framework for SNARK-Friendly Post-Quantum Signatures
In this paper, we present a general framework for constructing SNARK-friendly post-quantum signature schemes based on minimal assumptions, specifically the security of an arithmetization-oriented family of permutations. The term "SNARK-friendly" here refers to the efficiency of the signature verification process in terms of SNARK constraints, such as R1CS or AIR constraints used in STARKs. Within the CAPSS framework, signature schemes are designed as proofs of knowledge of a secret preimage of a one-way function, where the one-way function is derived from the chosen permutation family. To obtain compact signatures with SNARK-friendly verification, our primary goal is to achieve a hash-based proof system that is efficient in both proof size and arithmetization of the verification process.
To this end, we introduce SmallWood, a hash-based polynomial commitment and zero-knowledge argument scheme tailored to statements arising in this context. The SmallWood construction leverages techniques from Ligero, Brakedown, and Threshold-Computation-in-the-Head (TCitH) to achieve proof sizes that outperform the state of the art in hash-based zero-knowledge proof systems for witness sizes ranging from $2^5$ to $2^{16}$.
From the SmallWood proof system and further optimizations for SNARK-friendliness, the CAPSS framework offers a generic transformation of any arithmetization-oriented permutation family into a SNARK-friendly post-quantum signature scheme. We provide concrete instances built on permutations such as Rescue-Prime, Poseidon, Griffin, and Anemoi. For the Anemoi family, achieving 128-bit security, our approach produces signatures of sizes ranging from 9 to 13.3 KB, with R1CS constraints between 19K and 29K. This represents a 4-6$\times$ reduction in signature size and a 5-8$\times$ reduction in R1CS constraints compared to Loquat (CRYPTO 2024), a SNARK-friendly post-quantum signature scheme based on the Legendre PRF.
SoK: Multiparty Computation in the Preprocessing Model
Multiparty computation (MPC) allows a set of mutually distrusting parties to compute a function over their inputs while keeping those inputs private. Most recent MPC protocols that are ready for real-world applications are based on the so-called preprocessing model, where the MPC is split into two phases: a preprocessing phase, in which raw material, independent of the inputs, is produced; and an online phase, which can be computed efficiently, consuming this preprocessed material once the inputs become available. However, the sheer number of protocols following this paradigm makes it difficult to navigate the literature. Our work aims to systematize the existing literature, (1) to make it easier for protocol developers to choose the most suitable preprocessing protocol for their application scenario, and (2) to identify research gaps and give pointers for future work. We devise two main categories for the preprocessing model, which we term traditional and special preprocessing, where the former refers to preprocessing for general-purpose functions and the latter to preprocessing for specific functions. We further systematize the protocols based on the underlying cryptographic primitive they use, the mathematical structure they are based on, and, for special preprocessing protocols, their target function. For each of the 41 presented protocols, we give the intuition behind their main technical contribution, and we analyze their security properties and relative performance.
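As a concrete example of "raw material" in the traditional preprocessing category, the sketch below runs the classic Beaver-triple multiplication for two parties with additive sharing modulo a prime: the triple is produced offline, independent of the inputs, and the online phase uses only openings and linear operations (the modulus and values are illustrative).

```python
import secrets

q = 2**61 - 1                             # prime modulus for additive sharing

def share(v):
    r = secrets.randbelow(q)
    return [r, (v - r) % q]               # party 0 and party 1 shares

# --- offline (preprocessing): one triple (a, b, c = a*b), input-independent
a, b = secrets.randbelow(q), secrets.randbelow(q)
A, B, C = share(a), share(b), share(a * b % q)

# --- online: multiply secret-shared x and y by consuming the triple
x, y = 1234567, 7654321
X, Y = share(x), share(y)
d = (X[0] - A[0] + X[1] - A[1]) % q       # open d = x - a
e = (Y[0] - B[0] + Y[1] - B[1]) % q       # open e = y - b
Z = [(C[i] + d * B[i] + e * A[i]) % q for i in range(2)]
Z[0] = (Z[0] + d * e) % q                 # exactly one party adds d*e
assert (Z[0] + Z[1]) % q == x * y % q     # shares of the product x*y
```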
Fair Signature Exchange
We introduce the concept of Fair Signature Exchange (FSE). FSE enables a client to obtain signatures on multiple messages in a fair manner: the client receives all signatures if and only if the signer receives an agreed-upon payment. We formalize security definitions for FSE and present a practical construction based on the Schnorr signature scheme, avoiding computationally expensive cryptographic primitives such as SNARKs. Our scheme imposes minimal overhead on the Schnorr signer and verifier, leaving the signature verification process unchanged from standard Schnorr signatures. Fairness is enforced using a blockchain as a trusted third party, while exchanging only a constant amount of information on-chain regardless of the number of signatures exchanged. We demonstrate how to construct a batch adaptor signature scheme using FSE, and our Schnorr-based FSE construction yields an efficient batch Schnorr adaptor signature scheme for the discrete logarithm problem. We implemented our scheme to show that it has negligible overhead compared to standard Schnorr signatures. For instance, exchanging $2^{10}$ signatures on the Vesta curve takes approximately $80$ms for the signer and $300$ms for the verifier, with almost no overhead for the signer and a $2\times$ overhead for the verifier compared to the original Schnorr protocol. Additionally, we propose an extension to blind signature exchange, where the signer does not learn the messages being signed. This is achieved through a natural adaptation of blinded Schnorr signatures.
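For intuition about the object being batched, here is a toy sketch of a single Schnorr adaptor signature over $\mathbb{Z}_p^*$ (an illustrative stand-in for the Vesta curve, with toy parameters and simplified hashing): whoever knows the adaptor secret $t$ can complete the pre-signature, and the completed signature reveals $t$ — the hook that ties signatures to payment.

```python
import hashlib, secrets

p = 2**127 - 1                    # toy Schnorr group Z_p^* (Mersenne prime)
q = p - 1
g = 3

def Hn(*ints):
    data = b"|".join(i.to_bytes(16, "big") for i in ints)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

x = secrets.randbelow(q); P = pow(g, x, p)        # signer's key pair
t = secrets.randbelow(q); T = pow(g, t, p)        # adaptor secret and point
m = 31337                                          # message, as an integer

# signer: pre-signature whose effective nonce is shifted by T
k = secrets.randbelow(q)
R = pow(g, k, p)
Rt = R * T % p                                     # effective nonce R*T
e = Hn(Rt, P, m)
s_pre = (k + e * x) % q                            # the pre-signature

# holder of t completes it into an ordinary Schnorr signature (Rt, s)
s = (s_pre + t) % q
assert pow(g, s, p) == Rt * pow(P, e, p) % p       # standard verification

# once (Rt, s) is published, the signer extracts the payment secret
assert (s - s_pre) % q == t
```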
Skyscraper: Fast Hashing on Big Primes
Arithmetic hash functions defined over prime fields have been actively developed and used in verifiable computation (VC) protocols. Among those, elliptic-curve-based SNARKs require large (\(256\)-bit and higher) primes. Such hash functions are notably slow, losing a factor of up to \(1000\) compared to regular constructions like SHA-2/3.
In this paper, we present the hash function $\textsf{Skyscraper}$, which is aimed at large prime fields and provides major improvements over $\texttt{Reinforced Concrete}$ and $\texttt{Monolith}$. First, the design is exactly the same for all large primes, which simplifies analysis and deployment. Second, it achieves performance comparable to cryptographic hash standards by using low-degree non-invertible transformations and minimizing modular reductions. Concretely, it hashes two \(256\)-bit prime field (BLS12-381 curve scalar field) elements in \(135\) nanoseconds, whereas SHA-256 needs \(42\) nanoseconds on the same machine.
The low circuit complexity of $\textsf{Skyscraper}$, together with its high native speed, should allow a substantial cost reduction in many VC scenarios, particularly in recursive proofs.
Trustless Bridges via Random Sampling Light Clients
The increasing number of blockchain projects introduced annually has led to a pressing need for secure and efficient interoperability solutions. Currently, the lack of such solutions forces end-users to rely on centralized intermediaries, contradicting the core principles of decentralization and trust minimization in blockchain technology. In this paper, we propose a decentralized and efficient interoperability solution (aka a Bridge Protocol) that operates without additional trust assumptions, relying solely on the Byzantine Fault Tolerance (BFT) of the two chains being connected. In particular, relayers (actors that exchange messages between networks) are permissionless and decentralized, hence eliminating any single point of failure. We introduce Random Sampling, a novel technique for on-chain light clients to efficiently follow the history of PoS blockchains by reducing the number of signature verifications required. Here, the randomness is drawn on-chain, for example using Ethereum's RANDAO. We analyze the security of the bridge from a crypto-economic perspective and provide a framework to derive the security parameters. This includes handling subtle concurrency issues and randomness bias in strawman designs. While the protocol is applicable to various PoS chains, we demonstrate its feasibility by instantiating a bridge between Polkadot and Ethereum (currently deployed), and discuss some practical security challenges. We also evaluate the efficiency (gas costs) of an on-chain light-client verifier implemented as a smart contract on Ethereum against SNARK-based approaches. Even for large validator set sizes (up to $10^6$), the signature verification gas costs of our light-client verifier are an order of magnitude lower.
Pre-sieve, Partial-guess, and Accurate estimation: Full-round Related-key Impossible Boomerang Attack on ARADI
The impossible boomerang attack is a very powerful attack, and existing results show that it is more effective than the impossible differential attack in the related-key scenario. However, in the current key recovery process, the details of the block cipher are ignored, only fixed keys are pre-guessed, and the time complexity of the early-abort technique is only roughly estimated. These limitations are obstacles to the broader application of the impossible boomerang attack. In this paper, we propose the pre-sieving technique, the partial pre-guess key technique, and the precise complexity evaluation technique. With the pre-sieving technique, we capitalize on specific features of both the linear and nonlinear layers to expeditiously filter out impossible quartets at the earliest possible stage. With the partial pre-guess key technique, we can selectively determine the keys that require guessing according to our requirements. Moreover, the precise complexity evaluation technique empowers us to explicitly compute the complexity associated with each step of the attack. We integrate these techniques and use them to attack ARADI, a low-latency block cipher proposed by the NSA (National Security Agency) in 2024 for the purpose of memory encryption. Ultimately, we achieve the first full-round attack, with a data complexity of $2^{130}$, a time complexity of $2^{254.81}$, and a memory complexity of $2^{252.14}$. None of the previous key recovery methods attain such an outcome, demonstrating the high efficacy of our new techniques.
Hash-Based Multi-Signatures for Post-Quantum Ethereum
With the threat posed by quantum computers on the horizon, systems like Ethereum must transition to cryptographic primitives resistant to quantum attacks. One of the most critical of these primitives is the non-interactive multi-signature scheme used in Ethereum's proof-of-stake consensus, currently implemented with BLS signatures. This primitive enables validators to independently sign blocks, with their signatures then publicly aggregated into a compact aggregate signature.
In this work, we introduce a family of hash-based signature schemes as post-quantum alternatives to BLS. We consider the folklore method of aggregating signatures via (hash-based) succinct arguments, and our work focuses on instantiating the underlying signature scheme. The proposed schemes are variants of the XMSS signature scheme, analyzed within a novel and unified framework. While generic, this framework is designed to minimize security loss, facilitating efficient parameter selection. A key feature of our work is the avoidance of random oracles in the security proof. Instead, we define explicit standard-model requirements for the underlying hash functions. This eliminates the paradox of simultaneously treating hash functions as random oracles and as explicit circuits for aggregation. Furthermore, it provides cryptanalysts with clearly defined targets for evaluating the security of hash functions. Finally, we provide recommendations for practical instantiations of hash functions and concrete parameter settings, supported by known and novel heuristic bounds on the standard-model properties.
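For context, the sketch below shows the Merkle-tree skeleton that XMSS-style schemes share: leaves commit to one-time public keys, the root is the compact master public key, and verification recomputes the root from a leaf and its authentication path. Hashing here is plain SHA-256; the real schemes add one-time signatures, tweakable hash functions, and the SNARK-based aggregation discussed above.

```python
import hashlib

def H(*parts):
    return hashlib.sha256(b"".join(parts)).digest()

def merkle_root(leaves):
    level = [H(l) for l in leaves]
    while len(level) > 1:
        level = [H(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def auth_path(leaves, idx):
    level, path = [H(l) for l in leaves], []
    while len(level) > 1:
        path.append(level[idx ^ 1])       # sibling at this level
        level = [H(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        idx //= 2
    return path

def verify(root, leaf, idx, path):
    node = H(leaf)
    for sib in path:
        node = H(node, sib) if idx % 2 == 0 else H(sib, node)
        idx //= 2
    return node == root

ots_pks = [f"one-time-pk-{i}".encode() for i in range(8)]
root = merkle_root(ots_pks)               # the compact master public key
assert verify(root, ots_pks[5], 5, auth_path(ots_pks, 5))
```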
Doubly Efficient Fuzzy Private Set Intersection for High-dimensional Data with Cosine Similarity
Fuzzy private set intersection (fuzzy PSI) is a cryptographic protocol for privacy-preserving similarity matching, one of the essential operations in various real-world applications such as facial authentication, information retrieval, or recommendation systems. Despite recent advancements in fuzzy PSI protocols, a huge barrier still remains to deploying them for these applications. The main obstacle is the high dimensionality, e.g., from 128 to 512, of data; many existing methods, e.g., Garimella et al. (CRYPTO'23, CRYPTO'24) or van Baarsen et al. (EUROCRYPT'24), suffer from exponential overhead in communication and/or computation cost. In addition, the dominant similarity metric in these applications is cosine similarity, which rules out several optimization tricks based on assumptions about the distribution of data, e.g., techniques by Gao et al. (ASIACRYPT'24). In this paper, we propose a novel fuzzy PSI protocol for cosine similarity, called FPHE, that overcomes these limitations simultaneously. FPHE features linear complexity in both computation and communication with respect to the dimension of set elements, while requiring a much weaker assumption than prior works. Our basic strategy is to homomorphically compute cosine similarity and run an approximated comparison function, with a clever packing method for efficiency. In addition, we introduce a novel proof technique to harmonize the approximation error from the sign function with noise flooding, proving the security of FPHE in the semi-honest model. Moreover, we show that our construction can be extended to support various functionalities, such as labeled or circuit fuzzy PSI. Through experiments, we show that FPHE can perform fuzzy PSI over 512-dimensional data in a few minutes, which was computationally infeasible for all previous proposals under the same assumption as ours.
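The two homomorphic steps have a simple plaintext analogue, sketched below: cosine similarity as an inner product of pre-normalized vectors, and comparison against a threshold via the low-degree iterate $f(x) = (3x - x^3)/2$, a standard HE-friendly approximation of $\mathrm{sign}(x)$ on $[-1,1]$. The threshold, dimension, and input scaling are illustrative, and FPHE's actual comparison polynomial may differ.

```python
import math, random

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def approx_sign(x, iters=12):
    """Iterate f(x) = (3x - x^3)/2, converging to sign(x) on [-1, 1]."""
    for _ in range(iters):
        x = (3 * x - x ** 3) / 2
    return x

random.seed(1)
a = normalize([random.gauss(0, 1) for _ in range(512)])
b = normalize([ai + random.gauss(0, 0.1) for ai in a])   # near-duplicate of a

cos = sum(x * y for x, y in zip(a, b))    # inner product of unit vectors
theta = 0.9                                # illustrative match threshold
# scale (cos - theta) into [-1, 1] before the polynomial comparison
verdict = approx_sign((cos - theta) / 2)
print(round(cos, 3), round(verdict, 3))    # verdict ~ +1 means "match"
```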
Founding Zero-Knowledge Proofs of Training on Optimum Vicinity
Zero-knowledge proofs of training (zkPoT) allow a party to prove that a model is trained correctly on a committed dataset without revealing any additional information about the model or the dataset. Existing zkPoT protocols prove the entire training process in zero knowledge; i.e., they prove that the final model was obtained in an iterative fashion starting from the training data and a random seed (and potentially other parameters) and applying the correct algorithm at each iteration. This approach inherently requires the prover to perform work linear to the number of iterations.
In this paper, we take a different approach to proving the correctness of model training. Our approach is motivated by efficiency, but also, more urgently, by the observation that the prover's ability to pick the random seed used for training introduces the potential for it to bias the model. In other words, if the input to the training algorithm is biased, the resulting model will be biased even if the prover correctly ran the training algorithm. Rather than prove the correctness of the training process, we thus directly prove the correctness of the trained model using a notion we call optimum vicinity, which bounds the distance between the trained model and the mathematically optimal model, for models that can be viewed as the solution to a convex optimization problem. We show both theoretically and experimentally that this ensures the trained model behaves similarly to the optimal model, and we show this is not true for existing approaches. We also demonstrate significant performance improvements over the existing zkPoT paradigm: the statement proven in ZK in our protocol has a size independent of the number of training iterations, and our Boolean (respectively arithmetic) circuit size is up to $246\times$ (respectively $5\times$) smaller than that of a baseline zkPoT protocol that verifies the whole training process.
Separating Broadcast from Cheater Identification
Secure Multiparty Computation (MPC) protocols that achieve Identifiable Abort (IA) guarantee that, whenever honest parties are denied output, they are notified of the identity of at least one corrupt party responsible for the abort. Cheater identification provides recourse in the event of a protocol failure, and in some cases can even be desired over Guaranteed Output Delivery. However, protocols in the literature typically make use of broadcast as a necessary tool for identifying cheaters. In many deployments, the broadcast channel itself may be the most expensive component.
In this work, we investigate if it is inherent that MPC with IA should bear the full complexity of broadcast. As the implication of broadcast from IA has been established in previous work, we relax our target to circumvent this connection: we allow honest parties to differ in which cheaters they identify, nonetheless retaining the ability to prove claims of cheating to an auditor.
We show that in the honest majority setting, our notion of Provable Identifiable Selective Abort (PISA) can be achieved without a traditional broadcast channel. Indeed, broadcast in this setting---which we term Broadcast with Selective Identifiable Abort (BC-IA)---is achievable in only two point-to-point rounds with a simple echoing technique. On the negative side, we also prove that BC-IA is impossible to achieve in the dishonest majority setting.
As a general result, we show that any MPC protocol that achieves IA with $r$ broadcasts can be compiled into one that achieves PISA with $2(r+1)$ point-to-point rounds. As a practical application of this methodology, we design, implement, and benchmark a six-round honest-majority threshold ECDSA protocol that achieves PISA and can be deployed in any environment with point-to-point communication alone.
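To give a flavor of the echoing idea (this is an illustrative toy, not the paper's BC-IA protocol), the sketch below lets an equivocating sender be caught by some parties while others still deliver, matching the "selective" relaxation in which honest parties may differ in whom they identify.

```python
from collections import Counter

def bc_ia(sent):
    """sent[i] = value a (possibly equivocating) sender gave party i.
    Round 2: parties echo what they received; each delivers the majority
    value, and a party whose own copy conflicts blames the sender."""
    n = len(sent)
    value, votes = Counter(sent).most_common(1)[0]   # tally of echoes
    if votes <= n // 2:
        return [(None, True)] * n                    # no majority: all blame
    return [(value, sent[i] != value) for i in range(n)]

print(bc_ia(["v", "v", "v", "v"]))   # honest sender: all deliver, none blame
print(bc_ia(["v", "v", "v", "w"]))   # equivocation: party 3 blames the sender
```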
Black-Box Registered ABE from Lattices
This paper presents the first black-box registered ABE for circuits from lattices. Selective security is based on the evasive LWE assumption [EUROCRYPT'22, CRYPTO'22]. The only prior Reg-ABE scheme from lattices is a non-black-box construction based on function-binding hash functions and witness encryption [CRYPTO'23]. Technically, we first extend the black-box registration-based encryption from standard LWE [CRYPTO'23] so that a public key can be registered together with a function; this yields an LWE-based Reg-ABE with ciphertexts of size $L \cdot \mathsf{polylog}(L)$, where $L$ is the number of users. We then exploit the special structure of its ciphertext to reduce its size to $\mathsf{polylog}(L)$ via an algebraic obfuscator based on evasive LWE [CRYPTO'24].
Cryptojacking detection using local interpretable model-agnostic explanations
Cryptojacking, the unauthorised use of computing resources to mine cryptocurrency, has emerged as a critical threat in today’s digital landscape. These attacks not only compromise system integrity but also result in increased costs, reduced hardware lifespan, and heightened network security risks. Early and accurate detection is essential to mitigate the adverse effects of cryptojacking. This study focuses on developing a semi-supervised machine learning (ML) approach that leverages an autoencoder for feature extraction and a random forest (RF) model for classification. The objective is to enhance cryptojacking detection while maintaining a balance between accuracy and interpretability. The proposed methodology is further enhanced with explainable artificial intelligence (XAI) techniques such as local interpretable model-agnostic explanations (LIME) to offer insights into model predictions. Results from datasets such as UGRansome and BitcoinHeist indicate that the semi-supervised approach achieves accuracy rates ranging from 70% to 99%. The study demonstrates that the proposed model provides an efficient, interpretable, and scalable solution for real-time cryptojacking detection across various scenarios.
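A minimal sketch of the pipeline's shape, on synthetic data rather than the UGRansome/BitcoinHeist features: an autoencoder bottleneck (here a small sklearn MLP trained to reconstruct its input) supplies compressed features to a random forest. LIME explanations would be layered on the trained forest afterwards, e.g., via the lime package's LimeTabularExplainer; all sizes and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))                    # synthetic "traffic" features
y = (X[:, :3].sum(axis=1) + 0.3 * rng.normal(size=2000) > 0).astype(int)

# autoencoder: learn X -> X, then reuse the 8-unit bottleneck as features
ae = MLPRegressor(hidden_layer_sizes=(8,), max_iter=800, random_state=0)
ae.fit(X, X)
encode = lambda A: np.maximum(A @ ae.coefs_[0] + ae.intercepts_[0], 0)  # ReLU layer

Xtr, Xte, ytr, yte = train_test_split(encode(X), y, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(Xtr, ytr)
print("held-out accuracy:", rf.score(Xte, yte))
# LIME would then explain rf's predictions on individual encoded rows
```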
On the gap between terms in an addition chain
In this paper, we study the distribution of the \textit{gap} between terms in an addition chain. In particular, we show that if $1,2,\ldots,s_{\delta(n)}=n$ is an addition chain of length $\delta(n)$ leading to $n$, then $$\underset{1\leq l\leq \delta(n)}{\mathrm{sup}}(s_{l+k}-s_l)\gg k\frac{n}{\delta(n)}$$ and $$\underset{1\leq l\leq \delta(n)}{\mathrm{inf}}(s_{l+k}-s_l)\ll k\frac{n}{\delta(n)}$$ for fixed $k\geq 1$.
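A small numeric companion (illustrative only): the script below builds the binary-method addition chain for an $n$, so $\delta(n)$ here is the length of this particular chain rather than the optimal one, and inspects the gaps $s_{l+k}-s_l$ against the scale $k\,n/\delta(n)$ from the statement.

```python
def binary_chain(n):
    """Addition chain from the binary (double-and-add) method."""
    chain = [1]
    for bit in bin(n)[3:]:                # bits after the leading 1
        chain.append(2 * chain[-1])       # doubling step
        if bit == "1":
            chain.append(chain[-1] + 1)   # add-1 step (1 is in the chain)
    return chain

n = 1000003
chain = binary_chain(n)
assert chain[-1] == n
delta = len(chain) - 1                    # length of *this* chain
k = 1
gaps = [chain[l + k] - chain[l] for l in range(len(chain) - k)]
print(delta, min(gaps), max(gaps), round(k * n / delta, 1))
```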
ABLE: Optimizing Mixed Arithmetic and Boolean Garbled Circuit
Privacy and security have become critical priorities in many scenarios. Privacy-preserving computation (PPC) is a powerful solution that allows functions to be computed directly on encrypted data. Garbled circuit (GC) is a key PPC technology that enables secure, confidential computing. GC comes in two forms: Boolean GC supports all operations by expressing functions as logic circuits; arithmetic GC is a newer technique to efficiently compute a set of arithmetic operations like addition and multiplication. Mixed GC combines both Boolean and arithmetic GC, in an attempt to optimize performance by computing each function in its most natural form. However, this introduces additional costly conversions between the two forms. It remains unclear if and when the efficiency gains of arithmetic GC outweigh these conversion costs.
In this paper, we present Arithmetic Boolean Logic Exchange (ABLE) for Garbled Circuit, the first real implementation of mixed GC. ABLE profiles the performance of Boolean and arithmetic GC operations along with their conversions. We assess not only communication but also computation latency, a crucial factor in evaluating the end-to-end runtime of GC. Based on these insights, we propose a method to determine whether it is more efficient to use general Boolean GC or mixed GC for a given application. Rather than implementing both approaches to identify the better solution for each unique case, our method enables users to select the most suitable GC realization early in the design process. This method evaluates whether the benefits of transitioning operations from Boolean to arithmetic GC offset the associated conversion costs. We apply this approach to a neural network task as a case study.
Implementing ABLE reveals opportunities for further GC optimization. We propose a heuristic to reduce the number of primes in arithmetic GC, cutting communication by 14.1% and compute latency by 15.7% in a 16-bit system. Additionally, we optimize mixed GC conversions with a row-reduction technique, achieving a 48.6% reduction in garbled table size for the bit-decomposition operation and a 50% reduction for the bit-composition operation. These improvements reduce communication overhead in streaming GC and free up storage in the GC-with-preprocessing approach. We open-source our code for community use.
Time-Lock Puzzles from Lattices
Time-lock puzzles (TLP) are a cryptographic tool that allow one to encrypt a message into the future, for a predetermined amount of time $T$. At present, we have only two constructions with provable security: One based on the repeated squaring assumption and the other based on obfuscation. Basing TLP on any other assumption is a long-standing question, further motivated by the fact that known constructions are broken by quantum algorithms.
In this work, we propose a new approach to construct time-lock puzzles based on lattices, and therefore with plausible post-quantum security. We obtain the following main results:
* In the preprocessing model, where a one-time public-coin preprocessing is allowed, we obtain a time-lock puzzle with encryption time $\log(T)$.
* In the plain model, where the encrypter does all the computation, we obtain a time-lock puzzle with encryption time $\sqrt{T}$.
Both constructions assume the existence of any sequential function $f$, and the hardness of the circular small-secret learning with errors (LWE) problem. At the heart of our results is a new construction of succinct randomized encodings (SRE) for $T$-folded repeated circuits, where the complexity of the encoding is $\sqrt{T}$. This is the first construction of SRE where the overall complexity of the encoding algorithm is sublinear in the runtime $T$, and which is not based on obfuscation. As a direct corollary, we obtain a non-interactive RAM delegation scheme with sublinear complexity (in the number of steps $T$).
The Meta-Complexity of Secret Sharing
A secret-sharing scheme allows the distribution of a secret $s$ among $n$ parties, such that only certain predefined “authorized” sets of parties can reconstruct the secret, while all other “unauthorized” sets learn nothing about $s$. The collection of authorized/unauthorized sets is defined by a monotone function $f: \{0,1\}^n \rightarrow \{0,1\}$. It is known that any monotone function can be realized by a secret-sharing scheme; thus, the smallest achievable \emph{total share size}, $S(f)$, serves as a natural complexity measure.
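As a concrete instance of the definition, the sketch below implements Shamir's $t$-out-of-$n$ scheme, which realizes the monotone threshold function $f(x)=1$ iff $|x|\geq t$ with one field element per party (so the total share size is $n$ field elements; the field and parameters are illustrative).

```python
import secrets

q = 2**61 - 1                              # prime field

def deal(secret, t, n):
    """Shares are evaluations of a random degree-(t-1) polynomial."""
    coeffs = [secret] + [secrets.randbelow(q) for _ in range(t - 1)]
    return [(i, sum(c * pow(i, j, q) for j, c in enumerate(coeffs)) % q)
            for i in range(1, n + 1)]

def reconstruct(shares):                   # Lagrange interpolation at 0
    total = 0
    for i, yi in shares:
        num = den = 1
        for j, _ in shares:
            if j != i:
                num = num * (-j) % q
                den = den * (i - j) % q
        total = (total + yi * num * pow(den, -1, q)) % q
    return total

shares = deal(secret=42, t=3, n=5)
assert reconstruct(shares[:3]) == 42       # any authorized set reconstructs
assert reconstruct(shares[1:4]) == 42
```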
In this paper, we initiate the study of the following meta-complexity question: Given a monotone function $f$, is it possible to efficiently distinguish between cases where the secret-sharing complexity of $f$ is small versus large? We examine this question across several computational models, yielding the following main results.
(Hardness for formulas and circuits): Given a monotone formula $f$ of size $L$, it is coNP-hard to distinguish between ``cheap'' functions, where the maximum share size is 1 bit and the total share size is $O(L^{0.01})$, and ``expensive'' functions, where the maximum share size is $\Omega(\sqrt{L})$ and the total share size is $\Omega(L/\log L)$.
This latter bound nearly matches known secret-sharing constructions yielding a total share size of $L$ bits. For monotone circuits, we strengthen the bound on the expensive case to a maximum share size of $\Omega(L/\log L)$ and a total share size of $\Omega(L^2/\log L)$. These results rule out the existence of instance-optimal compilers that map a formula $f$ to a secret-sharing scheme with complexity polynomially related to $S(f)$.
(Hardness for truth tables): Under cryptographic assumptions, either (1) every $n$-bit slice function can be realized by a $\text{poly}(n)$-size secret-sharing scheme, or (2) given a truth-table representation of $f$ of size $N = 2^n$, it is computationally infeasible to distinguish in time $\text{poly}(N)$ between cases where $S(f) = \text{poly}(n)$ and $S(f) = n^{\omega(1)}$. Option (1) would be considered a breakthrough result, as the best-known construction for slices has a sub-exponential complexity of $2^{\tilde{O}(\sqrt{n})}$ (Liu, Vaikuntanathan, and Wee; Eurocrypt 2018). Our proof introduces a new worst-case-to-average-case reduction for slices, which may be of independent interest.
(Hardness for graphs): We examine the simple case where $f$ is given as a 2-DNF, represented by a graph $G$ whose edges correspond to 2-terms, and ask whether it is possible to distinguish between cases where the share size is constant and those where the share size is large, say $\Omega(\log n)$. We establish several connections between this question and questions in communication complexity. For instance, we show that graphs admitting constant-cost secret sharing form a subclass of graphs with constant randomized communication complexity and constant-size adjacency sketches (Harms, Wild, and Zamaraev; STOC 2022). We leverage these connections to establish new lower bounds for specific graph families, derive a combinatorial characterization of graphs with constant-size linear secret-sharing schemes, and show that a natural class of myopic algorithms fails to distinguish cheap graphs from expensive ones.
IND-CPA$^{\text{C}}$: A New Security Notion for Conditional Decryption in Fully Homomorphic Encryption
Fully Homomorphic Encryption (FHE) allows a server to perform computations directly over encrypted data. In general FHE protocols, the client is tasked with decrypting the computation result using its secret key. However, certain FHE applications benefit from the server knowing this result, especially without the aid of the client. Providing the server with the secret key would allow it to decrypt all the data, including the client's private input. Protocols such as that of Goldwasser et al. (STOC'13) have shown that it is possible to provide the server with the capability of conditional decryption, allowing it to decrypt the result of some pre-defined computation and nothing else. While beneficial for an honest-but-curious server in providing better services, a malicious server may exploit this added capability to perform active attacks on the overall FHE application and leak secret information. Existing security notions fail to capture this scenario, since they assume that only the client has the ability to decrypt FHE ciphertexts. Therefore, in this paper, we propose a new security notion named IND-CPA$^{\text{C}}$ that provides the adversary with access to a conditional decryption oracle. We then show that none of the practical exact FHE schemes are secure under this notion by demonstrating a generic attack that utilizes this restricted decryption capability to recover the underlying errors in the FHE ciphertexts. Since the security guarantee of the underlying (R)LWE hardness problem collapses once these error values leak, we show a full key recovery attack. Finally, we propose a technique to convert any IND-CPA secure FHE scheme into an IND-CPA$^\text{C}$ secure FHE scheme. Our technique utilizes a Succinct Non-Interactive Argument of Knowledge, with which the server is forced to generate a proof of honest computation of the requested function along with the computation result. During decryption, the proof is verified, and decryption proceeds only if the verification holds. Both the verification and decryption steps run inside a Garbled Circuit and thus are not observable or controllable by the server.
Registered ABE and Adaptively-Secure Broadcast Encryption from Succinct LWE
Registered attribute-based encryption (ABE) is a generalization of public-key encryption that enables fine-grained access control to encrypted data (like standard ABE), but without needing a central trusted authority. In a key-policy registered ABE scheme, users choose their own public and private keys and then register their public keys together with a decryption policy with an (untrusted) key curator. The key curator aggregates all of the individual public keys into a short master public key which serves as the public key for an ABE scheme.
Currently, we can build registered ABE for restricted policies (e.g., Boolean formulas) from pairing-based assumptions and for general policies using witness encryption or indistinguishability obfuscation. In this work, we construct a key-policy registered ABE for general policies (specifically, bounded-depth Boolean circuits) from the $\ell$-succinct learning with errors (LWE) assumption in the random oracle model. The ciphertext size in our registered ABE scheme is $\mathsf{poly}(\lambda, d)$, where $\lambda$ is a security parameter and $d$ is the depth of the policy circuit $C$. Notably, this is independent of the length of the attribute $\mathbf{x}$ and is optimal up to the $\mathsf{poly}(d)$ factor.
Previously, the only lattice-based instantiation of registered ABE used witness encryption, which relies on private-coin evasive LWE, a stronger assumption than $\ell$-succinct LWE. Moreover, the ciphertext size in previous registered ABE schemes that support general policies (i.e., from obfuscation or witness encryption) scales with $\mathsf{poly}(\lambda, |\mathbf{x}|, |C|)$. The ciphertext size in our scheme depends only on the depth of the circuit (and not the length of the attribute or the size of the policy). This enables new applications to identity-based distributed broadcast encryption.
Our techniques are also useful for constructing adaptively-secure (distributed) broadcast encryption, and we give the first scheme from the $\ell$-succinct LWE assumption in the random oracle model. Previously, the only lattice-based broadcast encryption scheme with adaptive security relied on witness encryption in the random oracle model. All other lattice-based broadcast encryption schemes only achieved selective security.
SoK: Time to be Selfless?! Demystifying the Landscape of Selfish Mining Strategies and Models
Selfish mining attacks present a serious threat to Bitcoin security, enabling a miner with less than 51% of the network hashrate to gain higher rewards than when mining honestly. A growing body of works has studied the impact of such attacks and presented numerous strategies under a variety of model settings. This led to a complex landscape with conclusions that are often exclusive to certain model assumptions. This growing complexity makes it hard to comprehend the state of the art and distill its impact and trade-offs.
In this paper, we demystify the landscape of selfish mining by presenting a systematization framework of existing studies and evaluating their strategies under realistic model adaptations. To the best of our knowledge, our work is the first of its kind. We develop a multi-dimensional systematization framework assessing prior works based on their strategy formulation and targeted models. We go on to distill a number of insights and gaps, clarifying open questions and understudied areas. Among them, we find that most studies target the block-reward setting, do not account for transaction fees, and generally study the single-attacker case. To bridge this gap, we evaluate many of the existing strategies in the no-block-reward setting---so miners' incentives come solely from transaction fees---for both the single- and multi-attacker scenarios. We also extend their models to include realistic honest-but-rational miners, showing how such adaptations can yield more performant strategy variations. Finally, we touch upon defenses proposed in the literature and discuss connections between selfish mining and relevant incentivized/fee-driven paradigms.
Structural Results for Maximal Quaternion Orders and Connecting Ideals of Prime Power Norm in $B_{p,\infty}$
Fix odd primes $p, \ell$ with $p \equiv 3 \mod 4$ and $\ell \neq p$. Consider the rational quaternion algebra ramified at $p$ and $\infty$ with a fixed maximal order $\mathcal{O}_{1728}$. We give explicit formulae for bases of all cyclic norm $\ell^n$ ideals of $\mathcal{O}_{1728}$ and their right orders, in Hermite Normal Form (HNF). Further, in the case $\ell \mid p+1$, or more generally, $-p$ is a square modulo $\ell$, we derive a parametrization of these bases along paths of the $\ell$-ideal graph, generalising the results of [1]. With such orders appearing as the endomorphism rings of supersingular elliptic curves defined over $\overline{\mathbb{F}_{p}}$, we note several potential applications to isogeny-based cryptography including fast ideal sampling algorithms. We also demonstrate how our findings may lead to further structural observations, by using them to prove a result on the successive minima of endomorphism rings of supersingular curves defined over $\mathbb{F}_p$.
[1] = https://eprint.iacr.org/2025/033
Keyed-Verification Anonymous Credentials with Highly Efficient Partial Disclosure
An anonymous credential (AC) system with partial disclosure allows users to prove possession of a credential issued by an issuer while selectively disclosing a subset of their attributes to a verifier in a privacy-preserving manner. In keyed-verification AC (KVAC) systems, the issuer and verifier share a secret key. Existing KVAC schemes rely on computationally expensive zero-knowledge proofs during credential presentation, with the presentation size growing linearly with the number of attributes. In this work, we propose two highly efficient KVAC constructions that eliminate the need for zero-knowledge proofs during the credential presentation and achieve constant-size presentations.
Our first construction adapts to the KVAC setting the approach of Fuchsbauer et al. (JoC'19), which achieved constant-size credential presentation in a publicly verifiable setting using their proposed structure-preserving signatures on equivalence classes (SPS-EQ) and set commitment schemes. We introduce structure-preserving message authentication codes on equivalence classes (SP-MAC-EQ) and designated-verifier set commitments (DVSC), resulting in a KVAC system with constant-size credentials (2 group elements) and presentations (4 group elements). To avoid the bilinear groups and pairing operations required by SP-MAC-EQ, our second construction uses a homomorphic MAC with a simplified DVSC. While this sacrifices constant-size credentials ($n+2$ group elements, where $n$ is the number of attributes), it retains constant-size presentations (2 group elements) in a pairingless setting.
We formally prove the security of both constructions and provide open-source implementation results demonstrating their practicality. We extensively benchmarked our KVAC protocols and additionally benchmarked the efficiency of our SP-MAC-EQ scheme against the original SPS-EQ scheme, showcasing significant performance improvements.
Bundled Authenticated Key Exchange: A Concrete Treatment of (Post-Quantum) Signal's Handshake Protocol
The Signal protocol relies on a special handshake protocol, formerly X3DH and now PQXDH, to set up secure conversations. Prior analyses of these protocols (or proposals for post-quantum alternatives) have all used models highly tailored to the individual protocols and generally made ad-hoc adaptations to "standard" AKE definitions, making the concrete security attained unclear and hard to compare between similar protocols. Indeed, we observe that some natural Signal handshake protocols cannot be handled by these tailored models. In this work, we introduce Bundled Authenticated Key Exchange (BAKE), a concrete treatment of the Signal handshake protocol. We formally model prekey bundles and states, enabling us to define various levels of security in a unified model. We analyze Signal's classically secure X3DH and harvest-now-decrypt-later-secure PQXDH, and show that they do not achieve what we call optimal security (as is documented). Next, we introduce RingXKEM, a fully post-quantum Signal handshake protocol achieving optimal security; as RingXKEM shares states among many prekey bundles, it could not have been captured by prior models. Lastly, we provide a security and efficiency comparison of X3DH, PQXDH, and RingXKEM.
VDORAM: Towards a Random Access Machine with Both Public Verifiability and Distributed Obliviousness
Verifiable random access machines (vRAMs) serve as a foundational model for expressing complex computations with provable security guarantees, supporting applications in areas such as secure electronic voting, financial auditing, and privacy-preserving smart contracts. However, no existing vRAM provides distributed obliviousness, a critical need in scenarios where multiple provers seek to prevent disclosure against both other provers and the verifiers.
Implementing a publicly verifiable distributed oblivious RAM (VDORAM) presents several challenges. Firstly, the development of VDORAM is hindered by the limited availability of sophisticated publicly verifiable multi-party computation (MPC) protocols. Secondly, the lack of readily available front-end implementations for multi-prover zero-knowledge proofs (ZKPs) poses a significant obstacle to developing practical applications. Finally, directly adapting existing RAM designs to the VDORAM paradigm may prove either impractical or inefficient due to the inherent complexities of reconciling oblivious computation with the generation of publicly verifiable proofs.
To address these challenges, we introduce CompatCircuit, the first multi-prover ZKP front-end implementation to our knowledge. CompatCircuit integrates collaborative zkSNARKs to implement publicly verifiable MPC protocols with rich functionalities beyond those of an arithmetic circuit, enabling the development of multi-prover ZKP applications. Building upon CompatCircuit, we present VDORAM, the first publicly verifiable distributed oblivious RAM. By combining distributed oblivious architectures with verifiable RAM, VDORAM achieves an efficient RAM design that balances communication overhead and proof generation time. We have implemented CompatCircuit and VDORAM in approximately 15,000 lines of code, demonstrating their usability with a practical and efficient implementation. Our performance evaluation reveals that the system retains moderate performance while providing distributed obliviousness.
Cauchyproofs: Batch-Updatable Vector Commitment with Easy Aggregation and Application to Stateless Blockchains
Stateless blockchain designs have emerged to address the challenge of growing blockchain size by utilizing succinct global states. Previous works have developed vector commitments that support proof updates and aggregation to be used as such states. However, maintaining proofs for multiple users still demands significant computational resources, particularly in updating proofs with every transaction. This paper introduces Cauchyproofs, a batch-updatable vector commitment enabling proof-serving nodes to efficiently update proofs in quasi-linear time relative to the number of users and transactions, utilizing an optimized KZG scheme to achieve complexity $O\left(\left(\left|\vec{\alpha}\right| + \left|\vec{\beta}\right|\right) \log^2 (\left|\vec{\alpha}\right| + \left|\vec{\beta}\right|)\right)$, compared to previous $O\left(\left|\vec{\alpha}\right|\cdot|\vec{\beta}|\right)$ approaches. This advancement reduces the computational burden on proof-serving nodes, allowing for efficient proof maintenance across large user groups. We demonstrate that our approach is approximately five times faster than the naive approach at the Ethereum-level block size. Additionally, we present a novel matrix representation for KZG proofs utilizing Cauchy matrices, enabling faster all-proof computations with reduced elliptic curve operations. Finally, we propose an algorithm for history proof query, supporting retrospective proof generation with high efficiency. Our contributions substantially enhance the scalability and practicality of proof-serving nodes in stateless blockchain frameworks.
Forking the RANDAO: Manipulating Ethereum's Distributed Randomness Beacon
Proof-of-stake consensus protocols often rely on distributed randomness beacons (DRBs) to generate randomness for leader selection. This work analyses the manipulability of Ethereum's DRB implementation, RANDAO, in its current consensus mechanism. Despite its efficiency, RANDAO remains vulnerable to manipulation through the deliberate omission of blocks from the canonical chain. Previous research has shown that economically rational players can withhold blocks -- known as a block withholding attack or selfish mixing -- when the manipulated RANDAO outcome yields greater financial rewards.
We introduce and evaluate a new manipulation strategy, the RANDAO forking attack. Unlike block withholding, whereby validators opt to hide a block, this strategy relies on selectively forking out an honest proposer's block to maximize transaction fee revenues and block rewards.
In this paper, we draw attention to the fact that the forking attack is significantly more harmful than selfish mixing for two reasons. Firstly, it exacerbates the unfairness among validators. More importantly, it significantly undermines the reliability of the blockchain for the average user by frequently causing already published blocks to be forked out. In doing so, the attacker can fork the chain without ultimately losing slots, as we demonstrate that these are later fully compensated for. Our empirical measurements, investigating such manipulations on Ethereum mainnet, revealed no statistically significant traces of these attacks to date.
Scalable Post-Quantum Oblivious Transfers for Resource-Constrained Receivers
It is imperative to modernize traditional core cryptographic primitives, such as Oblivious Transfer (OT), to address the demands of the new digital era, where privacy-preserving computations are executed on low-power devices. This modernization is not merely an enhancement but a necessity to ensure security, efficiency, and continued relevance in an ever-evolving technological landscape.
This work introduces two scalable OT schemes: (1) Helix OT, a $1$-out-of-$n$ OT, and (2) Priority OT, a $t$-out-of-$n$ OT. Both schemes provide unconditional security, ensuring resilience against quantum adversaries. Helix OT achieves a receiver-side download complexity of $O(1)$. In big data scenarios, where certain data may be more urgent or valuable, we propose Priority OT. With a receiver-side download complexity of $O(t)$, this scheme allows data to be received based on specified priorities. By prioritizing data transmission, Priority OT ensures that the most important data is received first, optimizing bandwidth, storage, and processing resources. Performance evaluations indicate that Helix OT completes the transfer of 1 out of $n=$ 16,777,216 messages in 9 seconds, and Priority OT handles $t=$ 1,048,576 out of $n$ selections in 30 seconds. Both outperform existing $t$-out-of-$n$ OTs (when $t\geq 1$), underscoring their suitability for large-scale applications. To the best of our knowledge, Helix OT and Priority OT introduce unique advancements that distinguish them from previous schemes.
All-You-Can-Compute: Packed Secret Sharing for Combined Resilience
Unprotected cryptographic implementations are vulnerable to implementation attacks, such as passive side-channel attacks and active fault injection attacks. Recently, countermeasures like polynomial masking and duplicated masking have been introduced to protect implementations against combined attacks that exploit leakage and faults simultaneously.
While duplicated masking requires $O(t \cdot e)$ shares to resist an adversary capable of probing $t$ values and faulting $e$ values, polynomial masking requires only $O(t + e)$ shares, which is particularly beneficial for affine computation.
At CHES'$24$, Arnold et al. showed how to further improve the efficiency of polynomial masking in the presence of combined attacks by embedding two secrets into one polynomial sharing. This essentially halves the complexity of previous constructions. The authors also observed that techniques from packed secret sharing (Grosso et al., CHES'$13$) cannot easily be used to encode an arbitrary number of secrets in one polynomial encoding while achieving combined resilience.
In this work, we resolve these challenges and show that it is possible to embed an arbitrary number of secrets in one encoding and propose gadgets that are secure against combined attacks. We present two constructions that are generic and significantly improve the computational and randomness complexity of existing compilers, such as the laOla compiler presented by Berndt et al. at CRYPTO'$23$ and its improvement by Arnold et al.
For example, for an AES evaluation that protects against $t$ probes and $e$ faults, we improve the randomness complexity of the state-of-the-art construction when $t+e>3$, leading to an improvement of up to a factor of $2.41$.
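To make the packing idea concrete, here is a minimal plaintext sketch (an illustration with toy parameters, making no combined-security claims) of packed Shamir sharing, where several secrets ride on one polynomial: the secrets sit at fixed evaluation points $-1, -2, \ldots$ and shares are evaluations at $1, \ldots, n$.

```python
import random

P = 2**61 - 1  # toy prime field

def interp_eval(points, x):
    """Evaluate at x the unique polynomial through `points`, mod P (Lagrange)."""
    acc = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * ((x - xj) % P) % P
                den = den * ((xi - xj) % P) % P
        acc = (acc + yi * num * pow(den, P - 2, P)) % P
    return acc

def packed_share(secrets, n, deg):
    """Embed k secrets at points -1, ..., -k of one random degree-`deg` poly;
    the n shares are its evaluations at 1, ..., n."""
    k = len(secrets)
    assert n > deg >= k - 1
    pts = [(-(i + 1) % P, s) for i, s in enumerate(secrets)]          # secrets
    pts += [(-(k + 1 + i) % P, random.randrange(P))                   # blinding
            for i in range(deg + 1 - k)]
    return [interp_eval(pts, x) for x in range(1, n + 1)]

shares = packed_share([11, 22], n=7, deg=3)           # two secrets, one sharing
some = [(x, shares[x - 1]) for x in (2, 4, 5, 7)]     # any deg+1 = 4 shares
assert interp_eval(some, -1 % P) == 11                # recover both secrets
assert interp_eval(some, -2 % P) == 22
```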
ZODA: Zero-Overhead Data Availability
We introduce ZODA, short for 'zero-overhead data availability,' which is a protocol for proving that symbols received from an encoding (for tensor codes) were correctly constructed. ZODA has optimal overhead for both the encoder and the samplers. Concretely, the ZODA scheme incurs essentially no incremental costs (in either computation or size) beyond those of the tensor encoding itself. In particular, we show that a slight modification to the encoding scheme for tensor codes allows sampled rows and columns of this modified encoding to become proofs of their own correctness. When used as part of a data availability sampling protocol, both encoders (who encode some data using a tensor code with some slight modifications) and samplers (who sample symbols from the purported encoding and verify that these samples are correctly encoded) incur no incremental communication costs and only small additional computational costs over having done the original tensor encoding. ZODA additionally requires no trusted setup and is plausibly post-quantum secure.
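To illustrate the underlying tensor-code fact (a toy plaintext sketch under assumed parameters, not ZODA itself): encoding the rows and then the columns of a data block with a systematic linear code yields a matrix in which every row and column is itself a codeword, so a sampler can check a received row locally against the code.

```python
import numpy as np

p = 257                  # toy prime field
k, r = 4, 4              # message length and parity length per axis
rng = np.random.default_rng(0)
Pm = rng.integers(0, p, size=(k, r))   # parity map of a systematic code [I | Pm]

def encode(x):           # encode along the last axis: x -> (x, x @ Pm) mod p
    return np.concatenate([x, x @ Pm % p], axis=-1)

X = rng.integers(0, p, size=(k, k))    # data block
full = encode(encode(X).T).T % p       # tensor encoding: rows, then columns

# By linearity, every row of the tensor codeword satisfies the row code's
# parity equations, so sampled rows are self-checking.
for i in range(k + r):
    row = full[i]
    assert np.array_equal(row[k:], row[:k] @ Pm % p)
```

ZODA's modification additionally makes such samples proofs of correct construction of the whole encoding; the sketch above only shows the local-checkability property of linear tensor codes that the scheme builds on.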
Parametrizing Maximal Orders Along Supersingular $\ell$-Isogeny Paths
Suppose you have a supersingular $\ell$-isogeny graph with vertices given by $j$-invariants defined over $\mathbb{F}_{p^2}$, where $p = 4 \cdot f \cdot \ell^e - 1$ and $\ell \equiv 3 \pmod{4}$. We give an explicit parametrization of the maximal orders in $B_{p,\infty}$ appearing as endomorphism rings of the elliptic curves in this graph that are $\leq e$ steps away from a root vertex with $j$-invariant 1728. This is the first explicit parametrization of this kind and we believe it will be an aid in better understanding the structure of supersingular $\ell$-isogeny graphs that are widely used in cryptography. Our method makes use of the inherent directions in the supersingular isogeny graph induced via Bruhat-Tits trees, as studied in [1]. We also discuss how in future work other interesting use cases, such as $\ell=2$, could benefit from the same methodology.
A New Paradigm for Server-Aided MPC
The server-aided model for multiparty computation (MPC) was introduced to capture a real-world scenario where clients wish to off-load the heavy computation of MPC protocols to dedicated servers. A rich body of work has studied various trade-offs between security guarantees (e.g., semi-honest vs malicious), trust assumptions (e.g., the threshold on corrupted servers), and efficiency.
However, all existing works make the assumption that all clients must agree on employing the same servers, and accept the same corruption threshold. In this paper, we challenge this assumption and introduce a new paradigm for server-aided MPC, where each client can choose their own set of servers and their own threshold of corrupted servers. In this new model, the privacy of each client is guaranteed as long as their own threshold is satisfied, regardless of the other servers/clients. We call this paradigm per-party private server-aided MPC to highlight both a security and an efficiency guarantee: (1) per-party privacy, which means that each party gets their own privacy guarantees that depend on their own choice of the servers; (2) per-party complexity, which means that each party only needs to communicate with their chosen servers. Our primary contribution is a new theoretical framework for server-aided MPC. We provide two protocols to show feasibility, but leave it to future work to investigate protocols that focus on concrete efficiency.
Round-Optimal Compiler for Semi-Honest to Malicious Oblivious Transfer via CIH
A central question in the theory of cryptography is whether we can build protocols that achieve stronger security guarantees, e.g., security against malicious adversaries, by combining building blocks that achieve much weaker security guarantees, e.g., security only against semi-honest adversaries; and with the minimal number of rounds. An additional focus is whether these building blocks can be used only as a black-box. Since Oblivious Transfer (OT) is the necessary and sufficient building block to securely realize any two-party (and multi-party) functionality, theoreticians often focus on proving whether maliciously secure OT can be built from a weaker notion of OT.
There is a rich body of literature that provides (black-box) compilers that build malicious OT from OTs that achieve weaker security such as semi-malicious OT and defensibly secure OT, within the minimal number of rounds. However, no round-optimal compiler exists that builds malicious OT from the weakest notion of semi-honest OT, in the plain model.
Correlation intractable hash (CIH) functions are special hash functions whose properties allow instantiating the celebrated Fiat-Shamir transform, and hence reduce the round complexity of public-coin proof systems.
In this work, we devise the first round-optimal compiler from semi-honest OT to malicious OT, by a novel application of CIH for collapsing rounds in the plain model. We provide the following contributions. First, we provide a new CIH-based round-collapsing construction for general cut-and-choose. This gadget can be used generally to prove the correctness of the evaluation of a function. Then, we use our gadget to build the first round-optimal compiler from semi-honest OT to malicious OT.
Our compiler uses the semi-honest OT protocol and the other building blocks in a black-box manner. However, for technical reasons, the underlying CIH construction requires an upper bound on the circuit size of the semi-honest OT protocol used. The need for this upper bound makes our protocol not fully black-box, and hence incomparable with existing, fully black-box, compilers.
Delegated Multi-party Private Set Intersection from Secret Sharing
In this work, we address the problem of Delegated PSI (D-PSI), where a cloud server is introduced to handle most computational and communication tasks. D-PSI enables users to securely delegate their private sets to the cloud, ensuring the privacy of their data while allowing efficient computation of the intersection. The cloud operates under strict security requirements, learning nothing about the individual sets or the intersection result. Moreover, D-PSI minimizes user-to-user communication and supports "silent" processing, where the cloud can perform computations independently without user interaction, apart from set delegation and result retrieval.
We formally define the D-PSI problem and propose a novel construction that extends beyond two-party scenarios to support multi-party settings. Our construction adheres to the D-PSI requirements, including security against semi-honest adversaries, and achieves computational and communication complexities close to the ideal "perfect" D-PSI protocol. Additionally, we demonstrate the practicality of our approach through a baseline implementation and an optimized version that further reduces computational overhead. Our results establish a strong foundation for secure and efficient PSI in real-world cloud computing scenarios.
Highly Efficient Server-Aided Multiparty Subfield VOLE Distribution Protocol
In recent developments in secure multi-party computation (MPC), pseudorandom correlations of the subfield vector oblivious linear evaluation (sVOLE) type have become popular due to their wide applicability in multi-dimensional MPC protocols, such as privacy-preserving biometric identification and privacy-preserving machine learning protocols. In this paper, we introduce a novel way of distributing VOLE in three-party and four-party honest-majority settings with the aid of a trusted server. This new method significantly decreases the communication cost and the memory storage of random sVOLE instances. It also enables a streamlined distribution process that can generate an sVOLE instance of arbitrary length, which results in a 100% utility rate of random sVOLE in multi-dimensional MPC protocols while preserving complete precomputability.
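For concreteness, the correlation being distributed looks as follows. This is a minimal trusted-dealer sketch over toy fields $\mathbb{F}_2 \subset \mathbb{F}_{2^8}$ (an illustration of the sVOLE correlation itself, not the paper's distribution protocol): the sender holds $(\mathbf{u}, \mathbf{v})$ with $\mathbf{u}$ over the subfield, the receiver holds $(\Delta, \mathbf{w})$ over the extension field, and $\mathbf{w} = \Delta \cdot \mathbf{u} + \mathbf{v}$ holds componentwise.

```python
import random

def gf256_mul(a, b, poly=0x11B):   # GF(2^8) multiplication (AES modulus)
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def deal_svole(n):
    """Trusted dealer hands out one length-n subfield VOLE correlation."""
    delta = random.randrange(1, 256)               # receiver's global key
    u = [random.randrange(2) for _ in range(n)]    # sender: subfield (GF(2)) vector
    v = [random.randrange(256) for _ in range(n)]  # sender: extension-field masks
    w = [gf256_mul(delta, ui) ^ vi for ui, vi in zip(u, v)]  # receiver's shares
    return (u, v), (delta, w)

(u, v), (delta, w) = deal_svole(8)
assert all(wi == gf256_mul(delta, ui) ^ vi for ui, vi, wi in zip(u, v, w))
```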
Extending Groth16 for Disjunctive Statements
The two most common ways to design non-interactive zero-knowledge (NIZK) proofs are based on Sigma ($\Sigma$)-protocols (an efficient way to prove algebraic statements) and zero-knowledge succinct non-interactive arguments of knowledge (zk-SNARK) protocols (an efficient way to prove arithmetic statements). However, in cryptocurrency applications such as privacy-preserving credentials, privacy-preserving audits, and blockchain-based voting systems, zk-SNARKs for general statements are usually implemented together with encryption, commitment, or other algebraic cryptographic schemes. Moreover, zk-SNARKs for many different arithmetic statements may also need to be implemented together. A typical solution is to extend the zk-SNARK circuit to include the code for the algebraic part. However, complex cryptographic operations in the algebraic algorithms significantly increase the circuit size, which leads to impractically large proving times and CRS sizes. Thus, we need a sufficiently flexible proof system for composite statements that include both algebraic and arithmetic components. Unfortunately, while the conjunction of zk-SNARKs is relatively natural and numerous effective solutions are currently available (e.g., by utilizing the commit-and-prove technique), the disjunction of zk-SNARKs is rarely discussed in detail.
In this paper, we focus on disjunctive statements for Groth16 and propose a Groth16 variant---CompGroth16---which provides a framework for Groth16 to prove disjunctive statements that consist of a mix of algebraic and arithmetic components. Specifically, we can directly combine CompGroth16 with a $\Sigma$-protocol, or even CompGroth16 with CompGroth16, just like the logical composition of $\Sigma$-protocols. This yields several desirable properties, such as broader expressiveness, better prover efficiency, and a shorter CRS. In addition, for the combination of CompGroth16 and a $\Sigma$-protocol, we present two representative application scenarios to demonstrate the practicality of our construction.
Constant time lattice reduction in dimension 4 with application to SQIsign
In this paper we propose a constant-time lattice reduction algorithm for integral dimension-4 lattices. Motivated by its application in the SQIsign post-quantum signature scheme, we provide for the first time a constant-time LLL-like algorithm with guarantees on the length of the shortest output vector. We implemented our algorithm and ensured through various tools that it indeed operates in constant time. Our experiments suggest that in practice our implementation outputs a Minkowski-reduced basis and can thus replace a non-constant-time lattice reduction subroutine in SQIsign.
How to use your brain for cryptography without trustworthy machines
In this work, we study cryptosystems that can be executed securely without fully trusting all machines, but only trusting the user's brain. This paper focuses on signature schemes. We first introduce a new concept called ``server-aided in-brain signature,'' a cryptographic protocol between a human brain and multiple servers that signs a message securely even if the user's device and the servers are not completely trusted. Second, we propose a concrete scheme that is secure against mobile hackers in servers and malware infecting the user's devices.
Chosen-Ciphertext Security for Inner Product FE: Multi-Client and Multi-Input, Generically
Functional Encryption is a powerful cryptographic primitive that allows for fine-grained access control over encrypted data. In the multi-user setting, especially Multi-Client and Multi-Input, a plethora of works has studied concrete function classes, improved security, and more. However, CCA security for such schemes remains an open problem: the only known works address Public-Key Single-Client FE ($\mathit{e.g.}$ Benhamouda, Bourse, and Lipmaa, PKC'17).
This work provides the first generic constructions of CCA-secure Multi-Client FE for Inner Products and Multi-Input FE for Inner Products, with instantiations from $\mathsf{SXDH}$ and $\mathsf{DLIN}$ for the former, and $\mathsf{DDH}$ or $\mathsf{DCR}$ for the latter. Surprisingly, in the case of secret-key MIFE we attain the same efficiency as in the public-key single-client setting. In the MCFE setting, we develop a toolkit of CCA-bootstrapping techniques to achieve CCA security in the $\mathit{secret~key}$ setting.
Quantum-resistant secret handshakes with dynamic joining, leaving, and banishment: GCD revisited
Secret handshakes, introduced by Balfanz et al. [3], allow users associated with various groups to determine if they share a common affiliation. These protocols ensure crucial properties such as fairness (all participants learn the result simultaneously), affiliation privacy (failed handshakes reveal no affiliation information), and result-hiding (even participants within a shared group cannot infer outcomes of unrelated handshakes). Over time, various secret-handshake schemes have been proposed, with a notable advancement being the modular framework by Tsudik and Xu. Their approach integrates three key components: group signature schemes, centralized secure channels for each group, and decentralized group key-agreement protocols.
Building upon this modularity, we propose significant updates. By addressing hidden complexities and revising the security model, we enhance both the efficiency and the privacy guarantees of the protocol. Specifically, we achieve the novel property of Self distinction (the ability to distinguish between two users in a session without revealing their identities) by replacing the group signature primitive with a new construct, the List MAC. This primitive is inherently untraceable, necessitating adjustments to the original syntax to support stronger privacy guarantees. Consequently, we introduce the Traitor Catching paradigm, where the transcript of a handshake reveals only the identity of a traitor, preserving the anonymity of all other participants.
To showcase the flexibility and robustness of our updated framework, we present two post-quantum instantiations (a hash-based one and another based on lattices). Our approach not only corrects prior limitations but also establishes a new benchmark for privacy and security in secret handshakes.
Cryptography is Rocket Science: Analysis of BPSec
Space networking has become an increasing area of development with the advent of commercial satellite networks such as those hosted by Starlink and Kuiper, and increased satellite and space presence by governments around the world. Yet, historically such network designs have not been made public, leading to limited formal cryptographic analysis of the security offered by them. One of the few public protocols used in space networking is the Bundle Protocol, which is secured by Bundle Protocol Security (BPSec), an Internet Engineering Task Force (IETF) standard. We undertake a first analysis of BPSec under its default security context, building a model of the secure channel security goals stated in the IETF standard, and note issues therein with message loss detection. We prove BPSec secure, and also provide a stronger construction, one that supports the Bundle Protocol's functionality goals while also ensuring destination awareness of missing message components.
Leveled Functional Bootstrapping via External Product Tree
Multi-input and large-precision lookup table (LUT) evaluation pose significant challenges in Fully Homomorphic Encryption (FHE). Currently, two modes are employed to address this issue. The first is tree-based functional bootstrapping (TFBS), which uses multiple blind rotations to construct a tree for LUT evaluation. The second is circuit bootstrapping, which decomposes all inputs into bits and utilizes a CMux tree for LUT evaluation. In this work, we propose a novel mode, leveled functional bootstrapping (LFBS), which utilizes an external product tree to perform multi-input functional bootstrapping. We implement TFBS and LFBS within the OpenFHE library. The results demonstrate that our method outperforms TFBS in both computational efficiency and bootstrapping key size. Specifically, for 12-bit and 16-bit input LUTs, our method is approximately two to three orders of magnitude faster than TFBS.
Finally, we introduce a novel scheme-switching framework that supports large-precision polynomial and non-polynomial evaluations. The framework leverages digital extraction techniques to enable seamless switching between the BFV and LFBS schemes.
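To make the selection-tree idea behind such LUT evaluation concrete, here is a minimal plaintext analogue (an illustrative sketch, not the paper's encrypted construction): a LUT on $k$ input bits is evaluated by $k$ levels of 2-to-1 selections, which is the role the CMux or external-product tree plays under encryption.

```python
def mux_tree(bits, table):
    """Plaintext analogue of a selection (CMux-style) tree for LUT evaluation.

    One tree level per input bit (LSB first): each level halves the table by
    selecting the entries matching that bit. Under FHE, each selection is
    performed on encrypted data instead of plaintext indexing.
    """
    layer = list(table)
    for b in bits:                                   # consume bits LSB first
        layer = [layer[2 * i + b] for i in range(len(layer) // 2)]
    return layer[0]

# Evaluate the identity LUT on the 3-bit input x = 0b101 = 5 (bits LSB first).
assert mux_tree([1, 0, 1], list(range(8))) == 5
```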
Efficient Authentication Protocols from the Restricted Syndrome Decoding Problem
In this paper, we introduce an oracle version of the Restricted Syndrome Decoding Problem (RSDP) and propose novel authentication protocols based on the hardness of this problem. They follow the basic structure of the HB-family of authentication protocols and later improvements but demonstrate several advantages.
An appropriate choice of multiplicative subgroup and ring structure gives rise to a very efficient hardware implementation compared to other \emph{Learning Parity with Noise} based approaches. In addition, the new protocols have smaller key sizes, lower communication costs, and potentially better completeness/soundness compared to learning-based alternatives. This is appealing in the context of low-cost, low-power authenticating devices such as radio frequency identification (RFID) systems.
Lastly, we show that with additional assumptions, RSDP can be used to instantiate a Man-in-the-Middle-secure authentication protocol.
ProbeShooter: A New Practical Approach for Probe Aiming
Electromagnetic side-channel analysis is a powerful method for monitoring processor activity and compromising cryptographic systems in air-gapped environments. As analytical methodologies and target devices evolve, the importance of leakage localization and probe aiming becomes increasingly apparent for capturing only the desired signals with a high signal-to-noise ratio. Despite its importance, there remains substantial reliance on unreliable heuristic approaches and inefficient exhaustive searches. Furthermore, related studies often fall short in terms of feasibility, practicality, and performance, and are limited to controlled DUTs and low-end MCUs.
To address the limitations and inefficiencies of the previous approaches, we propose a novel methodology, ${\rm P{\tiny ROBE}S{\tiny HOOTER}}$, for leakage localization and probe aiming. This approach leverages new insights into the spatial characteristics of amplitude modulation and intermodulation distortion in processors. As a result, ${\rm P{\tiny ROBE}S{\tiny HOOTER}}$ provides substantial improvements in various aspects: $\boldsymbol 1)$ it is applicable not only to simple MCUs but also to complex SoCs, $\boldsymbol 2)$ it effectively handles multi-core systems and dynamic frequency scaling, $\boldsymbol 3)$ it is applicable to uncontrollable DUTs, making it viable for constrained real-world attacks, and $\boldsymbol 4)$ it performs significantly faster than previous methods. To demonstrate this, we experimentally evaluate ${\rm P{\tiny ROBE}S{\tiny HOOTER}}$ on a high-end MCU (the NXP i.MX RT1061, featuring a single ARM Cortex-M7 core) and a complex SoC (the Broadcom BCM2711 in the Raspberry Pi 4 Model B, featuring four ARM Cortex-A72 cores).
Foundations of Platform-Assisted Auctions
Today, many auctions are carried out with the help of intermediary platforms like Google and eBay. These platforms serve as a rendezvous point for buyers and sellers, and charge a fee for their service. We refer to such auctions as platform-assisted auctions. Traditionally, the auction theory literature has mainly focused on designing auctions that incentivize buyers to bid truthfully, assuming that the platform always faithfully implements the auction. In practice, however, platforms have been found to manipulate auctions to earn more profit, resulting in high-profile anti-trust lawsuits.
We propose a new model for studying platform-assisted auctions in the permissionless setting, where anyone can register and participate in the auction. We explore whether it is possible to design a dream auction in this new model, such that honest behavior is the utility-maximizing strategy for each individual buyer, the platform, the seller, as well as platform-seller or platform-buyer coalitions. Through a collection of feasibility and infeasibility results, we carefully characterize the mathematical landscape of platform-assisted auctions.
Interestingly, our work reveals exciting connections between cryptography and mechanism design. We show how cryptography can aid the design of an efficient platform-assisted auction with dream properties. Although a line of works has also used multi-party computation (MPC) or blockchains to remove the reliance on a trusted auctioneer, our work is distinct in several dimensions. First, we initiate a systematic exploration of the game-theoretic implications when the service providers (e.g., nodes that implement the MPC or blockchain protocol) are strategic and can collude with sellers or buyers. Second, we observe that the full simulation paradigm used in the standard MPC literature is too stringent and leads to high asymptotic costs. Specifically, because every player has a different private outcome in an auction protocol, to the best of our knowledge, running any generic MPC protocol among the players would incur at least $n^2$ total cost, where $n$ is the number of buyers. We propose a new notion of simulation called {\it utility-dominated emulation} that is sufficient for guaranteeing the game-theoretic properties needed in an auction. Under this new notion of simulation, we show how to design auction protocols with quasilinear efficiency, which gives an $n$-fold improvement over any generic approach.
On the Independence Assumption in Quasi-Cyclic Code-Based Cryptography
Cryptography based on the presumed hardness of decoding codes -- i.e., code-based cryptography -- has recently seen increased interest due to its plausible security against quantum attackers. Notably, of the four proposals for the NIST post-quantum standardization process that were advanced to their fourth round for further review, two were code-based. The most efficient proposals -- including HQC and BIKE, the NIST submissions alluded to above -- in fact rely on the presumed hardness of decoding structured codes. Of particular relevance to our work, HQC is based on quasi-cyclic codes, which are codes generated by matrices consisting of two cyclic blocks.
In particular, the security analysis of HQC requires a precise understanding of the Decryption Failure Rate (DFR), whose analysis relies on the following heuristic: given random "sparse" vectors $e_1,e_2$ (say, each coordinate is i.i.d. Bernoulli) multiplied by fixed "sparse" quasi-cyclic matrices $A_1,A_2$, the weight of the resulting vector $e_1A_1+e_2A_2$ is very concentrated around its expectation. In the documentation, the authors model the distribution of $e_1A_1+e_2A_2$ as a vector with independent coordinates (and correct marginal distribution). However, we uncover cases where this modeling fails. While this does not invalidate the (empirically verified) heuristic that the weight of $e_1A_1+e_2A_2$ is concentrated, it does suggest that the behavior of the noise is a bit more subtle than previously predicted. Lastly, we also discuss implications of our result for potential worst-case to average-case reductions for quasi-cyclic codes.
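The weight-concentration heuristic is easy to probe empirically in plaintext. The following toy sketch (an illustration with far smaller parameters than HQC) samples the distribution of $\mathrm{wt}(e_1A_1+e_2A_2)$ over $\mathbb{F}_2$, using the fact that multiplication by a circulant matrix is a cyclic convolution:

```python
import numpy as np

n, w = 1024, 15                      # toy block length and sparsity
rng = np.random.default_rng(1)

def sparse(weight):                  # random binary vector of Hamming weight w
    v = np.zeros(n, dtype=np.int64)
    v[rng.choice(n, weight, replace=False)] = 1
    return v

def cyc_mul(a, b):                   # vector times circulant = cyclic convolution
    c = np.convolve(a, b)
    c[: n - 1] += c[n:]              # fold the overflow back (mod x^n - 1)
    return c[:n] % 2

a1, a2 = sparse(w), sparse(w)        # first rows of the fixed sparse circulants
weights = []
for _ in range(200):
    e1 = (rng.random(n) < w / n).astype(np.int64)   # i.i.d. Bernoulli(w/n)
    e2 = (rng.random(n) < w / n).astype(np.int64)
    weights.append(int(((cyc_mul(e1, a1) + cyc_mul(e2, a2)) % 2).sum()))
print(f"mean weight {np.mean(weights):.1f}, std {np.std(weights):.1f} (n={n})")
```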
New Quantum Cryptanalysis of Binary Elliptic Curves (Extended Version)
This paper improves upon the quantum circuits required for Shor's attack on binary elliptic curves. We present two types of quantum point addition, taking both qubit count and circuit depth into consideration.
In summary, we propose an in-place point addition that improves upon the work of Banegas et al. from CHES'21, reducing the qubit count–depth product by $73\%$–$81\%$ depending on the variant. Furthermore, we develop an out-of-place point addition by using additional qubits. This method achieves the lowest circuit depth and offers an improvement of over $92\%$ in the qubit count–depth product (for a single step).
To the best of our knowledge, our work improves upon all previous works (including the CHES'21 paper by Banegas et al., the IEEE Access'22 paper by Putranto et al., and the CT-RSA'23 paper by Taguchi and Takayasu) in terms of circuit depth and qubit count–depth product.
Equipped with these implementations, we discuss the post-quantum security of binary elliptic curve cryptography. Under the MAXDEPTH metric (proposed by NIST), the deepest quantum circuit in our work has depth $2^{24}$, significantly below the MAXDEPTH limit of $2^{40}$. For the gate count–full depth product, a metric for estimating quantum attack cost (also proposed by NIST), the highest complexity in our work is $2^{60}$, for the curve of degree 571 (comparable to AES-256 in terms of classical security), considerably below the post-quantum security level 1 threshold (on the order of $2^{156}$).
Dynamically Available Common Subset
Internet-scale consensus protocols used by blockchains are designed to remain operational in the presence of unexpected temporary crash faults (the so-called sleepy model of consensus) -- a critical feature for the latency-sensitive financial applications running on these systems.
However, their leader-based architecture, where a single block proposer is responsible for creating the block at each height, makes them vulnerable to short-term censorship attacks, in which the proposers profit at the application layer by excluding certain transactions.
In this work, we introduce an atomic broadcast protocol, secure in the sleepy model, that ensures the inclusion of all transactions within a constant expected time, provided that at least one participating node is non-censoring at all times.
Unlike traditional approaches, our protocol avoids designating a single proposer per block height, instead leveraging a so-called dynamically available common subset (DACS) protocol -- the first of its kind in the sleepy model. Additionally, our construction guarantees deterministic synchronization -- once an honest node confirms a block, all other honest nodes do so within a constant time, thus addressing a shortcoming of many low-latency sleepy protocols.
A New Method for Solving Discrete Logarithm Based on Index Calculus
The Index Calculus (IC) algorithm is the most effective probabilistic algorithm for solving discrete logarithms over prime finite fields, and it has been widely applied to cryptosystems based on elliptic curves. Since the IC algorithm was proposed in 1920, research on it has never stopped; both the algorithm itself and its applications have been greatly developed, especially for discrete logarithms over prime finite fields. There has also been some research on elliptic curves, but with little success. Regarding the IC algorithm, scholars have paid particular attention to improving the probability of solving instances and reducing the computational time complexity. The IICA was the first to study the optimization of the IC algorithm by means of rounding; however, it only considers rounding up and fails to consider rounding down. We find that the rounding direction in the IICA can be either up or down, but the notion of modular multiplication is needed when rounding down. After optimizing the IICA, the probability of successfully solving a discrete logarithm increases by a factor of nearly 2, and the number of transformations is also reduced to a certain extent, thus lowering the time complexity of the solution. The re-optimized IC algorithm greatly improves the probability of a successful IC solution. This result poses a serious challenge to cryptosystems based on prime finite fields.
SPY-PMU: Side-Channel Profiling of Your Performance Monitoring Unit to Leak Remote User Activity
The Performance Monitoring Unit (PMU), a standard feature in all modern computing systems, presents significant security risks by leaking sensitive user activities through microarchitectural event data. This work demonstrates the feasibility of remote side-channel attacks leveraging PMU data, revealing vulnerabilities that compromise user privacy and enable covert surveillance without physical access to the target machine. By analyzing the PMU feature space, we create distinct microarchitectural fingerprints for benchmark applications, which are then utilized in machine learning (ML) models to detect the corresponding benchmarks. This approach allows us to build a pre-trained model for benchmark detection using the unique microarchitectural fingerprints derived from PMU data. Subsequently, in our proof-of-concept demonstration, when an attacker remotely accesses the victim's PMU data, the pre-trained model identifies the applications used by the victim with high accuracy, showcasing the potential for malicious exploitation of PMU data. We analyze stress-ng benchmarks and build our classifiers using logistic regression, decision tree, k-nearest neighbors, and random forest ML models. Our proposed models achieve an average prediction accuracy of 98%, underscoring the potential risks associated with remote side-channel analysis using PMU data. This work highlights the urgent need for robust countermeasures to protect against such vulnerabilities and provides a foundation for future research in microarchitectural security.
Wave Hello to Privacy: Efficient Mixed-Mode MPC using Wavelet Transforms
This paper introduces new protocols for secure multiparty computation (MPC) leveraging Discrete Wavelet Transforms (DWTs) for computing nonlinear functions over large domains. By employing DWTs, the protocols significantly reduce the overhead typically associated with Lookup Table-style (LUT) evaluations in MPC. We state and prove foundational results for DWT-compressed LUTs in MPC, present protocols for 9 of the most common activation functions used in ML, and experimentally evaluate the performance of our protocols for large domain sizes in the LAN and WAN settings. Our protocols are extremely fast -- for instance, when considering 64-bit inputs, computing 1000 parallel instances of the sigmoid function with an error less than $2^{-24}$ takes only a few hundred milliseconds and incurs just 29\,KiB of online communication (40 bytes per evaluation).
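As a plaintext illustration of why DWT compression of LUTs helps (a toy sketch, independent of the paper's protocols and parameters): a sigmoid lookup table is highly compressible in a Haar wavelet basis, so far fewer coefficients than table entries carry almost all of the information. Varying `keep` below trades approximation error against compressed size.

```python
import numpy as np

N = 2 ** 12                                   # toy LUT over a 12-bit domain
x = np.linspace(-16, 16, N)
lut = 1.0 / (1.0 + np.exp(-x))                # sigmoid lookup table

def haar(v):                                  # full orthonormal Haar DWT
    v = v.copy(); out = []
    while len(v) > 1:
        a = (v[0::2] + v[1::2]) / np.sqrt(2)  # averages
        d = (v[0::2] - v[1::2]) / np.sqrt(2)  # details
        out.append(d); v = a
    out.append(v)                             # final approximation coefficient
    return np.concatenate(out[::-1])

def ihaar(c):                                 # inverse Haar transform
    v = c[:1]; pos = 1
    while pos < len(c):
        d = c[pos : pos + len(v)]; pos += len(v)
        nxt = np.empty(2 * len(v))
        nxt[0::2] = (v + d) / np.sqrt(2)
        nxt[1::2] = (v - d) / np.sqrt(2)
        v = nxt
    return v

c = haar(lut)
keep = 64                                     # keep only the largest coefficients
thr = np.sort(np.abs(c))[-keep]
c[np.abs(c) < thr] = 0.0
print("max LUT error with", keep, "coefficients:", np.max(np.abs(ihaar(c) - lut)))
```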
Leuvenshtein: Efficient FHE-based Edit Distance Computation with Single Bootstrap per Cell
This paper presents a novel approach to calculating the Levenshtein (edit) distance within the framework of Fully Homomorphic Encryption (FHE), specifically targeting third-generation schemes like TFHE. Edit distance computations are essential in applications across finance and genomics, such as DNA sequence alignment. We introduce an optimised algorithm, called Leuvenshtein, that significantly reduces the cost of edit distance calculations. This algorithm reduces the number of programmable bootstraps (PBS) needed per cell of the calculation, lowering it from approximately 28 operations (as required by the conventional Wagner-Fischer algorithm) to just 1. Additionally, we propose an efficient method for performing equality checks on characters, reducing ASCII character comparisons to only 2 PBS operations. Finally, we explore the potential for further performance improvements by utilizing preprocessing when one of the input strings is unencrypted. Leuvenshtein achieves up to $205\times$ faster performance compared to the best available TFHE implementation and up to $39\times$ faster than an optimised implementation of the Wagner-Fischer algorithm. Moreover, when offline preprocessing is possible due to the presence of one unencrypted input on the server side, an additional $3\times$ speedup can be achieved.
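For reference, this is the plaintext Wagner-Fischer recurrence that such FHE protocols must evaluate cell by cell (the standard textbook algorithm, not the paper's optimised variant): each cell performs one character-equality test and a 3-way minimum, which are exactly the operations that become expensive bootstraps under TFHE.

```python
def wagner_fischer(s: str, t: str) -> int:
    """Plaintext Wagner-Fischer dynamic program for the Levenshtein distance."""
    m, n = len(s), len(t)
    prev = list(range(n + 1))                     # row i-1 of the DP table
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1   # equality check
            cur[j] = min(prev[j] + 1,                 # deletion
                         cur[j - 1] + 1,              # insertion
                         prev[j - 1] + cost)          # substitution
        prev = cur
    return prev[n]

assert wagner_fischer("kitten", "sitting") == 3
```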
DL-SCADS: Deep Learning-Based Post-Silicon Side-Channel Analysis Using Decomposed Signal
Side-channel analysis (SCA) targets not an algorithm's weaknesses but rather those of its implementation. The rise of machine learning (ML) and deep learning (DL) is giving adversaries advanced capabilities to perform stealthy attacks. In this paper, we propose DL-SCADS, a DL-based approach combined with signal decomposition techniques to extract secret keys from post-silicon EM/power side-channel traces. We integrate the previously proven effective ideas of model ensembling and automated hyperparameter tuning with signal decomposition to develop an efficient and robust side-channel attack. Extensive experiments are performed on Advanced Encryption Standard (AES) and Post-Quantum Cryptography (PQC) implementations to demonstrate the efficacy of our approach. We also tabulate the performance of side-channel attacks employing various decomposition techniques and compare them with the proposed approach across a range of datasets.
A Combinatorial Approach to IoT Data Security
This article explores the potential of Secret Sharing-Based Internet of Things (SBIoT) as a promising cryptographic element across diverse applications, including secure data storage in commercial cloud systems (Datachest), smart home environments (encompassing sensors, cameras, smart locks, and smart assistants), and e-health applications (protecting patient data and medical records). Beyond these applications, the paper makes two key contributions: the introduction of a novel cheater identification algorithm designed to verify the correct submission of shares during secret reconstruction, and empirical validation through experimental studies to support the theoretical advancements. This multifaceted approach not only demonstrates the versatility of SBIoT but also proposes innovative mechanisms to enhance security and integrity, contributing to the development of a more robust cryptographic framework.
This article expands upon the work presented in the poster "A Combinatorial Approach to IoT Data Security" at IWSEC 2023, Yokohama, Japan.
Efficient CPA Attack on Hardware Implementation of ML-DSA in Post-Quantum Root of Trust
Side-channel attacks (SCA) pose a significant threat to cryptographic implementations, including those designed to withstand the computational power of quantum computers.
This paper introduces the first side-channel attack on an industry-grade post-quantum cryptography implementation.
Specifically, we present a Correlation Power Analysis (CPA) attack targeting the open-source hardware implementation of ML-DSA within a Silicon Root of Trust framework developed through a multi-party collaboration involving leading technology companies.
Our attack focuses on the modular reduction process that follows the Number Theoretic Transform-based polynomial pointwise multiplication.
By exploiting side-channel leakage from a distinctive reduction algorithm and leveraging the zeroization mechanism used to securely erase sensitive information by clearing internal registers, we significantly enhance the attack's efficacy.
Our findings reveal that an adversary can extract the secret keys using only 10,000 power traces.
With access to these keys, an attacker could forge signatures for certificate generation, thereby compromising the integrity of the root of trust.
This work highlights the vulnerabilities of industry-standard root-of-trust systems to side-channel attacks. It underscores the urgent need for robust countermeasures to secure commercially deployed systems against such threats.
A Survey of Interactive Verifiable Computing: Utilizing Randomness in Low-Degree Polynomials
This survey provides a comprehensive examination of verifiable computing, tracing its evolution from foundational complexity theory to modern zero-knowledge succinct non-interactive arguments of knowledge (ZK-SNARKs). We explore key developments in interactive proof systems, knowledge complexity, and the application of low-degree polynomials in error detection and verification protocols. The survey delves into essential mathematical frameworks such as the Cook-Levin Theorem, the sum-check protocol, and the GKR protocol, highlighting their roles in enhancing verification efficiency and soundness. By systematically addressing the limitations of traditional NP-based proof systems and then introducing advanced interactive proof mechanisms to overcome them, this work offers an accessible step-by-step introduction for newcomers while providing detailed mathematical analyses for researchers. Ultimately, we synthesize these concepts to elucidate the GKR protocol, which serves as a foundation for contemporary verifiable computing models. This survey not only reviews the historical and theoretical advancements in verifiable computing over the past three decades but also lays the groundwork for understanding recent innovations in the field.
Non Linearizable Entropic Operator
In [Pan21] a linearization attack is proposed in order to break the cryptosystem proposed in [Gli21]. We propose here a non-linearizable operator that disables this attack, as this operator does not give rise to a quasigroup and does not obey the Latin square property.
Nearly Quadratic Asynchronous Distributed Key Generation
We prove that for any $1\le k\le \log n$, given a VRF setup and assuming secure erasures, there exists a protocol for Asynchronous Distributed Key Generation (ADKG) that is resilient to a strongly adaptive adversary that can corrupt up to $f<n/3$ parties. With all but negligible probability, all nonfaulty parties terminate in an expected $O(k)$ rounds and send a total expected $\tilde{O}(n^{2+1/k})$ messages.
What is "legal" and "illegal?": Social Norms, Current Practices and Perceived Risks among the Cryptocurrency Users in Bangladesh
Cryptocurrency practices worldwide are seen as innovative, yet they navigate a fragmented regulatory environment. Many local authorities aim to balance promoting innovation, safeguarding consumers, and managing potential threats. In particular, it is unclear how people deal with cryptocurrencies in regions where trading or mining is prohibited. This insight is crucial for informing risk-reduction strategies. To address this, we conducted semi-structured interviews with 28 cryptocurrency traders and miners from Bangladesh, where the local authority is hostile towards cryptocurrencies. Our research revealed that the participants use unique strategies to mitigate risks around cryptocurrencies. Our findings indicate a prevalent uncertainty at both personal and organizational levels concerning the interpretation of laws, a situation worsened by the actions of major financial service providers, who indirectly facilitate cryptocurrency transactions. We further connect our findings to broader issues in HCI regarding folk models, informal markets and legality, and education and awareness.
Smaug: Modular Augmentation of LLVM for MPC
Secure multi-party computation (MPC) is a crucial tool for privacy-preserving computation, but it is becoming increasingly complicated due to recent advancements and optimizations. Programming tools for MPC allow programmers to develop MPC applications without mastering all of the underlying cryptography. However, most existing MPC programming tools fail to attract real users due to their lack of documentation and maintenance and their inability to compose with legacy codebases. In this work, we build Smaug, a modular extension of LLVM. Smaug seamlessly brings all LLVM support to MPC programmers, including error messaging, documentation, code optimization, and frontend support for compiling various languages to the LLVM intermediate representation (IR). Smaug efficiently converts non-oblivious LLVM IR to its oblivious counterpart while applying popular optimizations as LLVM code transformations. With benchmarks written in C++ and Rust and backends for the Yao and GMW protocols, we observe that Smaug performs as well as (and sometimes much better than) prior tools that use domain-specific languages with similar backends. Finally, we use Smaug to compile open-source projects that implement Minesweeper and Blackjack, producing usable two-party games with ease.
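The heart of the non-oblivious-to-oblivious conversion mentioned above is standard if-conversion: secret-dependent control flow is replaced by data-flow multiplexers. The Python fragment below illustrates the idea only; it is not Smaug's actual LLVM pass, and the payout function is a made-up example.

# A secret-dependent branch (non-oblivious: control flow leaks the bit)...
def payout_branching(player_wins: int, stake: int) -> int:
    if player_wins:
        return 2 * stake
    return 0

# ...rewritten as a data-oblivious multiplexer, evaluable under MPC:
# both arms are computed, and the secret bit only selects the result.
def payout_oblivious(player_wins: int, stake: int) -> int:
    return player_wins * (2 * stake) + (1 - player_wins) * 0

assert payout_branching(1, 10) == payout_oblivious(1, 10) == 20
assert payout_branching(0, 10) == payout_oblivious(0, 10) == 0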
Post-Quantum DNSSEC with Faster TCP Fallbacks
In classical DNSSEC, a drop-in replacement with quantum-safe cryptography would increase DNS query resolution times by $\textit{at least}$ a factor of $2\times$. Since a DNS response containing large post-quantum signatures is likely to get marked truncated ($\texttt{TC}$) by a nameserver (resulting in a wasted UDP round-trip), the client (here, the resolver) would have to retry its query over TCP, further incurring a $\textit{minimum}$ of two round-trips due to the three-way TCP handshake.
We present $\mathsf{TurboDNS}$: a backward-compatible protocol that eliminates $\textit{two}$ round-trips from the preceding flow by 1) sending TCP handshake data in the initial DNS/UDP flight itself, and 2) immediately streaming the DNS response over TCP after authenticating the client with a cryptographic cookie. Our experiments show that DNSSEC over $\mathsf{TurboDNS}$, with either Falcon-512 or Dilithium-2 as the zone signing algorithm, is practically as fast as the currently deployed ECDSA P-256 and RSA-2048 setups in resolving $\texttt{QTYPE}$ $\texttt{A}$ DNS queries.
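A back-of-the-envelope reading of this round-trip accounting, assuming the TCP query rides on the final ACK of the handshake:
\[
\underbrace{1}_{\text{UDP query} \,\to\, \texttt{TC}} \;+\; \underbrace{1}_{\text{SYN} \,\to\, \text{SYN-ACK}} \;+\; \underbrace{1}_{\text{TCP query} \,\to\, \text{response}} \;=\; 3 \text{ RTTs},
\]
of which $\mathsf{TurboDNS}$ removes two, leaving a single round trip, on par with the classical UDP-only baseline.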
Voting with coercion resistance and everlasting privacy using linkable ring signatures
We propose an e-voting protocol based on a novel linkable ring signature scheme with unconditional anonymity. In our system, all voters create private credentials and register their public counterparts. To vote, they create a ring (anonymity set) consisting of public credentials together with a proof of knowledge of their secret credential via our signature. Its unconditional anonymity prevents an attacker, no matter how powerful, from deducing the identity of the voter, thus attaining everlasting privacy. Additionally, our protocol provides coercion resistance in the JCJ framework: when an adversary tries to coerce a voter, the voter can evade the attack by producing a signature with a fake but indistinguishable credential, and then cast their real vote during a moment of privacy. Our scheme also provides verifiability and ballot secrecy.
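To illustrate the role of linkability at tally time, here is a deliberately insecure Python toy: the link tag is a plain hash, the registry is public, the one-ballot-per-credential policy is an assumed simplification, and filtering happens in the clear. In the actual scheme, the tag is derived inside the ring signature, and fake credentials are indistinguishable from real ones to the coercer. All names below are hypothetical.

import hashlib

def tag(credential: bytes) -> str:
    # Toy stand-in for the linkability tag of a ring signature.
    return hashlib.sha256(b"link|" + credential).hexdigest()

def tally(ballots, registered_tags):
    # Count one ballot per registered credential; drop fakes and duplicates.
    seen, counted = set(), []
    for t, vote in ballots:
        if t in registered_tags and t not in seen:
            seen.add(t)
            counted.append(vote)
    return counted

alice, bob, fake = b"alice-sk", b"bob-sk", b"alice-fake-sk"
registered = {tag(alice), tag(bob)}
ballots = [(tag(fake), "X"),   # coerced ballot cast with a fake credential
           (tag(bob), "B"),
           (tag(alice), "A")]  # real vote, cast during a moment of privacy
print(tally(ballots, registered))  # ['B', 'A']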
Attribute Based Encryption for Turing Machines from Lattices
We provide the first attribute-based encryption (ABE) scheme for Turing machines supporting unbounded collusions from lattice assumptions. In more detail, the encryptor encodes an attribute $\mathbf{x}$ together with a bound $t$ on the machine running time and a message $m$ into the ciphertext, the key generator embeds a Turing machine $M$ into the secret key, and decryption returns $m$ if and only if $M(\mathbf{x})=1$. Crucially, the input $\mathbf{x}$ and machine $M$ can be of unbounded size, the time bound $t$ can be chosen dynamically for each input, and decryption runs in input-specific time.
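For orientation, here is a functionality-only sketch of this interface in Python. It captures the correctness condition $M(\mathbf{x})=1$ and the input-specific running time, not the lattice-based construction; setup and public parameters are omitted, and all names are illustrative.

from typing import Callable, Optional

TuringMachine = Callable[[str, int], bool]  # M(x, t): accept x within t steps?

def keygen(M: TuringMachine) -> dict:
    return {"machine": M}                     # real scheme: lattice-based key

def encrypt(x: str, t: int, m: str) -> dict:
    return {"attr": x, "bound": t, "msg": m}  # real scheme: ciphertext hides m

def decrypt(sk: dict, ct: dict) -> Optional[str]:
    M, x, t = sk["machine"], ct["attr"], ct["bound"]
    return ct["msg"] if M(x, t) else None     # runs in input-specific time

# Example machine: accept strings with an even number of 1s.
sk = keygen(lambda x, t: x.count("1") % 2 == 0)
print(decrypt(sk, encrypt("11", 2, "hello")))    # 'hello'
print(decrypt(sk, encrypt("1011", 4, "hello")))  # None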
Previously, the best known ABE for uniform computation supported only non-deterministic log-space Turing machines (${\sf NL}$) from pairings (Lin and Luo, Eurocrypt 2020). In the post-quantum regime, the state of the art supports non-deterministic finite automata from LWE in the $\textit{symmetric}$ key setting (Agrawal, Maitra and Yamada, Crypto 2019).
In more detail, our results are:
1. We construct the first ABE for ${\sf NL}$ from the LWE, evasive LWE (Wee, Eurocrypt 2022; Tsabary, Crypto 2022) and tensor LWE (Wee, Eurocrypt 2022) assumptions. This yields the first (conjectured) post-quantum ABE for ${\sf NL}$.
2. Relying on LWE, evasive LWE and a new assumption called $\textit{circular tensor}$ LWE, we construct ABE for all Turing machines. At a high level, the circular tensor LWE assumption incorporates circularity into the tensor LWE (Wee, Eurocrypt 2022) assumption.
Towards our ABE for Turing machines, we obtain the first CP-ABE for circuits of unbounded depth and size from the same assumptions -- this may be of independent interest.