FalconSign: An Efficient and High-Throughput Hardware Architecture for Falcon Signature Generation

Cited by: 0
Authors
Ouyang, Yi [1]
Zhu, Yihong [1]
Zhu, Wenping [1]
Yang, Bohan [1]
Zhang, Zirui [1]
Wang, Hanning [1]
Tao, Qichao [1]
Zhu, Min [2]
Wei, Shaojun [1]
Liu, Leibo [1]
Affiliations
[1] Beijing National Research Center for Information Science and Technology (BNRist), School of Integrated Circuits, Tsinghua University, Beijing
[2] Wuxi Micro Innovation Integrated Circuit Design Co., Ltd, Wuxi
Source
IACR Transactions on Cryptographic Hardware and Embedded Systems | 2025, Vol. 2025, Issue 1
Funding
National Natural Science Foundation of China
Keywords
Configurable; Falcon; Fast-Fourier Sampling; Floating-point; FPGA; High-performance; Lattice; Post-quantum cryptography
DOI
10.46586/tches.v2025.i1.203-226
Abstract
Falcon is a lattice-based quantum-resistant digital signature scheme renowned for its high signature generation/verification speed and compact signature size. The scheme was selected in the third round of the NIST post-quantum cryptography (PQC) standardization process to be drafted as a standard, owing to its unique attributes and robust security features. Despite these strengths, hardware acceleration of Falcon has received little attention, primarily because of its complex computation flow and floating-point operations, and this gap hinders its widespread adoption. To address this issue, we propose FalconSign, a high-performance, configurable crypto-processor designed to accelerate Falcon signature generation on FPGA/ASIC through algorithm-hardware co-design. Our approach involves a new scheduling flow and architecture for Fast-Fourier Sampling to enhance computing unit reuse and reduce processing time. Additionally, we introduce several optimized modules, including configurable randomness generation units, parallel floating-point processing units, and an optimized SamplerZ module, to improve execution efficiency. Furthermore, this paper presents a finely optimized hardware accelerator for the Falcon scheme. Our FPGA implementation results demonstrate a throughput improvement of approximately 5.1× over state-of-the-art designs, with a 2.8×/4.5×/4.2×/3.2× lower area-time product (for LUTs/FFs/DSPs/BRAMs, respectively), for NIST security level V. On the TSMC 28 nm process, the crypto-processor occupies an area of 0.71 mm² and achieves a throughput of 5.2k OPS for NIST security level I. © 2025, Ruhr-University of Bochum. All rights reserved.
Pages: 203-226
Page count: 23