Optimized Hardware-Software Co-Design for Kyber and Dilithium on RISC-V SoC FPGA

Authors
Wang, Tengfei [1, 3]
Zhang, Chi [1, 3]
Zhang, Xiaolin [1, 3]
Gu, Dawu [1, 3]
Cao, Pei [2]
Affiliations
[1] School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai
[2] Viewsource (Shanghai) Technology Company Limited, Shanghai
[3] State Key Laboratory of Cryptology, Beijing
Source
IACR Transactions on Cryptographic Hardware and Embedded Systems | 2024, Vol. 2024, Issue 3
Funding
National Natural Science Foundation of China
Keywords
Dilithium; FPGA; Hardware-software co-design; Kyber; Post-quantum cryptography; RISC-V
DOI
10.46586/tches.v2024.i3.99-135
Abstract
Kyber and Dilithium are both lattice-based post-quantum cryptography (PQC) algorithms that have been selected for standardization by the U.S. National Institute of Standards and Technology (NIST). NIST recommends them as the two primary algorithms to be implemented for most use cases. As RISC-V processors move from specialized to general-purpose applications, efficient implementations of PQC algorithms on general-purpose RISC-V platforms are required. In this work, we present an optimized hardware-software co-design for Kyber and Dilithium on the industry’s first RISC-V System-on-Chip (SoC) Field Programmable Gate Array (FPGA) platform. Hardware acceleration and software optimization together improve the performance of both algorithms while preserving a degree of flexibility. Customized accelerators speed up the polynomial arithmetic operations in Kyber and Dilithium. We employ a unified high-level architecture to capture the characteristics the two algorithms share and design dedicated underlying modular multipliers to exploit their distinctive features. The hash functions are optimized using RISC-V assembly instructions, which improves performance and reduces code size without additional hardware resources. For other operations involving matrices and vectors, we present a multi-core acceleration scheme based on the multi-core RISC-V Microprocessor Sub-System (MSS). With these acceleration and optimization methods combined, experimental results show that the overall performance of Kyber and Dilithium across different security levels improves by a factor of 3 to 5, while the utilized FPGA resources account for less than 5% of the total resources provided by the platform. © 2024, Ruhr-University of Bochum. All rights reserved.
Pages: 99-135
Number of pages: 36