ReMCA: A Reconfigurable Multi-Core Architecture for Full RNS Variant of BFV Homomorphic Evaluation

被引:15
作者
Su, Yang [1 ,2 ]
Yang, Bai-Long [1 ]
Yang, Chen [3 ]
Zhao, Song-Yin [2 ]
机构
[1] Rocket Force Univ Engn, Sch Operat Support, Xian 710025, Peoples R China
[2] Engn Univ Peoples Armed Police, Sch Cryptog Engn, Xian 710086, Peoples R China
[3] Xi An Jiao Tong Univ, Sch Microelect, Xian 710049, Peoples R China
关键词
Homomorphic encryption; reconfigurable PE; multi-core architecture; NTT/INTT; RNS; BFV scheme; POLYNOMIAL MULTIPLICATION; ENCRYPTION; ACCELERATOR; FV; CRYPTOGRAPHY; PROCESSOR;
D O I
10.1109/TCSI.2022.3163970
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
Fully homomorphic encryption (FHE) allows arbitrary computation on encrypted data and thus has potential in privacy-preserving computing. However, efficiency is still the bottleneck. In this paper we present an area-efficient and highly unified reconfigurable multi-core architecture (named ReMCA) for full Residue Number System (RNS) variant of Fan-Vercauteren variant of Brakerski's scheme (RNS-BFV), which employs a variable number of reconfigurable processing elements (PEs) and RNS channels. The PE unit can be flexibly configured as NTT, INTT or modular multiplier, thereby avoiding the need of other extra computational units. To reduce the computational complexity, ReMCA merges the pre/post-processing into NTT/INTT and unifies the read/write structure of NTT and INTT. Also, a conflict-free memory access pattern that doesn't need separate bit-reversal operation is proposed to optimize the memory access. Furthermore, targeting different computational requirements, a unified hardware architecture mapping model and data memory organization model are introduced, and all the computing units that RNS-BFV involved are optimized and mapped on ReMCA. ReMCA is evaluated on a Xilinx Virtex-7 FPGA platform. Running at 250MHz, it can perform 2260 homomorphic multiplication per second. When normalized to the same parameter set, the throughput and Area-Time-Products (ATPs) of ReMCA achieve 1.45x similar to 5.51x and 1.58x similar to 5.12 x improvements.
引用
收藏
页码:2857 / 2870
页数:14
相关论文
共 36 条
[11]   High-Speed Polynomial Multiplication Architecture for Ring-LWE and SHE Cryptosystems [J].
Chen, Donald Donglong ;
Mentes, Nele ;
Vercauteren, Frederik ;
Roy, Sujoy Sinha ;
Cheung, Ray C. C. ;
Pao, Derek ;
Verbauwhede, Ingrid .
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2015, 62 (01) :157-166
[12]  
Fan J., 2012, IACR Cryptol. ePrint Arch
[13]   Accelerating an FHE Integer Multiplier Using Negative Wrapped Convolution and Ping-Pong FFT [J].
Feng, Xiang ;
Li, Shuguo .
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2019, 66 (01) :121-125
[14]   Fully Homomorphic Encryption Using Ideal Lattices [J].
Gentry, Craig .
STOC'09: PROCEEDINGS OF THE 2009 ACM SYMPOSIUM ON THEORY OF COMPUTING, 2009, :169-178
[15]  
Ghadamyari M, 2019, IEEE INT CONF BIG DA, P5474, DOI 10.1109/BigData47090.2019.9006231
[16]  
Halevi Shai, 2019, Topics in Cryptology - CT-RSA 2019. The Cryptographers Track at the RSA Conference 2019. Proceedings: Lecture Notes in Computer Science (LNCS 11405), P83, DOI 10.1007/978-3-030-12612-4_5
[17]  
Hart William., 2011, Flint-fast library for number theory
[18]   High-precision division and square root [J].
Karp, AH ;
Markstein, P .
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1997, 23 (04) :561-589
[19]  
Lepoint T, 2014, LECT NOTES COMPUT SC, V8469, P318
[20]   Optimized Schoolbook Polynomial Multiplication for Compact Lattice-Based Cryptography on FPGA [J].
Liu, Weiqiang ;
Fan, Sailong ;
Khalid, Ayesha ;
Rafferty, Ciara ;
O'Neill, Maire .
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2019, 27 (10) :2459-2463