A Scalable and Systolic Architectures of Montgomery Modular Multiplication for Public Key Cryptosystems Based on DSPs

被引:0
作者
Amine Mrabet
Nadia El-Mrabet
Ronan Lashermes
Jean-Baptiste Rigaud
Belgacem Bouallegue
Sihem Mesnager
Mohsen Machhout
机构
[1] University of Paris XIII,CNRS, UMR 7539 LAGA
[2] University of Monastir,EμE Lab
[3] National Engineering School of Tunis,undefined
[4] Ecole des Mines de St-Etienne,undefined
[5] SAS-CMP,undefined
[6] LHS-PEC TAMIS INRIA,undefined
[7] King Khalid University,undefined
[8] Télécom ParisTech,undefined
关键词
Hardware implementation; Modular multiplication; Montgomery algorithm; CIOS method; Systolic architecture; DSP48;
D O I
10.1007/s41635-017-0018-x
中图分类号
学科分类号
摘要
The arithmetic in a finite field constitutes the core of public key cryptography like RSA, ECC or pairing-based cryptography. This paper discusses an efficient hardware implementation of the Coarsely Integrated Operand Scanning (CIOS) method of Montgomery modular multiplication combined with an effective systolic architecture designed with a two-dimensional array of processing elements. The systolic architecture increases the speed of calculation by combining the concepts of pipelining and the parallel processing into a single concept. We propose the CIOS method for the Montgomery multiplication using a systolic architecture. As far as we know, this is the first implementation of such design. The proposed architectures are designed for field programmable gate array platforms. They targeted to reduce the number of clock cycles of the modular multiplication. The presented implementation results of the CIOS algorithms focus on different security levels useful in cryptography. This architecture has been designed in order to use the flexible DSP48 on Xilinx Field-Programmable Gate Array’s. Our architecture is scalable and depends only on the number and size of words. For instance, we provide results of implementation for 8-, 16-, 32- and 64-bit-long words in 33, 66, 132 and 264 clock cycles. We highlight the fact that for a given number of word, the number of clock cycles is constant. We propose a general version of our systolic architecture presented in SPACE2016.
引用
收藏
页码:219 / 236
页数:17
相关论文
共 23 条
[1]  
Hariri A(2009)Bit-serial and bit-parallel Montgomery multiplication and squaring over GF IEEE Trans Comput 58 1332-1345
[2]  
Reyhani-Masoleh A(2011)New hardware architectures for Montgomery modular multiplication algorithm IEEE Trans Comput 60 923-936
[3]  
Huang M(2004)A one round protocol for tripartite Diffie-Hellman J Cryptol 17 263-276
[4]  
Gaj K(1996)Analyzing and comparing Montgomery multiplication algorithms IEEE Micro 16 26-33
[5]  
El-Ghazawi T(1987)Elliptic curve cryptosystems Math Comput 48 203-209
[6]  
Joux A(2010)Montgomery and RNS for RSA hardware implementation Cpmput Inform 29 849-880
[7]  
Koç CK(1982)Why systolic architectures? Computer 15 37-46
[8]  
Acar T(1985)Modular multiplication without trial division Math Comput 44 519-521
[9]  
Kaliski BSJr(2011)Montgomery modular multiplication on reconfigurable hardware: systolic versus multiplexed implementation Int J Reconfig Comput 2011 61-610
[10]  
Koblitz N(1978)A method for obtaining digital signatures and public-key cryptosystems Commun ACM 21 120-126