An Efficient FPGA-Based Accelerator Design for Convolution

Citations: 0
Authors
Song, Peng-Fei [1]
Pan, Jeng-Shyang [2]
Yang, Chun-Sheng [1]
Lee, Chiou-Yng [3]
Affiliations
[1] Harbin Inst Technol, Shenzhen Grad Sch, Innovat Informat Ind Res Ctr, Shenzhen, Peoples R China
[2] Fujian Univ Technol, Coll Informat Sci & Engn, Innovat Informat Ind Res Ctr, Harbin Inst Technol, Shenzhen Grad Sch, Fuzhou, Fujian, Peoples R China
[3] Lunghwa Univ Sci & Technol, Dept Comp Informat & Network Engn, Taoyuan, Taiwan
Source
2017 IEEE 8TH INTERNATIONAL CONFERENCE ON AWARENESS SCIENCE AND TECHNOLOGY (ICAST) | 2017
Keywords
Convolution; finite fields; number theoretic transform; FPGA; NUMBER TRANSFORM; MULTIPLICATION; ALGORITHM; ARCHITECTURES;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The number theoretic transform, built on modular arithmetic operations, performs convolution efficiently in a ring without round-off errors. This paper proposes a new, efficient architecture for the transform that supports various operand sizes. To balance the trade-off between area and latency, a variant of the constant-geometry architecture is used, in which the forward and backward sub-stages share the same computation pattern. In addition, an XOR-based multi-ported RAM, which allows multiple simultaneous reads and writes, is adopted to accelerate memory access. As a result, the developed accelerator achieves a lower area-latency product on FPGA than other designs.
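The claim that a number theoretic transform yields exact convolution can be illustrated with a minimal software sketch. It is not the paper's architecture: it uses a plain recursive radix-2 decimation-in-time transform rather than a constant-geometry datapath, and the modulus 257, the primitive root 3, and the function names are assumptions chosen only for illustration.

```python
# Minimal sketch of exact cyclic convolution via a number theoretic transform.
# Assumptions (not taken from the paper): modulus P = 257 (a Fermat prime),
# primitive root G = 3, and a recursive radix-2 transform.

P = 257  # prime modulus; P - 1 = 256 is a power of two
G = 3    # primitive root modulo 257


def _ntt_rec(a, w):
    """Transform list a (length a power of two) using w, a primitive len(a)-th root of unity mod P."""
    n = len(a)
    if n == 1:
        return list(a)
    w2 = w * w % P
    even = _ntt_rec(a[0::2], w2)
    odd = _ntt_rec(a[1::2], w2)
    out = [0] * n
    t = 1  # running power of w (the twiddle factor)
    for k in range(n // 2):
        u = even[k]
        v = t * odd[k] % P
        out[k] = (u + v) % P            # butterfly: sum branch
        out[k + n // 2] = (u - v) % P   # butterfly: difference branch
        t = t * w % P
    return out


def ntt(a, inverse=False):
    """Forward or inverse transform of a modulo P."""
    n = len(a)
    if n & (n - 1) != 0 or (P - 1) % n != 0:
        raise ValueError("length must be a power of two dividing P - 1")
    w = pow(G, (P - 1) // n, P)          # primitive n-th root of unity
    if inverse:
        w = pow(w, P - 2, P)             # modular inverse via Fermat's little theorem
    out = _ntt_rec([x % P for x in a], w)
    if inverse:
        n_inv = pow(n, P - 2, P)
        out = [x * n_inv % P for x in out]
    return out


def cyclic_convolution(x, y):
    """Length-n cyclic convolution of x and y, with results reduced modulo P."""
    fx, fy = ntt(x), ntt(y)
    prod = [a * b % P for a, b in zip(fx, fy)]  # pointwise product in the transform domain
    return ntt(prod, inverse=True)


if __name__ == "__main__":
    x = [1, 2, 3, 4, 0, 0, 0, 0]
    y = [5, 6, 7, 0, 0, 0, 0, 0]
    print(cyclic_convolution(x, y))  # [5, 16, 34, 52, 45, 28, 0, 0]
```

Because every operation is integer arithmetic modulo P, the result is exact as long as each true convolution value is smaller than P; zero-padding the inputs, as in the usage above, turns the cyclic convolution into a linear one.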
Pages: 494-500
Page count: 7