VHDL Generator for A High Performance Convolutional Neural Network FPGA-Based Accelerator

被引:0
作者
Hamdan, Muhammad K. [1 ]
Rover, Diane T. [1 ]
机构
[1] Iowa State Univ Sci & Technol, Elect & Comp Engn Dept, Ames, IA 50011 USA
来源
2017 INTERNATIONAL CONFERENCE ON RECONFIGURABLE COMPUTING AND FPGAS (RECONFIG) | 2017年
关键词
VHDL generator; CNNs; AlexNet; parallelism; reconfigurable; adaptability; pipeline; scalable; FPGA; COPROCESSOR;
D O I
暂无
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Convolutional Neural Network (CNN) has been proven as a highly accurate and effective algorithm that has been used in a variety of applications such as handwriting digit recognition, visual recognition, and image classification. As a matter of fact, state-of-the-art CNNs are computationally intensive; however, their parallel and modular nature make platforms like FPGAs well suited for the acceleration process. A typical CNN takes a very long development round on FPGAs, hence in this paper, we propose a tool which allows developers, through a configurable user-interface, to automatically generate VHDL code for their desired CNN model. The generated code or architecture is modular, massively parallel, reconfigurable, scalable, fully pipelined, and adaptive to different CNN models. We demonstrate the automatic VHDL generator and its adaptability by implementing a small-scale CNN model "LeNet" and a large-scale one "AlexNet". The parameters of small scale models are automatically hard-coded as constants (part of the programmable logic) to overcome the memory bottleneck issue. On a Xilinx Virtex-7 running at 200 MHz, the system is capable of processing up to 125k images/s of size 28x28 for LeNet and achieved a peak performance of 611.52 GOP/s and 414 FPS for AlexNet.
引用
收藏
页数:6
相关论文
共 15 条
[1]  
[Anonymous], 2016, J NANOMATER
[2]   A Programmable Parallel Accelerator for Learning and Classification [J].
Cadambi, Srihari ;
Majumdar, Abhinandan ;
Becchi, Michela ;
Chakradhar, Srimat ;
Graf, Hans Peter .
PACT 2010: PROCEEDINGS OF THE NINETEENTH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, 2010, :273-283
[3]  
Chakradhar S, 2010, CONF PROC INT SYMP C, P247, DOI 10.1145/1816038.1815993
[4]  
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[5]  
DiCecco R., 2016, Caffeinated FPGAs: FPGA Framework For Convolutional Neural Networks
[6]   ImageNet Classification with Deep Convolutional Neural Networks [J].
Krizhevsky, Alex ;
Sutskever, Ilya ;
Hinton, Geoffrey E. .
COMMUNICATIONS OF THE ACM, 2017, 60 (06) :84-90
[7]  
Liu ZQ, 2016, 2016 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (FPT), P61, DOI 10.1109/FPT.2016.7929190
[8]  
Ma Y., 2016, FPL 2016 26 INT C FI
[9]  
Park J, 2016, INT CONF ACOUST SPEE, P1011, DOI 10.1109/ICASSP.2016.7471828
[10]  
Pereira J.L.F., 2012, Proceedings of the 27th Annual ACM Symposium on Applied Computing - SAC'12, P286, DOI DOI 10.1145/2245276.2245333