Design of FPGA Based Convolutional Neural Network Co-Processor

Cited by: 0
Authors
Yang Y. [1]
Zhang G. [1]
Liang F. [1]
He P. [1]
Wu B. [1]
Gao Z. [1]
Affiliation
[1] School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an
Source
Hsi-An Chiao Tung Ta Hsueh/Journal of Xi'an Jiaotong University | 2018, Vol. 52, No. 07
Keywords
Convolutional neural network; Deep learning; Programmable logic device
DOI
10.7652/xjtuxb201807022
CLC number
Subject classification code
Abstract
In the era of big data, the demand of deep, large-scale deep learning network models for computing resources and memory bandwidth is increasing exponentially. The traditional industry solution, CPU+GPU, is not well suited to the prevalent scenarios of mobile embedded applications. To deal with this problem, we propose a convolutional neural network co-processor based on an FPGA programmable logic device. The solution focuses on high compatibility: it is programmable and supports a variety of network models for hardware acceleration. It is also scalable, allowing multi-core expansion within the limits of the hardware resources to achieve double performance. The design of the convolution operation module focuses on hardware parallelism and data reusability, which improves the utilization of hardware resources and the computing efficiency. A rationally configured multi-level buffer structure reduces the co-processor's demands on external memory read/write frequency and bandwidth, and improves the internal communication efficiency of the module. Experimental results on the Xilinx VC707 evaluation board show that the accuracy on the test set is 99%, the accuracy on CIFAR-10 reaches 80%, and the peak computing capability is 5.511×10¹⁰ s⁻¹; the overall performance is approximately twice that of the general-purpose Intel Xeon E5-2640 V4 server processor. Moreover, the processing performance of our design reaches the current mainstream level of FPGA solutions. © 2018, Editorial Office of Journal of Xi'an Jiaotong University. All rights reserved.
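The abstract does not describe the convolution module in detail, so the following is only a generic software sketch of the direct convolution that such a module would accelerate, not the authors' design. The function names `conv2d`, `conv_ops`, and `time_at_peak` are hypothetical, and counting one multiply and one add per kernel tap (and reading the reported peak figure as operations per second) is an assumption. The comments mark which loops are independent, and therefore candidates for hardware parallelism, and where the same data is read repeatedly, which is what data reuse and on-chip buffering exploit.

```python
def conv2d(ifmap, weights, stride=1):
    """Direct convolution without padding.

    ifmap:   [C][H][W] input feature maps
    weights: [M][C][K][K] filters
    returns: [M][H_out][W_out] output feature maps
    """
    C, H, W = len(ifmap), len(ifmap[0]), len(ifmap[0][0])
    M, K = len(weights), len(weights[0][0])
    H_out = (H - K) // stride + 1
    W_out = (W - K) // stride + 1
    ofmap = [[[0] * W_out for _ in range(H_out)] for _ in range(M)]
    # The m, y and x loops are independent of each other, so hardware
    # could compute several outputs at once.
    for m in range(M):
        for y in range(H_out):
            for x in range(W_out):
                acc = 0
                # weights[m] is reused for every (y, x) position, and
                # neighbouring positions read overlapping input pixels.
                for c in range(C):
                    for ky in range(K):
                        for kx in range(K):
                            pixel = ifmap[c][y * stride + ky][x * stride + kx]
                            acc += pixel * weights[m][c][ky][kx]
                ofmap[m][y][x] = acc
    return ofmap


def conv_ops(C, M, K, H_out, W_out):
    """Operation count for one layer, assuming 2 ops (multiply + add) per tap."""
    return 2 * C * M * K * K * H_out * W_out


def time_at_peak(ops, peak_ops_per_s=5.511e10):
    """Lower bound on layer time if the peak rate were sustained (assumption)."""
    return ops / peak_ops_per_s
```

For example, `conv2d([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]], [[[[1, 1], [1, 1]]]])` sums each 2×2 window of a single 3×3 input channel. Dividing `conv_ops(...)` by the reported peak rate gives only an idealized lower bound on execution time, since real throughput also depends on memory bandwidth.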
Pages: 153-159
Number of pages: 6