Design of the programmable neural network processor based on the transport triggered architecture

Cited by: 0

Authors:

Zhao B. [1]

Zhang L. [1]

Shi G. [1]

Huang R. [1]

Xu X. [1]

Affiliation:

[1] School of Electronic Engineering, Xidian Univ., Xi'an

Source:

Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University | 2018 / Vol. 45 / No. 04

Keywords:

Convolutional neural networks; Deep learning; Field programmable gate array; Parallel computing

DOI:

10.3969/j.issn.1001-2400.2018.04.017

Abstract:

Convolutional neural networks are structurally diverse and require large amounts of data exchange and computation. This paper presents a convolutional neural network processor based on the transport triggered architecture. The data transport network is built from multi-channel direct memory access channels, a multi-port memory, and a specialized pooling data path, which addresses the problem of inefficient data exchange. Experimental results show that, although the proposed architecture is 11% slower than the streamline structure, it adapts to a variety of convolutional neural networks and saves 46.5% of the multipliers. Compared with the schemes presented in other papers, excluding pipeline implementations, the design improves the data throughput rate by at least 40%. The system also offers parallel efficiency, programmable flexibility, online architecture reconfiguration, and high processing speed. © 2018, The Editorial Board of Journal of Xidian University. All rights reserved.

Citation

Pages: 92-98

Page count: 6

21 entries in total

[1] LeCun Y., Bottou L., Bengio Y., et al., Gradient-based Learning Applied to Document Recognition, Proceedings of the IEEE, 86, 11, pp. 2278-2324, (1998)
[2] Krizhevsky A., Sutskever I., Hinton G.E., ImageNet Classification with Deep Convolutional Neural Networks, Proceedings of the Advances in Neural Information Processing Systems, pp. 1097-1105, (2012)
[3] Simonyan K., Zisserman A., Very Deep Convolutional Networks for Large-scale Image Recognition, Computer Science, 41, 5, pp. 1409-1556, (2014)
[4] Sung W., Park J., Architecture Exploration of a Programmable Neural Network Processor for Embedded Systems, Proceedings of the 2016 16th International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, pp. 124-131, (2017)
[5] Suda N., Chandra V., Dasika G., et al., Throughput-optimized OpenCL-based FPGA Accelerator for Large-scale Convolutional Neural Networks, Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 16-25, (2016)
[6] Qiu J., Wang J., Yao S., et al., Going Deeper with Embedded FPGA Platform for Convolutional Neural Network, Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 26-35, (2016)
[7] Liu Z., Dou Y., Jiang J., et al., Automatic Code Generation of Convolutional Neural Networks in FPGA Implementation, Proceedings of the 2016 International Conference on Field-Programmable Technology, pp. 61-68, (2017)
[8] Zhang C., Li P., Sun G., et al., Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks, Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 161-170, (2015)
[9] Gokhale V., Jin J., Dundar A., et al., A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks, Proceedings of the 2014 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 696-701, (2014)
[10] Chen T., Du Z., Sun N., et al., DianNao: a Small-footprint High-throughput Accelerator for Ubiquitous Machine-learning, Proceedings of the 2014 International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 269-284, (2014)
