Design of the programmable neural network processor based on the transport triggered architecture

Cited by: 0
Authors
Zhao B. [1]
Zhang L. [1]
Shi G. [1]
Huang R. [1]
Xu X. [1]
Affiliation
[1] School of Electronic Engineering, Xidian Univ., Xi'an
Source
Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University | 2018, Vol. 45, No. 4
Keywords
Convolutional neural networks; Deep learning; Field programmable gate array; Parallel computing
DOI
10.3969/j.issn.1001-2400.2018.04.017
Abstract
Convolutional neural networks pose two difficulties: diverse network structures, and large volumes of data exchange and computation. This paper presents a convolutional neural network processor based on the transport triggered architecture. The data transport network is built from multi-channel direct memory access, multi-port memory, and a specialized pooling data path, which addresses the problem of inefficient data exchange. Experimental results show that, although the proposed architecture is 11% slower than the streamline structure, it adapts to a variety of convolutional neural networks and saves 46.5% of the multipliers. Compared with the schemes presented in other papers, excluding pipeline implementations, the design improves data throughput by at least 40%. The system also offers parallel efficiency, programmable flexibility, online architecture reconfiguration, and high processing speed. © 2018, The Editorial Board of Journal of Xidian University. All rights reserved.
Pages: 92-98
Number of pages: 6