A Data-Driven Asynchronous Neural Network Accelerator

Cited by: 11
Authors
Xiao, Shanlin [1 ]
Liu, Weikun [1 ]
Lin, Junshu [1 ]
Yu, Zhiyi [1 ]
Affiliation
[1] Sun Yat Sen Univ, Sch Elect & Informat Technol, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Accelerator; asynchronous circuit; data-driven; energy-efficiency; neural network; processor
DOI
10.1109/TCAD.2020.3025508
CLC Classification Number
TP3 [Computing technology; computer technology]
Discipline Classification Code
0812
Abstract
Deep neural networks (DNNs) are revolutionizing machine learning, achieving unprecedented accuracy on many AI tasks. Energy-efficient neural acceleration is crucial to broadening DNN applications in cloud and mobile end devices. However, power-hungry clock networks limit the energy efficiency of DNN accelerators. In this work, we propose a novel DNN hardware accelerator, the asynchronous neural network processor (AsNNP). At the heart of AsNNP is a scalable, hierarchical matrix multiply unit with bit-serial processing elements working in parallel. It replaces global clock networks with asynchronous handshake protocols for synchronization and communication between components, minimizing dynamic power. Meanwhile, a fine-grain asynchronous pipeline based on weak-conditioned half-buffers (WCHBs) is introduced to pipe successive computations in a data-driven manner, i.e., computation begins as soon as data arrives, maximizing throughput. These techniques enable AsNNP to operate in a fully data-driven asynchronous communication fashion with optimized energy efficiency. The proposed accelerator is implemented in a quasi-delay-insensitive (QDI) clockless logic family and evaluated in a 65-nm process. Compared with the synchronous baseline, simulation results show that AsNNP offers 2.2x higher equivalent frequency and 1.59x lower power. Compared with state-of-the-art DNN accelerators, AsNNP shows a 1.17x-4.97x improvement in energy efficiency.
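As a purely illustrative aid (not the authors' RTL or the actual AsNNP design), the Python sketch below models the two ideas the abstract combines: a bit-serial multiply-accumulate as performed by a bit-serial processing element, and a chain of one-place buffer stages, loosely analogous to WCHB stages, in which a token advances only when its producer holds valid data and its consumer is empty, so progress is driven by data availability rather than a global clock. All names (bit_serial_mac, HalfBufferStage, run_pipeline) are hypothetical.

```python
# Behavioral sketch only: a software analogy of data-driven, handshake-style
# pipelining. It does not model QDI circuits, delays, or power.
from collections import deque


def bit_serial_mac(acc, weight, activation, width=8):
    """Bit-serial multiply-accumulate: scan the activation LSB-first and
    add the shifted weight whenever the scanned bit is 1."""
    for i in range(width):
        if (activation >> i) & 1:
            acc += weight << i
    return acc


class HalfBufferStage:
    """One-place buffer: holds at most one token, analogous to a half-buffer."""

    def __init__(self, func=lambda x: x):
        self.func = func
        self.token = None          # None means the stage is empty

    def can_accept(self):
        return self.token is None

    def accept(self, data):
        assert self.can_accept()
        self.token = self.func(data)

    def release(self):
        data, self.token = self.token, None
        return data


def run_pipeline(stages, inputs):
    """Advance tokens whenever a stage is full and its successor is empty.

    There is no clock: progress is driven purely by token availability,
    which is the data-driven property the abstract describes."""
    pending = deque(inputs)
    outputs = []
    while pending or any(s.token is not None for s in stages):
        if stages[-1].token is not None:          # drain the last stage first
            outputs.append(stages[-1].release())
        for i in range(len(stages) - 2, -1, -1):  # move tokens downstream
            if stages[i].token is not None and stages[i + 1].can_accept():
                stages[i + 1].accept(stages[i].release())
        if pending and stages[0].can_accept():    # inject new input tokens
            stages[0].accept(pending.popleft())
    return outputs


if __name__ == "__main__":
    # Toy example: push (weight, activation) pairs through a MAC stage
    # followed by a pass-through stage standing in for post-processing.
    stages = [
        HalfBufferStage(lambda wa: bit_serial_mac(0, wa[0], wa[1])),
        HalfBufferStage(),
    ]
    print(run_pipeline(stages, [(3, 5), (2, 7), (4, 1)]))  # -> [15, 14, 4]
```

The sketch captures only the control discipline: each stage holds at most one token (half-buffer capacity), and a token moves forward the moment the next stage is free, mirroring "once data arrives, computation begins."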
Pages: 1874-1886
Number of pages: 13