RNA: A Flexible and Efficient Accelerator Based on Dynamically Reconfigurable Computing for Multiple Convolutional Neural Networks

Cited by: 2

Authors:

Yang, Chen ^{[1]}

Hou, Jia ^{[1]}

Wang, Yizhou ^{[1]}

Zhang, Haibo ^{[1]}

Wang, Xiaoli ^{[1]}

Geng, Li ^{[1]}

Affiliations:

[1] Xi An Jiao Tong Univ, Sch Microelect, 28 Xianning West Rd, Xian 710049, Shaanxi, Peoples R China

Source:

JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS | 2022 / Volume 31 / Issue 16

Funding:

National Natural Science Foundation of China;

Keywords:

CNN; reconfigurable computing; image row broadcasting dataflow; tile-by-tile computing; zero detection technology; multi-bank RAM; dynamically adaptive data truncation; DEEP; ARCHITECTURE; PROCESSOR; TIME; UNPU;

DOI:

10.1142/S0218126622502899

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Increasingly complicated and versatile convolutional neural network (CNN) models bring challenges to hardware acceleration in terms of performance, energy efficiency and flexibility. This paper proposes a reconfigurable neural accelerator (RNA) for flexible and efficient CNN acceleration. To provide hardware flexibility, RNA employs a dynamically reconfigurable computing framework that rapidly configures the data path between processing elements (PEs) at run-time, as well as an interlaced data access mechanism for multi-bank RAM. To achieve high energy efficiency, three optimization mechanisms, image row broadcasting dataflow (IRBD), tile-by-tile computing (TTC), and zero detection technology (ZDT), are designed specifically for RNA to exploit data reuse and decrease the memory bandwidth requirement, which is key to improving performance and reducing power consumption. To reduce hardware overhead, an online dynamic adaptive data truncation (DADT) mechanism is designed to compensate for the accuracy loss of multiplication results, so that the computational precision in RNA can be reduced from 16-bit to 8-bit, which in turn reduces the area of the data path. The RNA architecture is implemented on a Xilinx XC7Z100 FPGA and works at 250 MHz. Experimental results show that RNA reaches 500 GOPS, 598 GOPS and 660 GOPS when running LeNet, AlexNet and VGG, respectively. Compared to previous FPGA-based designs, RNA achieves a 1.5x to 4.3x performance speedup and a 7.6x to 8.4x improvement in energy efficiency.

Pages: 32
