Spatio-Temporal Optimization of Deep Neural Networks for Reconfigurable FPGA SoCs

Cited by: 18

Authors:

Seyoum, Biruk [1]

Pagani, Marco [1,3]

Biondi, Alessandro [1,2]

Balleri, Sara [4]

Buttazzo, Giorgio [1,2]

Affiliations:

[1] TeCIP Inst, Scuola Super St Anna, I-56124 Pisa, Italy

[2] Scuola Super Sant Anna, Dept Excellence Robot & AI, I-56124 Pisa, Italy

[3] Ctr Rech Informat Signal & Automat CRIStAL, Embedded Real Time Adaptat Syst Design & Execut, Villeneuve Dascq, France

[4] Univ Perfezionamento, Embedded Syst, Scuola Super Studi, St Anna19005, I-56127 Pisa, Italy

Source:

IEEE TRANSACTIONS ON COMPUTERS | 2021 / Vol. 70 / No. 11

Keywords:

Field programmable gate arrays; Neurons; Silicon; Synapses; Throughput; Timing; Hardware; FPGA; partial-reconfiguration; DNN acceleration; MILP optimization;

DOI:

10.1109/TC.2020.3033730

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

This article proposes a technique for optimizing the timing performance and the resource consumption of hardware accelerators for deep neural network (DNN) inference on FPGA-based systems-on-chip (SoCs). When required, the accelerators are decomposed into chunks, each making the best use of the available FPGA area, and dynamic partial reconfiguration (DPR) is leveraged to schedule these chunks at run-time. To this end, the article presents accurate models of the resource consumption and timing of DNN accelerators provided by the Xilinx FINN framework. The models are then used to formulate an optimization problem that computes the optimal decomposition of DNN accelerators (and their configuration) by minimizing the inference time while satisfying the area constraints of the FPGA. Experimental results on Zynq-7000 platforms demonstrate that the proposed technique provides consistent improvements with respect to both stock configurations of the accelerators and other configurations that can be obtained with a static FPGA allocation.
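The trade-off described in the abstract can be illustrated with a small toy model. The sketch below is not the article's MILP formulation and does not use the FINN resource or timing models. It assumes that each layer offers a few hypothetical (area, time) configurations, that a chunk is a contiguous run of layers whose summed area must fit the FPGA budget, that chunk latency is the sum of its layer times, and that every chunk swap costs a fixed `reconf_time`. The names `best_chunk_time`, `plan`, and `reconf_time`, and all numbers, are invented for illustration.

```python
from itertools import product


def best_chunk_time(options, budget):
    """Smallest summed time for one chunk whose summed area fits the budget.

    options: one list per layer of hypothetical (area, time) configurations.
    Returns None when no combination fits.
    """
    best = None
    for combo in product(*options):
        area = sum(a for a, _ in combo)
        time = sum(t for _, t in combo)
        if area <= budget and (best is None or time < best):
            best = time
    return best


def plan(layers, budget, reconf_time):
    """Return (total_time, chunks) for the best contiguous decomposition.

    A single static chunk pays no reconfiguration cost; a multi-chunk plan
    pays reconf_time for every chunk it loads (toy assumption).
    """
    n = len(layers)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[k]: cheapest way to run layers[0:k]
    cut = [0] * (n + 1)      # cut[k]: start index of the last chunk in that plan
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(end):
            t = best_chunk_time(layers[start:end], budget)
            if t is None or best[start] == inf:
                continue
            cand = best[start] + t + reconf_time
            if cand < best[end]:
                best[end], cut[end] = cand, start

    static = best_chunk_time(layers, budget)
    if static is not None and static <= best[n]:
        return static, [(0, n)]
    if best[n] == inf:
        raise ValueError("no layer configuration fits the area budget")

    chunks, end = [], n
    while end > 0:
        chunks.append((cut[end], end))
        end = cut[end]
    chunks.reverse()
    return best[n], chunks


# Three layers, each with a small/slow and a large/fast configuration.
layers = [
    [(4, 10), (8, 5)],
    [(4, 12), (8, 6)],
    [(4, 8), (8, 4)],
]
cheap_swap = plan(layers, budget=12, reconf_time=3)
costly_swap = plan(layers, budget=12, reconf_time=10)
print(cheap_swap)
print(costly_swap)
```

In this toy instance, a cheap reconfiguration makes it worthwhile to give each layer the whole area in turn, while an expensive one makes the static allocation the better choice. The article's actual optimization is formulated as a MILP over its own accelerator models.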

Citation

Pages: 1988-2000

Page count: 13
