Accelerating DNNs from local to virtualized FPGA in the Cloud: A survey of trends

Cited by: 10
Authors
Wu, Chen [1 ]
Fresse, Virginie [1 ]
Suffran, Benoit [2 ]
Konik, Hubert [1 ]
Affiliations
[1] Univ Lyon, Hubert Curien Lab, Univ St Etienne, Lyon, France
[2] STMicroelectronics, F-38000 Grenoble, France
Keywords
FPGA virtualization; Cloud computing; Deep neural network; Accelerator; Trends
DOI
10.1016/j.sysarc.2021.102257
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Field-programmable gate arrays (FPGAs) are widely used locally to accelerate deep neural network (DNN) algorithms with high computational throughput and energy efficiency. Virtualizing FPGAs and deploying them in the cloud are increasingly attractive approaches to DNN acceleration because they scale computing capacity and provide on-demand acceleration to multiple users. Over the past five years, researchers have extensively investigated many directions for FPGA-based DNN accelerators, such as algorithm optimization, architecture exploration, capacity improvement, resource sharing, and cloud construction. However, previous surveys of DNN accelerators focused mainly on optimizing DNN performance on a local FPGA, overlooking the trend of deploying DNN accelerators on cloud FPGAs. In this study, we conduct an in-depth investigation of the technologies used in FPGA-based DNN accelerators, including but not limited to architectural design, optimization strategies, virtualization technologies, and cloud services. Additionally, we trace the evolution of DNN accelerators: from a single DNN to framework-generated DNNs, from physical to virtualized FPGAs, from local to cloud deployment, and from single-user to multi-tenant operation. We also identify the main obstacles to DNN acceleration in the cloud. This article enhances the current understanding of the evolution of FPGA-based DNN accelerators.
Pages: 15
References: 118 in total