User Driven FPGA-Based Design Automated Framework of Deep Neural Networks for Low-Power Low-Cost Edge Computing

Cited: 13
Authors
Belabed, Tarek [1,2,3]
Coutinho, Maria Gracielly F. [4 ]
Fernandes, Marcelo A. C. [4 ]
Sakuyama, Carlos Valderrama [1 ]
Souani, Chokri [5 ]
Affiliations
[1] Univ Mons, Fac Polytech, SEMi, B-7000 Mons, Belgium
[2] Univ Sousse, Ecole Natl Ingenieurs Sousse, Sousse 4000, Tunisia
[3] Univ Monastir, Fac Sci, Lab Microelect & Instrumentat, Monastir 5019, Tunisia
[4] Univ Fed Rio Grande do Norte, Dept Comp & Automat Engn, BR-59078970 Natal, RN, Brazil
[5] Univ Sousse, Inst Super Sci Appl & Technol Sousse, Sousse 4003, Tunisia
Keywords
Field programmable gate arrays; Topology; Optimization; Hardware; Edge computing; Computer architecture; Tools; Deep learning; electronic design automation; edge computing; FPGA; low power systems; ARTIFICIAL-INTELLIGENCE; STATE;
DOI
10.1109/ACCESS.2021.3090196
CLC Number (Chinese Library Classification)
TP [Automation technology, computer technology]
Discipline Code
0812
Abstract
Deep Learning techniques have been successfully applied to solve many Artificial Intelligence (AI) application problems. However, owing to topologies with many hidden layers, Deep Neural Networks (DNNs) have high computational complexity, which makes their deployment difficult in contexts highly constrained by requirements such as performance, real-time processing, or energy efficiency. Numerous hardware/software optimization techniques using GPUs, ASICs, and reconfigurable computing (i.e., FPGAs) have been proposed in the literature. With FPGAs, very specialized architectures have been developed to provide an optimal balance between high speed and low power. However, when targeting edge computing, user requirements and hardware constraints must be met efficiently. Therefore, in this work, we focus only on reconfigurable embedded systems based on the Xilinx ZYNQ SoC and on popular DNNs that can be implemented at the embedded edge, improving performance per watt while maintaining accuracy. In this context, we propose an automated framework for the implementation of hardware-accelerated DNN architectures. This framework provides an end-to-end solution that facilitates the efficient deployment of topologies on FPGAs by combining custom hardware scalability with optimization strategies. State-of-the-art comparisons and experimental results demonstrate that the architectures developed by our framework offer the best compromise between performance, energy consumption, and system cost. For instance, the low-power (0.266 W) DNN topologies generated for the MNIST database achieved a high throughput of 3,626 FPS.
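To illustrate the kind of computation such a framework maps onto the FPGA fabric, the sketch below shows a fixed-point fully connected layer with ReLU written in plain C++, the coding style typically fed to HLS tools targeting ZYNQ devices. This is a minimal illustration under assumed choices (a Q16.16 fixed-point format and the hypothetical function name fc_relu), not the authors' generated architecture; in an actual HLS flow, pipelining and array-partitioning pragmas would be added to trade area for throughput.

// Minimal sketch (assumption, not the paper's generated design): one dense
// layer y = ReLU(W*x + b) in Q16.16 fixed point, HLS-friendly C++ style.
#include <cstdint>
#include <iostream>
#include <vector>

using fixed_t = int32_t;           // Q16.16 fixed-point value (assumed format)
constexpr int FRAC_BITS = 16;

// W is stored row-major with shape (out_dim x in_dim).
void fc_relu(const std::vector<fixed_t>& W, const std::vector<fixed_t>& b,
             const std::vector<fixed_t>& x, std::vector<fixed_t>& y,
             int in_dim, int out_dim) {
    for (int o = 0; o < out_dim; ++o) {                 // would be pipelined/unrolled by HLS
        int64_t acc = static_cast<int64_t>(b[o]) << FRAC_BITS;  // align bias to the product scale
        for (int i = 0; i < in_dim; ++i) {
            acc += static_cast<int64_t>(W[o * in_dim + i]) * x[i];  // MAC in 64-bit accumulator
        }
        fixed_t v = static_cast<fixed_t>(acc >> FRAC_BITS);        // back to Q16.16
        y[o] = v > 0 ? v : 0;                                      // ReLU activation
    }
}

int main() {
    const int in_dim = 4, out_dim = 2;
    std::vector<fixed_t> W(out_dim * in_dim, 1 << FRAC_BITS);  // all weights = 1.0
    std::vector<fixed_t> b(out_dim, 0);                        // zero bias
    std::vector<fixed_t> x(in_dim, 1 << 14);                   // all inputs = 0.25
    std::vector<fixed_t> y(out_dim);
    fc_relu(W, b, x, y, in_dim, out_dim);
    for (fixed_t v : y)                                        // prints 1.0 for each output
        std::cout << static_cast<double>(v) / (1 << FRAC_BITS) << '\n';
    return 0;
}

In a generated accelerator, a layer like this would typically be tiled and streamed so that the same multiply-accumulate array is reused across layers, which is where the performance-per-watt trade-offs discussed in the abstract come from.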
Pages: 89162-89180
Page count: 19