A survey of FPGA-based accelerators for convolutional neural networks

Cited by: 2
Author
Sparsh Mittal
Affiliation
[1] Indian Institute of Technology, Department of Computer Science and Engineering
Source
Neural Computing and Applications | 2020 / Vol. 32
Keywords
Deep learning; Neural network (NN); Convolutional NN (CNN); Binarized NN; Hardware architecture for machine learning; FPGA; Reconfigurable computing; Parallelization; Low power
DOI
Not available
Abstract
Deep convolutional neural networks (CNNs) have recently shown very high accuracy in a wide range of cognitive tasks and have therefore attracted significant interest from researchers. Given the high computational demands of CNNs, custom hardware accelerators are vital for boosting their performance. The high energy efficiency, computing capabilities and reconfigurability of FPGAs make them a promising platform for hardware acceleration of CNNs. In this paper, we present a survey of techniques for implementing and optimizing CNN algorithms on FPGAs. We organize the works into several categories to bring out their similarities and differences. This paper is expected to be useful for researchers in the areas of artificial intelligence, hardware architecture and system design.
Pages: 1109-1139
Page count: 30