56 entries in total
[1]
Mittal S (2015) A survey of methods for analyzing and improving GPU energy efficiency ACM Comput Surv 47 19-1221
[2]
Vetter J (2017) PLACID: a platform for FPGA-based accelerator creation for DCNNs ACM Trans Multimed Comput Commun Appl (TOMM) 13 62-116
[3]
Motamedi M (2017) A resource-limited hardware accelerator for convolutional neural networks in embedded vision applications IEEE Trans Circuits Syst II Express Briefs 64 1217-69:35
[4]
Gysel P (2017) Tactics to directly map CNN graphs on embedded FPGAs IEEE Embed Syst Lett 9 113-547
[5]
Ghiasi S (2014) A survey of techniques for managing and leveraging caches in GPUs J Circuits Syst Comput (JCSC) 23 1430002-1086
[6]
Moini S (2015) A survey of CPU-GPU heterogeneous computing techniques ACM Comput Surv 47 69:1-62:33
[7]
Alizadeh B (2018) A GPU-outperforming FPGA accelerator architecture for binary convolutional neural networks ACM J Emerg Technol Comput (JETC) 14 18-1536
[8]
Emad M (2017) Throughput-optimized FPGA accelerator for deep convolutional neural networks ACM Trans Reconfig Technol Syst (TRETS) 10 17-29
[9]
Ebrahimpour R (2017) Maximizing CNN Accelerator Efficiency Through Resource Partitioning ACM SIGARCH Computer Architecture News 45 535
[10]
Abdelouahab K (2017) FPGA-accelerated deep convolutional neural networks for high throughput and energy efficiency Concurr Comput Pract Exp 29 e3850