Comparing Energy Efficiency of CPU, GPU and FPGA Implementations for Vision Kernels

Cited by: 121
Authors
Qasaimeh, Murad [1]
Denolf, Kristof [2]
Lo, Jack [2]
Vissers, Kees [2]
Zambreno, Joseph [1]
Jones, Phillip H. [1]
Affiliations
[1] Iowa State Univ, Ames, IA 50011 USA
[2] Xilinx Res Labs, San Jose, CA USA
Source
2019 IEEE INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (ICESS) | 2019
Keywords
Embedded Vision; GPUs; FPGAs; CPUs; Energy Efficiency
DOI
10.1109/icess.2019.8782524
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Developing high-performance embedded vision applications requires balancing run-time performance with energy constraints. Given the mix of hardware accelerators that exist for embedded computer vision (e.g., multi-core CPUs, GPUs, and FPGAs) and their associated vendor-optimized vision libraries, it is a challenge for developers to navigate this fragmented solution space. To help developers determine which embedded platform is most suitable for their application, we conduct a comprehensive benchmark of the run-time performance and energy efficiency of a wide range of vision kernels. We discuss why a given underlying hardware architecture innately performs well or poorly, based on the characteristics of a range of vision kernel categories. Specifically, our study covers three commonly used HW accelerators for embedded vision applications: ARM57 CPU, Jetson TX2 GPU, and ZCU102 FPGA, using their vendor-optimized vision libraries: OpenCV, VisionWorks, and xfOpenCV. Our results show that the GPU achieves an energy/frame reduction ratio of 1.1-3.2x compared to the others for simple kernels, while for more complicated kernels and complete vision pipelines the FPGA outperforms the others with energy/frame reduction ratios of 1.2-22.3x. We also observe that the FPGA performs increasingly better as a vision application's pipeline complexity grows.
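The abstract reports results as energy/frame reduction ratios but this record does not say how they are computed. The following is a minimal sketch of one plausible reading, assuming that energy per frame is average power divided by throughput and that the reduction ratio is the baseline's energy per frame divided by the candidate's. The function names and all numbers are hypothetical and are not taken from the paper.

```python
def energy_per_frame(avg_power_watts: float, frames_per_second: float) -> float:
    """Energy per frame in joules, assuming E = average power / throughput."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return avg_power_watts / frames_per_second


def reduction_ratio(baseline_joules: float, candidate_joules: float) -> float:
    """How many times less energy per frame the candidate uses than the baseline."""
    if candidate_joules <= 0:
        raise ValueError("candidate_joules must be positive")
    return baseline_joules / candidate_joules


# Hypothetical (power in W, throughput in frames/s) pairs, for illustration only.
platforms = {
    "cpu": (4.0, 20.0),
    "gpu": (8.0, 100.0),
    "fpga": (5.0, 125.0),
}

# Energy per frame for each platform, then the most efficient one.
energy = {name: energy_per_frame(p, fps) for name, (p, fps) in platforms.items()}
best = min(energy, key=energy.get)

for name, joules in energy.items():
    ratio = reduction_ratio(joules, energy[best])
    print(f"{name}: {joules:.3f} J/frame, {ratio:.1f}x the energy of {best}")
```

Under this reading, a platform that draws more power can still use less energy per frame if its throughput is proportionally higher, which is why the comparison is made per frame rather than on power alone.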
Pages: 8