CNN Workloads Characterization and Integrated CPU-GPU DVFS Governors on Embedded Systems

Cited by: 1
Authors
Karzhaubayeva, Meruyert [1]
Amangeldi, Aidar [1]
Park, Jurn-Gyu [1]
Affiliations
[1] Nazarbayev Univ, Sch Engn & Digital Sci, Astana 010000, Kazakhstan
Keywords
Convolutional neural networks (CNNs); dynamic power management (DPM); embedded systems; management
DOI
10.1109/LES.2023.3299335
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Dynamic power management (DPM) techniques on mobile systems are indispensable for deep learning (DL) inference optimization, which is mainly performed on battery-based mobile and/or embedded platforms with constrained resources. To this end, we characterize CNN workloads using object detection applications of YOLOv4/-tiny and YOLOv3/-tiny, and then propose integrated CPU-GPU DVFS governor policies that scale integrated pairs of CPU and GPU frequencies to improve energy-delay product (EDP) with negligible inference execution time degradation. Our results show up to 16.7% EDP improvements with negligible (mostly less than 2%) performance degradation using object detection applications on NVIDIA Jetson TX2.
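The abstract's selection criterion can be illustrated with a minimal sketch. It assumes EDP is the product of energy per inference and inference time, and that a candidate CPU-GPU frequency pair is acceptable only if its slowdown relative to a baseline stays within a bound (2% here, echoing the "mostly less than 2%" figure). The names `Sample`, `pick_pair`, and `edp_improvement` are hypothetical, and this offline search is not the governor policy proposed in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One measured operating point (hypothetical structure, not from the paper)."""
    cpu_mhz: int      # CPU frequency of the pair
    gpu_mhz: int      # GPU frequency of the pair
    energy_j: float   # energy per inference, in joules
    delay_s: float    # inference execution time, in seconds


def edp(s: Sample) -> float:
    """Energy-delay product: lower is better."""
    return s.energy_j * s.delay_s


def pick_pair(samples, baseline: Sample, max_slowdown: float = 0.02) -> Sample:
    """Return the lowest-EDP sample whose delay stays within the slowdown bound.

    The baseline always satisfies its own bound, so it serves as the fallback.
    """
    limit = baseline.delay_s * (1.0 + max_slowdown)
    eligible = [s for s in samples if s.delay_s <= limit]
    return min(eligible + [baseline], key=edp)


def edp_improvement(chosen: Sample, baseline: Sample) -> float:
    """Fractional EDP reduction relative to the baseline (0.167 means 16.7%)."""
    return 1.0 - edp(chosen) / edp(baseline)
```

In this sketch the measurements would come from profiling each frequency pair on the target board; a runtime governor would additionally have to predict them online.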
Pages: 202-205
Number of pages: 4