Multi-Objective Concurrent Kernel Scheduling for Multi-GPU Systems

Cited by: 0
Authors
Alizadeh, Negar Baradar [1]
Momtazpour, Mahmoud [1]
Affiliations
[1] Amirkabir Univ Technol, Dept Comp Engn, Tehran, Iran
Source
2024 32ND INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING, ICEE 2024 | 2024
Keywords
GPU; Concurrent Kernel Execution; Scheduling; Energy Consumption; Quality of Service; Genetic Algorithm; PERFORMANCE
DOI
10.1109/ICEE63041.2024.10667973
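Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract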
GPUs are now widely used in many applications to speed up computation. Owing to their unique architecture, GPUs are well suited to running embarrassingly parallel applications such as deep learning and big data analysis. However, GPUs are power-hungry devices, and many applications cannot fully utilize the resources of a GPU, which wastes energy in multi-GPU systems. To tackle this challenge, recent advances have made it possible to run multiple kernels concurrently on a single GPU so that they share its resources. This, however, comes with a performance penalty, because concurrent kernels interfere with one another while competing for shared resources. In the presence of such interference, mapping and scheduling multiple kernels on multiple GPUs so as to fully utilize the GPUs, minimize energy consumption, and reduce interference and performance degradation is challenging. In this paper, we propose a multi-objective kernel scheduling approach for multi-GPU systems that considers both energy consumption and performance and supports concurrent kernel execution. Because the scheduling problem has a large solution space, we solve it with a genetic algorithm (GA)-based scheduler called CK-GA, which can reach desirable solutions in reasonable time. We also propose a heuristic scheduling algorithm called CK-HEFT, based on the well-known HEFT algorithm, for concurrent kernel execution on shared GPUs. Experimental results on several multi-GPU architectures show that the proposed CK-GA scheduler achieves, on average, more than 14% improvement in performance, 30% improvement in quality of service, and more than 18% improvement in energy consumption compared to similar approaches.
Pages: 859-864
Page count: 6