Machine Learning-based Energy-efficient Workload Management for Data Centers

Cited by: 0
Authors
Smith, Matthew [1]
Zhao, Luke [2]
Cordova, Jonathan [1]
Jiang, Xunfei [1]
Ebrahimi, Mahdi [1]
Affiliations
[1] Calif State Univ Northridge, Dept Comp Sci, Northridge, CA 91330 USA
[2] Brown Univ, Dept Comp Sci, Providence, RI USA
Source
2024 IEEE 21ST CONSUMER COMMUNICATIONS & NETWORKING CONFERENCE, CCNC | 2024
Keywords
machine learning; energy-efficient; workload management; data centers
DOI
10.1109/CCNC51664.2024.10454842
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Cooling costs account for a significant part of the total energy consumption in data centers, and previous research has mainly focused on thermal-aware workload distribution strategies for CPU-intensive workloads. This paper introduces a novel machine learning-based approach that aims to reduce energy consumption through thermal-aware workload distribution, in order to build energy-efficient data centers for GPU-intensive workloads. To achieve this goal, the study employs the GPUCloudSim Plus simulator, which models the distribution of GPU-intensive applications under diverse workloads and utilizations. The integration of machine learning models allows for temperature prediction and evaluation of the proposed algorithm's performance. We evaluated our ThermalAwareGpu workload scheduling algorithm, and it saved up to 12.82% of computing cost compared to the baseline algorithms. Our future work will explore the estimation of data center cooling energy and conduct in-depth comparisons of different workload balancing algorithms in more intensive experiments.
Pages: 799-806
Number of pages: 8