Machine Learning Regression-Based Prediction for Improving Performance and Energy Consumption in HPC Platforms

Cited by: 1
Authors
Coelho, Micaella [1]
Ocana, Kary [1]
Pereira, Andre [2,3]
Porto, Alexandre [1]
Cardoso, Douglas O. [4]
Lorenzon, Arthur [5]
Oliveira, Rui [2,3]
Navaux, Philippe O. A. [5]
Osthoff, Carla [1]
Affiliations
[1] Natl Lab Sci Comp, LNCC, Rio De Janeiro, Brazil
[2] Univ Minho, Campus Gualtar, Braga, Portugal
[3] HASLab INESC TEC, Campus Gualtar, Braga, Portugal
[4] Univ Porto, Fac Arts & Human, Ctr Linguist, Porto, Portugal
[5] Univ Fed Rio Grande do Sul, Inst Informat, Porto Alegre, RS, Brazil
Source
HIGH PERFORMANCE COMPUTING, CARLA 2024 | 2025 / Vol. 2270
Keywords
Machine learning; High-performance computing; Scientific applications; Bioinformatics; Resource management
DOI
10.1007/978-3-031-80084-9_13
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
High-performance computing is pivotal for processing large datasets and executing complex simulations, ensuring faster and more accurate results. Improving the performance of software and scientific workflows in such environments requires careful analysis of their computational behavior and energy consumption. Maximizing computational throughput through adequate software configuration and resource allocation is therefore essential. The work presented in this paper uses regression-based machine learning and decision trees to analyze and optimize resource allocation in high-performance computing environments, based on the application's performance and energy metrics. Applied to a bioinformatics case study, these models support informed decision-making by selecting the computing resources that improve the performance of a phylogenomics application. Our contribution is to better explore and understand efficient resource management on supercomputers, specifically Santos Dumont. We show that the proposed method predicts the application's execution time accurately across varying numbers of computing nodes, while its energy consumption predictions are less precise. The application parameters most relevant to this work are identified, and the relative importance of each parameter to prediction accuracy is analyzed.
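The general idea in the abstract, a tree-based regressor that predicts execution time from job parameters and reports how much each parameter contributes, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the two features (`nodes`, `input_gb`), the synthetic timings, and the tiny hand-written regression tree are hypothetical and are not the authors' data, feature set, or model.

```python
from statistics import fmean

def sse(ys):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    m = fmean(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(X, y):
    """Return (gain, feature, threshold) for the largest SSE reduction, or None."""
    base, best = sse(y), None
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring feature values
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            gain = base - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def fit(X, y, depth, importance):
    """Grow a regression tree; accumulate per-feature SSE reduction in importance."""
    split = best_split(X, y) if depth > 0 and len(y) > 1 else None
    if split is None or split[0] <= 1e-12:
        return fmean(y)  # leaf: predict the mean target of the samples here
    gain, f, t = split
    importance[f] += gain
    left = [(r, v) for r, v in zip(X, y) if r[f] <= t]
    right = [(r, v) for r, v in zip(X, y) if r[f] > t]
    return (f, t,
            fit([r for r, _ in left], [v for _, v in left], depth - 1, importance),
            fit([r for r, _ in right], [v for _, v in right], depth - 1, importance))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Synthetic, made-up runs: time shrinks with node count and grows with input size.
feature_names = ["nodes", "input_gb"]
X = [(n, s) for n in (1, 2, 4, 8) for s in (1, 2)]
y = [100.0 * s / n for n, s in X]

importance = [0.0, 0.0]
tree = fit(X, y, depth=3, importance=importance)
total = sum(importance)
shares = {name: g / total for name, g in zip(feature_names, importance)}

print("predicted time for 1 node, 2 GB:", predict(tree, (1, 2)))
print("relative importance:", shares)
```

The importance measure here is the total squared-error reduction attributed to each feature, normalised to sum to one. This is one common convention for tree models and is used only to show the kind of analysis the abstract describes.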
Pages: 186-200
Page count: 15