An In-Depth I/O Pattern Analysis in HPC Systems

Cited by: 4
Authors
Bang, Jiwoo [1 ]
Kim, Chungyong [1 ]
Wu, Kesheng [2 ]
Sim, Alex [2 ]
Byna, Suren [2 ]
Sung, Hanul [3 ]
Eom, Hyeonsang [1 ]
Affiliations
[1] Seoul Natl Univ, Dept Comp Sci & Engn, Seoul, South Korea
[2] Lawrence Berkeley Natl Lab, Computat Res Div, Berkeley, CA USA
[3] Sangmyung Univ, Dept Game Design & Dev, Seoul, South Korea
Source
2021 IEEE 28TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS (HIPC 2021) | 2021
Funding
National Research Foundation of Singapore;
Keywords
I/O characteristic; Unsupervised learning; Feature selection; Clustering; Prediction model; High performance computing;
DOI
10.1109/HiPC53243.2021.00056
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
High-performance computing (HPC) systems consist of thousands of compute nodes, storage systems, and high-speed networks, which together form a complex, multi-layered I/O stack. Adjusting the diverse configuration settings that HPC systems provide can improve the I/O performance of applications. However, it is challenging to identify the optimal configuration settings without thorough knowledge of the system, because each of an application's I/O characteristics can be an important factor in parameter decisions. In this paper, we use multiple machine learning approaches to perform an in-depth analysis of the I/O behaviors of HPC applications and to search for the optimal configuration settings for jobs that share similar I/O characteristics. Overall, our results show that, with an improvement of up to 0.07 in R-squared score, the proposed machine learning-based prediction models can predict with high accuracy the I/O performance that jobs run on HPC systems obtain under different configuration parameters.
Pages: 400-405
Page count: 6