New Performance Modeling Methods for Parallel Data Processing Applications

Cited by: 11
Authors
Bhimani, Janki [1 ]
Mi, Ningfang [1 ]
Leeser, Miriam [1 ]
Yang, Zhengyu [1 ]
Affiliation
[1] Northeastern Univ, 360 Huntington Ave, Boston, MA 02115 USA
Source
ACM TRANSACTIONS ON MODELING AND COMPUTER SIMULATION | 2019, Vol. 29, No. 3
Funding
U.S. National Science Foundation
Keywords
Performance modeling; queuing theory; Markov model; distributed systems; execution time; parallel calculation; communication network; prediction;
DOI
10.1145/3309684
Chinese Library Classification (CLC)
TP39 [Computer Applications]
Discipline Code
081203; 0835
Abstract
Predicting the performance of an application running on parallel computing platforms is increasingly important because of its influence on development time and resource management. However, predicting performance as a function of the number of parallel processes is complex for iterative and multi-stage applications. This research proposes FiM, a performance approximation approach that predicts the calculation time (FiM-Cal) and the communication time (FiM-Com) of an application running on a distributed framework. FiM-Cal consists of two coupled components: (1) a stochastic Markov model that captures non-deterministic runtime, which often depends on parallel resources such as the number of processes, and (2) a machine-learning model that extrapolates the parameters for calibrating our Markov model when application parameters, such as the dataset, change. Along with parallel calculation time, parallel computing platforms also spend time transferring data among different nodes. FiM-Com consists of a simulation queuing model to quickly estimate this communication time. Our new modeling approach considers different design choices along multiple dimensions, namely (i) process-level parallelism, (ii) the distribution of cores on a multi-processor platform, (iii) application-related parameters, and (iv) characteristics of datasets. The major contribution of our prediction approach is that FiM can accurately predict parallel processing time for datasets that are much larger than the training datasets. We evaluate our approach with the NAS Parallel Benchmarks and real iterative data processing applications. We compare the predicted results (e.g., end-to-end execution time) with actual experimental measurements on a real distributed platform. We also compare our work with an existing prediction technique based on machine learning. We rank the number of processes according to the actual and predicted results from FiM and calculate the correlation between the actual and predicted rankings. Our results show that FiM obtains a high correlation in the range of 0.80 to 0.99, which indicates the considerable accuracy of our technique. Such prediction gives data analysts useful insight into the optimal configuration of parallel resources (e.g., the number of processes and cores) and also helps system designers investigate the impact of changes in application parameters on system performance.
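The accuracy metric quoted above is a rank correlation between the actual and the FiM-predicted ordering of candidate process counts. The Python sketch below illustrates that evaluation step only; it is not the authors' implementation, and the process counts, runtimes, and use of SciPy's Spearman correlation are illustrative assumptions.

# Minimal sketch (not from the paper): rank candidate process counts by
# measured and by predicted end-to-end runtime, then report the Spearman
# rank correlation, the statistic the abstract quotes as 0.80-0.99.
# All numbers below are hypothetical placeholders.
from scipy.stats import spearmanr

process_counts = [2, 4, 8, 16, 32]                 # candidate degrees of parallelism
actual_time    = [118.0, 64.5, 38.2, 27.9, 25.1]   # measured runtimes in seconds (hypothetical)
predicted_time = [121.3, 61.0, 40.5, 26.4, 26.8]   # FiM-style predictions in seconds (hypothetical)

rho, _ = spearmanr(actual_time, predicted_time)    # correlation of the induced rankings
print(f"rank correlation of actual vs. predicted orderings: {rho:.2f}")

# The process count with the smallest predicted runtime can then guide
# the choice of parallel resources.
best = min(zip(predicted_time, process_counts))[1]
print(f"predicted best number of processes: {best}")

Spearman's rho is used here because the evaluation compares orderings of configurations rather than absolute runtime errors; the paper may compute the ranking correlation with a different estimator.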
Pages: 24
Related Papers
50 in total
  • [1] FiM: Performance Prediction for Parallel Computation in Iterative Data Processing Applications
    Bhimani, Janki
    Mi, Ningfang
    Leeser, Miriam
    Yang, Zhengyu
    2017 IEEE 10TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD), 2017, : 359 - 366
  • [2] Modeling the performance of parallel applications using model selection techniques
    Martinez, D. R.
    Blanco, V.
    Cabaleiro, J. C.
    Pena, T. F.
    Rivera, F. F.
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2014, 26 (02) : 586 - 599
  • [3] Performance modeling of parallel applications for grid scheduling
    Sanjay, H. A.
    Vadhiyar, Sathish
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2008, 68 (08) : 1135 - 1145
  • [4] Ensemble: A Tool for Performance Modeling of Applications in Cloud Data Centers
    Chen, Jin
    Soundararajan, Gokul
    Ghanbari, Saeed
    Iorio, Francesco
    Hashemi, Ali B.
    Amza, Cristiana
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2016, 4 (01) : 20 - 33
  • [5] Parallel performance modeling of irregular applications in cell-centered finite volume methods over unstructured tetrahedral meshes
    Langguth, J.
    Wu, N.
    Chai, J.
    Cai, X.
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2015, 76 : 120 - 131
  • [6] Benchmarking Machine Learning Methods for Performance Modeling of Scientific Applications
    Malakar, Preeti
    Balaprakash, Prasanna
    Vishwanath, Venkatram
    Morozov, Vitali
    Kumaran, Kalyan
    PROCEEDINGS OF 2018 IEEE/ACM PERFORMANCE MODELING, BENCHMARKING AND SIMULATION OF HIGH PERFORMANCE COMPUTER SYSTEMS (PMBS 2018), 2018, : 33 - 44
  • [7] Constructing Skeleton for Parallel Applications with Machine Learning Methods
    Zhang, Zihang
    Sun, Jingwei
    Zhang, Jiepeng
    Qin, Yuze
    Sun, Guangzhong
    PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS (ICPP 2019), 2019,
  • [8] Performance modeling of big data applications in the cloud centers
    Shen, Chao
    Tong, Weiqin
    Hwang, Jenq-Neng
    Gao, Qiang
    JOURNAL OF SUPERCOMPUTING, 2017, 73 (05) : 2258 - 2283
  • [9] coreSNP: Parallel Processing of Microarray Data
    Guzzi, Pietro Hiram
    Agapito, Giuseppe
    Cannataro, Mario
    IEEE TRANSACTIONS ON COMPUTERS, 2014, 63 (12) : 2961 - 2974