Optimizing process allocation of parallel programs for heterogeneous clusters

Cited by: 3
Authors
Ichikawa, Shuichi [1, 2]
Takahashi, Sho [1 ]
Kawai, Yuu [1 ]
Affiliations
[1] Toyohashi Univ Technol, Dept Knowledge Based Informat Engn, Aichi 4418580, Japan
[2] Toyohashi Univ Technol, Intelligent Sensing Syst Res Ctr, Aichi 4418580, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
heterogeneous cluster; high-performance computing; performance evaluation; multiprocessing; optimization; HIGH-PERFORMANCE; IMPLEMENTATION; MPI;
DOI
10.1002/cpe.1349
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
The performance of a conventional parallel application is often degraded by load-imbalance on heterogeneous clusters. Although it is simple to invoke multiple processes on fast processing elements to alleviate load-imbalance, the optimal process allocation is not obvious. Kishimoto and Ichikawa presented performance models for high-performance Linpack (HPL), with which the sub-optimal configurations of heterogeneous clusters were actually estimated. Their results on HPL are encouraging, whereas their approach is not yet verified with other applications. This study presents some enhancements of Kishimoto's scheme, which are evaluated with four typical scientific applications: computational fluid dynamics (CFD), finite-element method (FEM), HPL (linear algebraic system), and fast Fourier transform (FFT). According to our experiments, our new models (NP-T models) are superior to Kishimoto's models, particularly when the non-negative least squares method is used for parameter extraction. The average errors of the derived models were 0.2% for the CFD benchmark, 2% for the FEM benchmark, 1% for HPL, and 28% for the FFT benchmark. This study also emphasizes the importance of predictability in clusters, listing practical examples derived from our study. Copyright (C) 2008 John Wiley & Sons, Ltd.
Pages: 475-507
Number of pages: 33