Malleable iterative MPI applications

Cited by: 17
Authors
El Maghraoui, K. [1]
Desell, Travis J. [2]
Szymanski, Boleslaw K. [2]
Varela, Carlos A. [2]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
[2] Rensselaer Polytech Inst, Dept Comp Sci, Troy, NY 12180 USA
Keywords
dynamic reconfiguration; malleability; MPI
DOI
10.1002/cpe.1362
Chinese Library Classification
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Malleability enables a parallel application's execution system to split or merge processes, modifying granularity. While process migration is widely used to adapt applications to dynamic execution environments, it is limited by the granularity of the application's processes. Malleability empowers process migration by allowing the application's processes to expand or shrink following the availability of resources. We have implemented malleability as an extension to the process checkpointing and migration (PCM) library, a user-level library for iterative message passing interface (MPI) applications. PCM is integrated with the Internet Operating System, a framework for middleware-driven dynamic application reconfiguration. Our approach requires minimal code modifications and enables transparent middleware-triggered reconfiguration. Experimental results using a two-dimensional data parallel program that has a regular communication structure demonstrate the usefulness of malleability. Copyright (C) 2008 John Wiley & Sons, Ltd.
Pages: 393-413
Page count: 21