The Hector Distributed Run-Time Environment

Cited by: 12
Authors
Russ, SH
Robinson, J
Flachs, BK
Heckel, B
Affiliations
[1] Mississippi State Univ, Engn Res Ctr, Mississippi State, MS 39762 USA
[2] Adv Microelect, Ridgeland, MS 39157 USA
[3] IBM Corp, Austin Res Lab, Austin, TX 78758 USA
[4] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
Keywords
parallel computing; load balancing; fault tolerance; resource allocation; task migration
DOI
10.1109/71.735957
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Harnessing the computational capabilities of a network of workstations promises to off-load work from overloaded supercomputers onto largely idle resources overnight. Several capabilities are needed to do this, including support for an architecture-independent parallel programming environment, task migration, automatic resource allocation, and fault tolerance. The Hector distributed run-time environment is designed to present these capabilities transparently to programmers. MPI programs can be run under this environment on homogeneous clusters with no modifications to their source code. The design of Hector, its internal structure, and several benchmarks and tests are presented.
Pages: 1102-1114
Page count: 13