Efficient collective communication in distributed heterogeneous systems

被引:37
作者
Bhat, PB [1 ]
Raghavendra, CS [1 ]
Prasanna, VK [1 ]
机构
[1] Univ So Calif, Dept Elect Engn Syst, Los Angeles, CA 90089 USA
来源
19TH IEEE INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS, PROCEEDINGS | 1999年
关键词
D O I
10.1109/ICDCS.1999.776502
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
The Information Power Grid (IPG) is emerging as an infrastructure that will enable distributed applications - such as video conferencing and distributed interactive simulation - to seamlessly integrate collections of heterogeneous workstations, multiprocessors, and mobile nodes, over heterogeneous wide-area networks. This paper introduces a framework for developing efficient collective communication schedules in such systems. Our framework consists of analytical models of the heterogeneous system, scheduling algorithms for the collective communication pattern, and performance evaluation mechanisms. We show that previous models, which considered node heterogeneity, bur ignored network heterogeneity, can lead to solutions which are worse than the optimal by an unbounded factor. We then introduce an enhanced communication model, and develop three heuristic algorithms for the broadcast and multicast patterns. The completion time of the schedule is chosen as the performance metric. The heuristic algorithms are FEF (Fastest Edge First), ECEF (Earliest Completing Edge First), and ECEF with look-ahead. For small system sizes, we find the optimal solution using exhaustive search. Our simulation experiments indicate that the performance of our heuristic algorithms is close to optimal. For performance evaluation of larger systems, Mle have also developed a simple lower bound on the completion time. Our heuristic algorithms achieve significant performance improvements over previous approaches.
引用
收藏
页码:15 / 24
页数:10
相关论文
共 19 条
[1]  
[Anonymous], 1998, GRID BLUEPRINT NEW C
[2]   CCL - A PORTABLE AND TUNABLE COLLECTIVE COMMUNICATION LIBRARY FOR SCALABLE PARALLEL COMPUTERS [J].
BALA, V ;
BRUCK, J ;
CYPHER, R ;
ELUSTONDO, P ;
HO, A ;
HO, CT ;
KIPNIS, S ;
SNIR, M .
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 1995, 6 (02) :154-164
[3]  
BALARDIE T, 1993, P ACM SIGCOMM, P85
[4]   Efficient collective communication on Heterogeneous Networks of Workstations [J].
Banikazemi, M ;
Moorthy, V ;
Panda, DK .
1998 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING - PROCEEDINGS, 1998, :460-467
[5]   Efficient message passing interface (MPI) for parallel computing on clusters of workstations [J].
Bruck, J ;
Dolev, D ;
Ho, CT ;
Rosu, MC ;
Strong, R .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1997, 40 (01) :19-34
[6]  
DEERING S, 1994, P ACM SIGCOMM
[7]   Globus: A metacomputing infrastructure toolkit [J].
Foster, I ;
Kesselman, C .
INTERNATIONAL JOURNAL OF SUPERCOMPUTER APPLICATIONS AND HIGH PERFORMANCE COMPUTING, 1997, 11 (02) :115-128
[8]   EFFICIENT ALGORITHMS FOR FINDING MINIMUM SPANNING-TREES IN UNDIRECTED AND DIRECTED-GRAPHS [J].
GABOW, HN ;
GALIL, Z ;
SPENCER, T ;
TARJAN, RE .
COMBINATORICA, 1986, 6 (02) :109-122
[9]   The core legion object model [J].
Lewis, M ;
Grimshaw, A .
PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE DISTRIBUTED COMPUTING, 1996, :551-561
[10]  
LIN X, 1991, P INT C PAR PROC AUG, V1, P435