COLLECTIVE COMMUNICATION IN WORMHOLE-ROUTED MASSIVELY-PARALLEL COMPUTERS

Cited by: 73
Authors
MCKINLEY, PK [1 ]
TSAI, YJ [1 ]
ROBINSON, DF [1 ]
Affiliation
[1] MICHIGAN STATE UNIV,DEPT COMP SCI,E LANSING,MI 48824
Funding
National Science Foundation (USA);
Keywords
DOI
10.1109/2.476198
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
The supercomputer market is now dominated by parallel architectures, among which massively parallel computers (MPCs) are an important class of systems. The memory of an MPC is physically distributed among an ensemble of computing nodes that communicate by sending data through a network. Communication operations can be either point-to-point, with one source and one destination, or collective, with more than two participating processes. The design of collective communication operations depends on the MPC's underlying network architecture. While there has been little consensus on some aspects of communication architectures, such as network topology, a good deal of agreement exists regarding the most efficient way to switch messages through the network. Most MPCs use wormhole routing, in which each message is divided into small pieces that are pipelined through the network. Compared with the store-and-forward switching method used in early multicomputers, wormhole routing reduces the effect of path length on communication time. However, in situations where multiple messages exist in the network concurrently, wormhole routing can exacerbate channel contention, which occurs when blocked messages hold some communication channels while waiting for others. Invoking a collective operation, which can involve many messages, creates exactly this situation. In recent years, many projects have addressed the design of efficient collective communication algorithms for wormhole-routed systems. By exploiting the relative distance-insensitivity of wormhole routing, these new algorithms often differ fundamentally from their store-and-forward counterparts. This article examines software and hardware approaches to implementing collective communication operations, illustrating several issues arising in this research area and describing the major classes of algorithms proposed to solve these problems.
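The abstract's claim that wormhole routing reduces the effect of path length can be illustrated with a simple contention-free latency model. The sketch below is not taken from the article: the function names, the parameters, and the formulas are a common textbook approximation, assumed here for illustration only. It ignores startup overhead, router delay, and channel contention.

```python
# Illustrative, contention-free latency model (an assumption, not the
# article's own formulation). Times are in arbitrary units; bandwidth is
# bytes per time unit on every channel.

def store_and_forward_latency(message_bytes, hops, bandwidth):
    """Each intermediate node receives the whole message before forwarding it."""
    return hops * (message_bytes / bandwidth)


def wormhole_latency(message_bytes, hops, bandwidth, flit_bytes):
    """The header flit advances one hop per flit time; the rest of the
    message is pipelined behind it, so path length adds only flit times."""
    return hops * (flit_bytes / bandwidth) + message_bytes / bandwidth


if __name__ == "__main__":
    # Hypothetical values: 1024-byte message, 2-byte flits, unit bandwidth.
    for hops in (1, 10):
        saf = store_and_forward_latency(1024, hops, 1.0)
        wh = wormhole_latency(1024, hops, 1.0, 2)
        print(f"hops={hops}: store-and-forward={saf}, wormhole={wh}")
```

Under these assumed values, going from 1 to 10 hops multiplies the store-and-forward time by ten, while the wormhole time grows by only a few flit times. This is the relative distance-insensitivity the abstract refers to.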
Pages: 39 / &