Fault-Tolerant Adaptive Routing in Dragonfly Networks

Cited by: 21
Authors
Xiang, Dong [1]
Li, Bing [2]
Fu, Yi [1]
Affiliations
[1] Tsinghua Univ, Sch Software, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Dragonfly networks; flow-control scheme; deadlock-free adaptive fault-tolerant routing; SCHEME; ALGORITHM; IMMUNET; MESHES; DCELL;
DOI
10.1109/TDSC.2017.2693372
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Dragonfly networks are widely used in current high-performance computers and high-end servers, which makes fault-tolerant routing in them essential. The rich interconnects of the topology give the network good fault-tolerance ability. A new deadlock-free adaptive fault-tolerant routing algorithm is proposed, based on a new two-layer safety information model obtained by mapping the routers in a group, and the groups of the dragonfly network, into two separate hypercubes. The algorithm tolerates both static and dynamic faults. Using the safety information model, our method can determine at the source whether a packet can reach its destination, which avoids dead ends and aimless misrouting. Extensive simulation results show that, in many cases, the proposed fault-tolerant routing algorithm outperforms the previous minimal routing algorithm even in fault-free networks.
Pages: 259-271
Page count: 13