Path Diversity and Survivability for the HyperX Datacenter Topology

Cited by: 2
Author
Rottenstreich, Ori [1, 2]
Affiliations
[1] Technion, Taub Dept Comp Sci, IL-3200003 Haifa, Israel
[2] Technion, Viterbi Dept Elect & Comp Engn, IL-3200003 Haifa, Israel
Source
IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT | 2023 / Volume 20 / Issue 03
Keywords
Topology; Network topology; Fats; Hypercubes; Optical switches; Costs; Routing; Datacenters; network topology; HyperX; reliability
DOI
10.1109/TNSM.2023.3285914
CLC number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Network survivability has been recognized as an issue of major importance in terms of security, stability and prosperity. This paper studies fundamental properties of HyperX, an emerging topology for supercomputing and datacenter networks. We focus on the establishment of paths with guaranteed survivability, so that a path remains available even after a restricted number of link failures. We first examine the availability of disjoint paths connecting a pair of input nodes. Disjoint paths guarantee path existence even upon a bounded number of link failures. We explore the inherent tradeoff between allowing slightly longer paths and the ability to extend available sets of mutually disjoint paths. Second, we study the availability of paths in a HyperX topology that has already experienced link failures. Such failures can increase the length of available paths or even eliminate connectivity between parts of the network. We also compare basic properties of shortest paths in HyperX with those of other datacenter topologies. Last, we provide an evaluation to illustrate path availability along with the potential impact of failures.
Pages: 2370-2385
Page count: 16