Reliability and Survivability Analysis of Data Center Network Topologies

Cited by: 34
Authors
Couto, Rodrigo de Souza [1, 2]
Secci, Stefano [3]
Mitre Campista, Miguel Elias [1]
Maciel Kosmalski Costa, Luis Henrique [1]
Affiliations
[1] Univ Fed Rio de Janeiro, POLI DEL, COPPE PEE GTA, POB 68504, BR-21941972 Rio De Janeiro, RJ, Brazil
[2] Univ Estado Rio de Janeiro, FEN DETEL PEL, BR-20550013 Rio De Janeiro, RJ, Brazil
[3] Univ Paris 06, Sorbonne Univ, UMR 7606, LIP6, F-75005 Paris, France
Keywords
Data center networks; Cloud networks; Survivability; Reliability; Robustness; AVAILABILITY; FRAMEWORK; COST
DOI
10.1007/s10922-015-9354-8
Chinese Library Classification (CLC) number
TP [Automation and Computer Technology]
Discipline classification code
0812
Abstract
Several data center architectures have been proposed as alternatives to the conventional three-layer one. Most of them employ commodity equipment to reduce cost. Robustness to failures thus becomes even more important, because commodity equipment is more failure-prone. Each architecture has a different network topology design with a specific level of redundancy. In this work, we analyze the benefits of different data center topologies, taking reliability and survivability requirements into account. We consider the topologies of three alternative data center architectures: Fat-tree, BCube, and DCell. We also compare these topologies with a conventional three-layer data center topology. For the sake of generality, our analysis is independent of specific equipment, traffic patterns, and network protocols. We derive closed-form formulas for the Mean Time To Failure of each topology. The results allow us to indicate the best topology for each failure scenario. In particular, we conclude that BCube is more robust to link failures than the other topologies, whereas DCell has the most robust topology when considering switch failures. Additionally, we show that all considered alternative topologies outperform a three-layer topology for both types of failures. We also determine to what extent the robustness of BCube and DCell is influenced by the number of network interfaces per server.
Pages: 346-392
Number of pages: 47