Expeditus: Congestion-Aware Load Balancing in Clos Data Center Networks

Cited by: 18
Authors
Wang, Peng [1]
Xu, Hong [1]
Niu, Zhixiong [1]
Han, Dongsu [2]
Xiong, Yongqiang [3]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
[2] Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon 34141, South Korea
[3] Microsoft Res Asia, Wireless & Networking Grp, Beijing 100080, Peoples R China
Keywords
Data center networks; load balancing; network congestion; distributed
DOI
10.1109/TNET.2017.2731986
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Data center networks often use multi-rooted Clos topologies to provide a large number of equal-cost paths between two hosts. Thus, load balancing traffic among the paths is important for high performance and low latency. However, it is well known that ECMP, the de facto load balancing scheme, performs poorly in data center networks. The main culprit of ECMP's problems is its congestion-agnostic nature, which fundamentally limits its ability to deal with network dynamics. We propose Expeditus, a novel distributed congestion-aware load balancing protocol for general 3-tier Clos networks. The complex 3-tier Clos topologies present significant scalability challenges that make a simple per-path feedback approach infeasible. Expeditus addresses the challenges by using simple local information collection, where a switch only monitors its egress and ingress link loads. It further employs a novel two-stage path selection mechanism to aggregate relevant information across switches and make path selection decisions. Testbed evaluation on Emulab and large-scale ns-3 simulations demonstrate that Expeditus outperforms ECMP by up to 45% in tail flow completion times (FCT) for mice flows, and by up to 38% in mean FCT for elephant flows in 3-tier Clos networks.
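
The contrast the abstract draws between congestion-agnostic and congestion-aware path choice can be illustrated with a minimal Python sketch. It is not the Expeditus protocol: the function names, the CRC32 hash, the load values, and the "least-loaded bottleneck" rule are all assumptions made for illustration, and the sketch is single-stage, so it does not reproduce the two-stage mechanism described in the paper.

```python
import zlib


def ecmp_pick(flow, n_paths):
    """Congestion-agnostic choice: hash the flow tuple onto a path index.

    CRC32 stands in for whatever hash a real switch uses (assumption).
    """
    key = "|".join(map(str, flow)).encode()
    return zlib.crc32(key) % n_paths


def congestion_aware_pick(egress_load, ingress_load):
    """Pick the path whose most loaded link (egress or ingress) is least loaded.

    egress_load[i] and ingress_load[i] are hypothetical utilizations in [0, 1]
    of the first-hop and last-hop links of candidate path i.
    """
    bottleneck = [max(e, i) for e, i in zip(egress_load, ingress_load)]
    return min(range(len(bottleneck)), key=bottleneck.__getitem__)


# Hypothetical flow and link loads for four equal-cost paths.
flow = ("10.0.0.1", "10.0.1.2", 40000, 80, "tcp")
egress = [0.9, 0.2, 0.5, 0.4]
ingress = [0.1, 0.8, 0.3, 0.6]

hashed = ecmp_pick(flow, len(egress))           # ignores the loads entirely
chosen = congestion_aware_pick(egress, ingress)  # bottlenecks: 0.9, 0.8, 0.5, 0.6
```

In this toy setup the hash-based choice stays the same no matter how the loads change, while the load-based choice moves to whichever path currently has the lightest bottleneck.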
Pages: 3175-3188
Number of pages: 14