Luopan: Sampling-Based Load Balancing in Data Center Networks

Times cited: 44
Authors
Wang, Peng [1]
Trimponias, George [2]
Xu, Hong [1]
Geng, Yanhui [3]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China
[2] Huawei Noahs Ark Lab, Hong Kong, Peoples R China
[3] Huawei Montreal Res Ctr, Markham, ON L3R 5A4, Canada
Keywords
Data center networks; load balancing; network congestion; distributed
DOI
10.1109/TPDS.2018.2858815
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Data center networks demand high-performance, robust, and practical data plane load balancing protocols. Despite progress, existing work falls short of meeting these requirements. We design, analyze, and evaluate Luopan, a novel sampling-based load balancing protocol that overcomes these challenges. Luopan operates at flowcell granularity, similar to Presto. It periodically samples a few paths for each destination switch and directs flowcells to the least congested one. By being congestion-aware, Luopan improves flow completion time (FCT) and is more robust to topological asymmetries than Presto. The sampling approach simplifies the protocol and makes it much more scalable to implement in large-scale networks than existing congestion-aware schemes. We provide analysis to show that Luopan's periodic sampling has the same asymptotic behavior as instantaneous sampling: taking 2 random samples provides exponential improvements over 1 sample. We conduct comprehensive packet-level simulations with production workloads. The results show that Luopan consistently outperforms state-of-the-art schemes in large-scale topologies. Compared to Presto, Luopan with 2 samples improves the 99.9th-percentile FCT of mice flows by up to 35 percent, and the average FCT of medium and elephant flows by up to 30 percent. Luopan also performs significantly better than Local Sampling under large asymmetry.
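
The "2 samples versus 1 sample" claim in the abstract refers to the classic balanced-allocations (power-of-two-choices) effect. The following is a minimal sketch of that effect only, not of Luopan itself: it models instantaneous sampling, treats each flowcell as one unit of load, and ignores queue draining and periodic sampling. The function name `assign_flowcells` and its parameters are hypothetical and chosen for illustration.

```python
import random


def assign_flowcells(num_paths, num_flowcells, num_samples, seed=0):
    """Place flowcells on paths, each time picking the least loaded
    of `num_samples` uniformly sampled paths (illustrative model only)."""
    rng = random.Random(seed)
    load = [0] * num_paths
    for _ in range(num_flowcells):
        # Sample candidate paths uniformly at random (with replacement).
        candidates = [rng.randrange(num_paths) for _ in range(num_samples)]
        # Send the flowcell to the least loaded sampled path.
        best = min(candidates, key=lambda p: load[p])
        load[best] += 1
    return load


if __name__ == "__main__":
    for d in (1, 2):
        loads = assign_flowcells(10_000, 10_000, d)
        print(f"samples={d}: max path load = {max(loads)}")
```

With one sample the scheme reduces to purely random placement; with two samples the most loaded path is expected to carry noticeably less, which is the gap the abstract's asymptotic statement describes.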
Pages: 133-145
Number of pages: 13