Adaptive load balancing based on accurate congestion feedback for asymmetric topologies

Cited by: 9
Authors
Shi, Qingyu [1]
Wang, Fang [1,2]
Feng, Dan [1]
Xie, Weibin [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Minist Educ China, Sch Comp Sci & Technol, Wuhan Natl Lab Optoelect,Key Lab Informat Storage, Wuhan, Hubei, Peoples R China
[2] Shenzhen Huazhong Univ Sci & Technol, Res Inst, Wuhan, Hubei, Peoples R China
Keywords
Datacenter network; Load balancing; Congestion feedback; Low latency
DOI
10.1016/j.comnet.2019.04.006
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Datacenter load balancing schemes spread traffic across multiple parallel paths under uncertainties such as traffic dynamics and topology asymmetry. To ease deployment, several schemes that improve on ECMP (e.g., CLOVE, Hermes) balance load at end hosts. However, these solutions rely on inaccurate congestion feedback: they are either congestion-oblivious or detect congestion through Explicit Congestion Notification (ECN) and coarse-grained Round-Trip Time (RTT) measurements. Such feedback is insufficient to indicate the actual congestion status under asymmetry. Moreover, when rerouting occurs, outdated ACKs carrying congestion feedback from other paths can improperly influence the current sending rate. Our observations and analyses show that this inaccurate congestion feedback can degrade performance. We therefore explore how to address these problems while remaining compatible with existing switch hardware and network protocol stacks. We propose ALB, an adaptive load balancing mechanism that runs at end hosts, is based on accurate congestion feedback, and is resilient to asymmetry. ALB uses latency-based congestion detection to precisely reroute new flowlets to more lightly loaded paths, and an ACK correction method to avoid inaccurate flow rate adjustment. In large-scale simulations, ALB achieves up to 13% and 48% better average flow completion time (FCT) than CONGA and CLOVE-ECN, respectively, under asymmetry. Compared with other schemes, ALB improves the average and 99th percentile FCTs for small flows under highly bursty traffic by 43-174% and 75-129%, respectively. Under dynamic network changes, ALB also provides competitive overall performance and maintains stable performance for small flows. (C) 2019 Elsevier B.V. All rights reserved.
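The abstract's idea of rerouting new flowlets toward less congested paths based on latency can be illustrated with a minimal sketch. This is not ALB's actual algorithm, which the abstract does not specify. The class name `LatencyAwareSelector`, the flowlet gap `FLOWLET_GAP`, the smoothing weight `ALPHA`, and the use of an exponentially weighted moving average are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field

FLOWLET_GAP = 0.0005  # assumed idle gap in seconds that starts a new flowlet
ALPHA = 0.25          # assumed EWMA weight given to each new latency sample


@dataclass
class LatencyAwareSelector:
    """Hypothetical per-flowlet path selector driven by smoothed path latency."""
    num_paths: int
    latency: list = field(default_factory=list)     # smoothed latency per path
    flow_state: dict = field(default_factory=dict)  # flow_id -> (path, last send time)

    def __post_init__(self):
        # Start every path with a zero latency estimate.
        self.latency = [0.0] * self.num_paths

    def record_latency(self, path, sample):
        # Fold a new latency sample into the path's running estimate.
        self.latency[path] = (1 - ALPHA) * self.latency[path] + ALPHA * sample

    def select_path(self, flow_id, now):
        state = self.flow_state.get(flow_id)
        if state is not None and now - state[1] < FLOWLET_GAP:
            # Same flowlet: stay on the current path to avoid reordering.
            path = state[0]
        else:
            # New flowlet: pick the path with the lowest latency estimate.
            path = min(range(self.num_paths), key=lambda p: self.latency[p])
        self.flow_state[flow_id] = (path, now)
        return path
```

Switching paths only at flowlet boundaries is the design choice that lets an end host react to congestion without reordering packets within a burst; the latency estimate here stands in for whatever congestion signal a real scheme would measure.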
Pages: 133-145
Number of pages: 13
Related papers
27 in total
[1]  
Al-Fares Mohammad, 2010, Hedera: dynamic flow scheduling for data center networks (NSDI'10)
[2]   CONGA: Distributed Congestion-Aware Load Balancing for Datacenters [J].
Alizadeh, Mohammad ;
Edsall, Tom ;
Dharmapurikar, Sarang ;
Vaidyanathan, Ramanan ;
Chu, Kevin ;
Fingerhut, Andy ;
Vinh The Lam ;
Matus, Francis ;
Pan, Rong ;
Yadav, Navindra ;
Varghese, George .
ACM SIGCOMM COMPUTER COMMUNICATION REVIEW, 2014, 44 (04) :503-514
[3]  
Alizadeh Mohammad, 2010, P ACM SIGCOMM 2010 C, V40, P63, DOI [DOI 10.1145/1851182, DOI 10.1145/1851275.1851192]
[4]  
Benson T., 2010, Proceedings of the 10th annual conference on Internet measurement - IMC '10, P267, DOI [DOI 10.1145/1879141.1879175, 10.1145/1879141.1879175]
[5]  
Benson Theophilus, 2011, P 7 C EM NETW EXP TE, DOI 10.1145/2079296.2079304
[6]   Per-packet Load-balanced, Low-Latency Routing for Clos-based Data Center Networks [J].
Cao, Jiaxin ;
Xia, Rui ;
Yang, Pengkun ;
Guo, Chuanxiong ;
Lu, Guohan ;
Yuan, Lihua ;
Zheng, Yixin ;
Wu, Haitao ;
Xiong, Yongqiang ;
Maltz, Dave .
PROCEEDINGS OF THE 2013 ACM INTERNATIONAL CONFERENCE ON EMERGING NETWORKING EXPERIMENTS AND TECHNOLOGIES (CONEXT '13), 2013, :49-60
[7]  
Cisco systems, TREX CISC REAL TRAFF
[8]   Understanding Network Failures in Data Centers: Measurement, Analysis, and Implications [J].
Gill, Phillipa ;
Jain, Navendu ;
Nagappan, Nachiappan .
ACM SIGCOMM COMPUTER COMMUNICATION REVIEW, 2011, 41 (04) :350-361
[9]   VL2: A Scalable and Flexible Data Center Network [J].
Greenberg, Albert ;
Hamilton, James R. ;
Jain, Navendu ;
Kandula, Srikanth ;
Kim, Changhoon ;
Lahiri, Parantap ;
Maltz, David A. ;
Patel, Parveen ;
Sengupta, Sudipta .
ACM SIGCOMM COMPUTER COMMUNICATION REVIEW, 2009, 39 (04) :51-62
[10]  
Guo C., PINGMESH LARGE SCALE, P139