Predictive Queue-Based Rate Control for Low Latency in Lossless Data Center Networks

Cited by: 2
Authors
Dong, Pingping [1 ]
Lu, Xiaojuan [1 ]
Huang, Tairan [1 ]
Chen, Liying [1 ]
Yang, Yang [1 ]
Zhang, Lianming [1 ]
Affiliations
[1] Hunan Normal Univ, Coll Informat Sci & Engn, Changsha 410081, Peoples R China
Source
IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT | 2024, Vol. 21, No. 3
Keywords
Throughput; Data centers; Switches; Delays; Topology; Packet loss; Low latency communication; Lossless data center network; congestion control; priority-based flow control (PFC); egress queue
DOI
10.1109/TNSM.2024.3363463
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
In lossless data center networks (DCNs), many existing congestion control schemes aim to mitigate the side effects of priority-based flow control (PFC), such as congestion spreading and victim flows. However, in some special cases these problems remain unsolved. We examine the interaction between flow control and congestion control and find that the mismatch between hop-by-hop flow control and end-to-end congestion feedback, together with inaccurate rate regulation, is the root cause of frequent PFC triggering. We therefore propose Egress Queue Congestion Information Notification (EQCIN). EQCIN uses threshold-based flow identification so that packet buildup caused by congestion spreading is not mistaken for the root cause of congestion, and it uses direct feedback from the congested side to reduce unnecessary link loss. EQCIN applies a different algorithm to each flow identifier to achieve targeted rate control. Experimental results show that, compared with TIMELY, DCQCN, and DCQCN+TCD, EQCIN reduces the number of PFC PAUSEs to nearly zero and improves link utilization by 7%-77%.
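The abstract does not give the details of EQCIN's algorithms, so the following is only a minimal illustrative sketch of what threshold-based flow identification followed by per-class rate control could look like. Everything in it is an assumption made for illustration and is not taken from the paper: the names `FlowClass`, `EgressPortState`, `classify`, and `next_rate`, the use of a paused-port flag to separate root congestion from spreading, and the multiplicative-decrease and additive-increase constants.

```python
from dataclasses import dataclass
from enum import Enum


class FlowClass(Enum):
    """Hypothetical flow identifiers (not the paper's terminology)."""
    UNCONGESTED = "uncongested"
    CONGESTED = "congested"  # queue builds up at the real bottleneck
    VICTIM = "victim"        # queue builds up only because the port is PAUSEd


@dataclass
class EgressPortState:
    queue_bytes: int  # current egress queue occupancy
    paused: bool      # True if a downstream PFC PAUSE is holding this port


def classify(port: EgressPortState, threshold_bytes: int) -> FlowClass:
    """Assumed rule: a queue above the threshold signals congestion, but if
    the port is paused the buildup is treated as congestion spreading."""
    if port.queue_bytes <= threshold_bytes:
        return FlowClass.UNCONGESTED
    return FlowClass.VICTIM if port.paused else FlowClass.CONGESTED


def next_rate(rate_gbps: float, flow_class: FlowClass,
              line_rate_gbps: float = 100.0,
              decrease: float = 0.5, step_gbps: float = 1.0) -> float:
    """Assumed per-class rate control: cut congested flows, hold victim
    flows, and let uncongested flows probe upward to the line rate."""
    if flow_class is FlowClass.CONGESTED:
        return rate_gbps * decrease
    if flow_class is FlowClass.VICTIM:
        return rate_gbps
    return min(line_rate_gbps, rate_gbps + step_gbps)
```

In this sketch a flow on a paused port keeps its rate, since slowing it down would not relieve the actual bottleneck; that choice is one plausible reading of "targeted rate control", not the authors' stated design.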
Pages: 3428-3439
Page count: 12