Predictive Queue-Based Rate Control for Low Latency in Lossless Data Center Networks

Cited by: 0
Authors
Dong, Pingping [1]
Lu, Xiaojuan [1]
Huang, Tairan [1]
Chen, Liying [1]
Yang, Yang [1]
Zhang, Lianming [1]
Affiliations
[1] Hunan Normal Univ, Coll Informat Sci & Engn, Changsha 410081, Peoples R China
Source
IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT | 2024 / Vol. 21 / No. 03
Keywords
Throughput; Data centers; Switches; Delays; Topology; Packet loss; Low latency communication; Lossless data center network; congestion control; priority-based flow control (PFC); egress queue
DOI
10.1109/TNSM.2024.3363463
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In lossless data center networks (DCNs), many existing congestion control schemes aim to mitigate the side effects of priority-based flow control (PFC), such as congestion spreading and victim flows. However, in some special cases these problems remain unsolved. We examine the interaction between flow control and congestion control and find that the mismatch between hop-by-hop flow control and end-to-end congestion feedback, and inaccurate rate regulation, are the root causes of frequent PFC triggering. We therefore propose Egress Queue Congestion Information Notification (EQCIN). EQCIN uses threshold-based flow identification so that packet buildup caused by congestion spreading is not mistaken for the root cause of congestion, and it uses direct feedback from the congestion side to reduce unnecessary link loss. For different flow identifiers, EQCIN applies different algorithms to achieve targeted rate control. Experimental results show that EQCIN reduces the number of PFC PAUSEs to nearly zero and, compared with TIMELY, DCQCN, and DCQCN+TCD, improves link utilization by 7%-77%.
Pages: 3428-3439
Page count: 12
Related papers
30 in total
  • [1] Data Center TCP (DCTCP)
    Alizadeh, Mohammad
    Greenberg, Albert
    Maltz, David A.
    Padhye, Jitendra
    Patel, Parveen
    Prabhakar, Balaji
    Sengupta, Sudipta
    Sridharan, Murari
    [J]. ACM SIGCOMM COMPUTER COMMUNICATION REVIEW, 2010, 40 (04) : 63 - 74
  • [2] Data Center Transport Mechanisms: Congestion Control Theory and IEEE Standardization
    Alizadeh, Mohammad
    Atikoglu, Berk
    Kabbani, Abdul
    Lakshmikantha, Ashvin
    Pan, Rong
    Prabhakar, Balaji
    Seaman, Mick
    [J]. 2008 46TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING, VOLS 1-3, 2008, : 1270 - +
  • [3] Learned Load Balancing
    Chang, Brian
    Subramanian, Kausik
    D'Antoni, Loris
    Akella, Aditya
    [J]. PROCEEDINGS OF THE 24TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING AND NETWORKING, ICDCN 2023, 2023, : 177 - 187
  • [4] MP-RDMA: Enabling RDMA With Multi-Path Transport in Datacenters
    Chen, Guo
    Lu, Yuanwei
    Li, Bojie
    Tan, Kun
    Xiong, Yongqiang
    Cheng, Peng
    Zhang, Jiansong
    Moscibroda, Thomas
    [J]. IEEE-ACM TRANSACTIONS ON NETWORKING, 2019, 27 (06) : 2308 - 2323
  • [5] Cheng WX, 2020, PROCEEDINGS OF THE 17TH USENIX SYMPOSIUM ON NETWORKED SYSTEMS DESIGN AND IMPLEMENTATION, P19
  • [6] Credit-Scheduled Delay-Bounded Congestion Control for Datacenters
    Cho, Inho
    Jang, Keon
    Han, Dongsu
    [J]. SIGCOMM '17: PROCEEDINGS OF THE 2017 CONFERENCE OF THE ACM SPECIAL INTEREST GROUP ON DATA COMMUNICATION, 2017, : 239 - 252
  • [7] Cui ZG, 2020, ASIA-PAC NETW OPER M, P385, DOI 10.23919/APNOMS50412.2020.9236778
  • [8] Dong P., 2023, Proc IEEE ISPA, P1
  • [9] Goyal P, 2022, PROCEEDINGS OF THE 19TH USENIX SYMPOSIUM ON NETWORKED SYSTEMS DESIGN AND IMPLEMENTATION (NSDI '22), P779
  • [10] Re-architecting datacenter networks and stacks for low latency and high performance
    Handley, Mark
    Raiciu, Costin
    Agache, Alexandru
    Voinescu, Andrei
    Moore, Andrew W.
    Antichi, Gianni
    Wojcik, Marcin
    [J]. SIGCOMM '17: PROCEEDINGS OF THE 2017 CONFERENCE OF THE ACM SPECIAL INTEREST GROUP ON DATA COMMUNICATION, 2017, : 29 - 42