DRL-TAL: Deep Reinforcement Learning-Based Traffic-Aware Load Balancing in Data Center Networks

Cited: 1
Authors
Jiang, Guoyong [1 ]
Wei, Wenting [1 ]
Wang, Kun [2 ]
Pang, Chengding [1 ]
Liu, Yong [3 ]
Affiliations
[1] Xidian Univ, State Key Lab Integrated Serv Networks, Xian, Peoples R China
[2] Xidian Univ, Sch Comp Sci & Technol, Xian, Peoples R China
[3] Hangzhou Normal Univ, Sch Informat Sci & Technol, Hangzhou, Peoples R China
Source
IEEE Conference on Global Communications (GLOBECOM), 2023
Funding
National Natural Science Foundation of China; National Key R&D Program of China
Keywords
Data center networks; load balancing; DDPG;
DOI
10.1109/GLOBECOM54140.2023.10437481
CLC Classification Number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline Code
0808; 0809
Abstract
Load balancing in data center networks is crucial for effectively utilizing network resources and enhancing Quality of Service (QoS). In particular, flowlet-level load balancing has proven effective at reducing latency and increasing throughput simultaneously. However, most existing work relies on an empirically chosen static timeout and suffers performance degradation in dynamic network scenarios, owing to the mismatch between the static timeout and changing traffic conditions. To address this problem, we propose DRL-TAL, a Deep Reinforcement Learning-Based Traffic-Aware Load Balancing scheme that uses deep reinforcement learning (DRL) to update the flowlet timeout adaptively. An agent based on the deep deterministic policy gradient (DDPG) algorithm continuously senses network throughput and dynamically generates the timeout threshold for the next time slot. Elephant flows are balanced at flowlet granularity, with the timeout value set by the agent's threshold, to trade off throughput against packet reordering. Mice flows are forwarded at packet granularity through the port with the smallest queue length to ensure a shorter flow completion time. The results demonstrate that DRL-TAL performs impressively well in a symmetric topology, with no packet loss and minimal reordering under high load compared to state-of-the-art schemes. Moreover, it reduces flow completion time by up to 45% compared to CONGA in an asymmetric topology.
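The data-plane logic the abstract describes can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the class name, the byte threshold separating mice from elephant flows, and the default timeout are all hypothetical, and the DDPG agent is reduced to an external caller of `set_timeout`.

```python
class FlowletBalancer:
    """Sketch of flowlet-level balancing with an adaptive timeout.

    Elephant flows are split into flowlets: a new flowlet (and a fresh
    port choice) starts whenever the inter-packet gap exceeds `timeout`,
    which a DRL agent would update each time slot. Mice flows are routed
    per packet to the port with the shortest queue.
    """

    def __init__(self, ports, timeout=0.0005, elephant_bytes=100_000):
        self.ports = ports                    # list of egress port ids
        self.queue_len = {p: 0 for p in ports}
        self.timeout = timeout                # flowlet gap threshold (seconds)
        self.elephant_bytes = elephant_bytes  # hypothetical mice/elephant cutoff
        self.flow_state = {}                  # flow_id -> (last_ts, port, bytes)

    def set_timeout(self, t):
        # Called by the (hypothetical) DDPG agent once per time slot.
        self.timeout = t

    def route(self, flow_id, pkt_bytes, now):
        last_ts, port, total = self.flow_state.get(flow_id, (None, None, 0))
        total += pkt_bytes
        if total < self.elephant_bytes:
            # Mice flow: per-packet forwarding via the shortest-queue port.
            port = min(self.ports, key=self.queue_len.get)
        elif last_ts is None or now - last_ts > self.timeout:
            # Gap exceeds the timeout: start a new flowlet on the
            # least-loaded port.
            port = min(self.ports, key=self.queue_len.get)
        # Otherwise the packet stays in the current flowlet and keeps
        # its port, which limits packet reordering.
        self.flow_state[flow_id] = (now, port, total)
        self.queue_len[port] += 1
        return port
```

A larger timeout keeps elephant flows pinned to one path (less reordering, coarser balancing); a smaller one reroutes more often (finer balancing, more reordering) — which is exactly the trade-off the DDPG agent tunes against observed throughput.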
Pages: 928 - 933 (6 pages)
Cited References
14 total
[1] Al-Fares M., NSDI, 2010.
[2] Alizadeh M., ACM SIGCOMM Computer Communication Review, 2014, 44:503. DOI: 10.1145/2619239.2626316; 10.1145/2740070.2626316.
[3] Chen L., Lingys J., Chen K., Liu F., "AuTO: Scaling Deep Reinforcement Learning for Datacenter-Scale Automatic Traffic Optimization," Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication (SIGCOMM '18), 2018, pp. 191-205.
[4] Curtis A. R., IEEE INFOCOM, 2011, p. 1629. DOI: 10.1109/INFCOM.2011.5934956.
[5] Diel G., Miers C. C., Pillon M. A., Koslovski G. P., "Data classification and reinforcement learning to avoid congestion on SDN-based data centers," 2022 IEEE Global Communications Conference (GLOBECOM 2022), 2022, pp. 2861-2866.
[6] Farrington N., Porter G., Radhakrishnan S., Bazzaz H. H., Subramanya V., Fainman Y., Papen G., Vahdat A., "Helios: A Hybrid Electrical/Optical Switch Architecture for Modular Data Centers," ACM SIGCOMM Computer Communication Review, 2010, 40(4):339-350.
[7] He K., Rozner E., Agarwal K., Felter W., Carter J., Akella A., "Presto: Edge-based Load Balancing for Fast Datacenter Networks," ACM SIGCOMM Computer Communication Review, 2015, 45(4):465-478.
[8] Hu J., Huang J., Li Z., Wang J., He T., "A Receiver-Driven Transport Protocol With High Link Utilization Using Anti-ECN Marking in Data Center Networks," IEEE Transactions on Network and Service Management, 2023, 20(2):1898-1912.
[9] Hu J., Huang J., Lyu W., Li W., Li Z., Jiang W., Wang J., He T., "Adjusting Switching Granularity of Load Balancing for Heterogeneous Datacenter Traffic," IEEE/ACM Transactions on Networking, 2021, 29(5):2367-2384.
[10] Lillicrap T. P., Proceedings of the 4th International Conference on Learning Representations, 2016, p. 1.