RAN Information-Assisted TCP Congestion Control Using Deep Reinforcement Learning With Reward Redistribution

Cited by: 5

Authors:

Chen, Minghao ^{[1]}

Li, Rongpeng ^{[1]}

Crowcroft, Jon ^{[2]}

Wu, Jianjun ^{[3]}

Zhao, Zhifeng ^{[4]}

Zhang, Honggang ^{[1]}

Affiliations:

[1] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou 310027, Peoples R China

[2] Univ Cambridge, Dept Comp Sci, Cambridge CB2 1TN, England

[3] Huawei Technol Co Ltd, Shanghai 201206, Peoples R China

[4] Zhejiang Lab, Hangzhou 311121, Peoples R China

Source:

IEEE TRANSACTIONS ON COMMUNICATIONS | 2022 / Vol. 70 / Issue 01

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China;

Keywords:

Reinforcement learning; Servers; Internet; Throughput; Radio access networks; Bandwidth; 5G mobile communication; Deep reinforcement learning; congestion control; radio access network; reward redistribution; delayed feedback;

DOI:

10.1109/TCOMM.2021.3123130

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

In this paper, we propose a novel transmission control protocol (TCP) congestion control method from a cross-layer perspective and present a deep reinforcement learning (DRL)-driven method called DRL-3R (DRL for congestion control with Radio access network information and Reward Redistribution) to learn the TCP congestion control policy more effectively. In particular, we incorporate radio access network (RAN) information to capture the dynamics of the RAN in a timely manner, and enable DRL to learn from delayed RAN information feedback that may be induced by several consecutive actions. Meanwhile, we relax the implicit assumption in previous research (that the feedback for a specific action returns one round-trip time (RTT) after the action is applied) by redistributing the rewards and evaluating the merits of actions more accurately. Experimental results show that, while maintaining reasonable fairness, DRL-3R significantly outperforms classical congestion control methods (e.g., TCP Reno, Westwood, Cubic, BBR and DRL-CC) on network utility by achieving higher throughput while reducing delay in various network environments.
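The abstract does not state how rewards are redistributed, so the following is only an illustrative sketch of the general idea of crediting one delayed feedback signal to the several consecutive actions that may have caused it. The function name `redistribute_rewards`, the per-step `windows` input, and the uniform split are all assumptions made for illustration; they are not the DRL-3R rule.

```python
from typing import List


def redistribute_rewards(rewards: List[float], windows: List[int]) -> List[float]:
    """Spread each observed reward evenly over the actions that may have caused it.

    rewards[t] is the feedback observed at step t.
    windows[t] is the ASSUMED number of consecutive actions, ending at step t,
    that contributed to that feedback (hypothetical input, not from the paper).
    """
    credited = [0.0] * len(rewards)
    for t, (reward, window) in enumerate(zip(rewards, windows)):
        # Clamp the window so it never reaches before the first action.
        window = max(1, min(window, t + 1))
        share = reward / window  # uniform split: an assumption
        for i in range(t - window + 1, t + 1):
            credited[i] += share
    return credited


if __name__ == "__main__":
    # Feedback arrives only at step 2 and is attributed to the last 3 actions.
    print(redistribute_rewards([0.0, 0.0, 3.0], [1, 1, 3]))
```

With a window of 1 at every step this reduces to the conventional one-action, one-reward assumption; larger windows move credit back to earlier actions while keeping the total reward unchanged.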

Pages: 215-230

Page count: 16

