Experience-Driven Congestion Control: When Multi-Path TCP Meets Deep Reinforcement Learning

Cited by: 130

Authors:

Xu, Zhiyuan ^{[1]}

Tang, Jian ^{[1]}

Yin, Chengxiang ^{[1]}

Wang, Yanzhi ^{[2]}

Xue, Guoliang ^{[3]}

Affiliations:

[1] Syracuse Univ, Dept Elect Engn & Comp Sci, Syracuse, NY 13244 USA

[2] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA

[3] Arizona State Univ, Ira A Fulton Sch Engn, Tempe, AZ 85287 USA

Source:

IEEE Journal on Selected Areas in Communications | 2019, Vol. 37, No. 6

Funding:

National Science Foundation (USA)

Keywords:

AI; deep learning; experience-driven control; congestion control; TCP; multi-path TCP; ALGORITHM;

DOI:

10.1109/JSAC.2019.2904358

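Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes: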

0808 ; 0809 ;

Abstract:

In this paper, we aim to study networking problems from a whole new perspective by leveraging emerging deep learning, to develop an experience-driven approach, which enables a network or a protocol to learn the best way to control itself from its own experience (e.g., runtime statistics data), just as a human learns a skill. We present design, implementation and evaluation of a deep reinforcement learning (DRL)-based control framework, DRL-CC (DRL for Congestion Control), which realizes our experience-driven design philosophy on multi-path TCP (MPTCP) congestion control. DRL-CC utilizes a single (instead of multiple independent) agent to dynamically and jointly perform congestion control for all active MPTCP flows on an end host with the objective of maximizing the overall utility. The novelty of our design is to utilize a flexible recurrent neural network, LSTM, under a DRL framework for learning a representation for all active flows and dealing with their dynamics. Moreover, we, for the first time, integrate the above LSTM-based representation network into an actor-critic framework for continuous (congestion) control, which leverages the emerging deterministic policy gradient to train critic, actor, and LSTM networks in an end-to-end manner. We implemented DRL-CC based on the MPTCP implementation in the Linux kernel. The experimental results show that 1) DRL-CC consistently and significantly outperforms a few well-known MPTCP congestion control algorithms in terms of goodput without sacrificing fairness, 2) it is flexible and robust to highly-dynamic network environments with time-varying flows, and 3) it is friendly to regular TCP.
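The abstract's point about using an LSTM to learn a representation for a changing set of active flows can be illustrated with a small sketch. The code below is not the authors' implementation: it is a minimal pure-Python LSTM cell with random, untrained weights, and the class name `TinyLSTM`, the three per-flow features, and the hidden size are all hypothetical choices made for illustration. It only shows why a recurrent encoder suits time-varying flows: however many flows are fed in, the output has a fixed size that a downstream actor or critic could consume.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Single-layer LSTM with random, untrained weights (illustration only)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        n_in = input_size + hidden_size
        # One weight matrix and bias vector per gate:
        # i = input, f = forget, o = output, c = candidate cell state.
        self.w = {g: [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                      for _ in range(hidden_size)] for g in "ifoc"}
        self.b = {g: [0.0] * hidden_size for g in "ifoc"}

    def _gate(self, g, xh, act):
        return [act(sum(w * v for w, v in zip(row, xh)) + b)
                for row, b in zip(self.w[g], self.b[g])]

    def step(self, x, h, c):
        """Consume one flow's feature vector and update (h, c)."""
        xh = list(x) + list(h)
        i = self._gate("i", xh, sigmoid)
        f = self._gate("f", xh, sigmoid)
        o = self._gate("o", xh, sigmoid)
        g = self._gate("c", xh, math.tanh)
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def encode(self, flow_states):
        """Fold any number of per-flow vectors into one fixed-size vector."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in flow_states:
            h, c = self.step(x, h, c)
        return h


# Hypothetical normalized per-flow features (assumed, not from the paper).
lstm = TinyLSTM(input_size=3, hidden_size=4)
two_flows = [[0.8, 0.1, 0.5], [0.3, 0.4, 0.2]]
three_flows = two_flows + [[0.6, 0.2, 0.9]]
rep2 = lstm.encode(two_flows)
rep3 = lstm.encode(three_flows)
print(len(rep2), len(rep3))
```

In a trained system the weights would be learned jointly with the actor and critic, as the abstract describes; here they are fixed random values, so only the shapes are meaningful.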


Pages: 1325-1336

Page count: 12

References: 41 in total

