Experience-driven Networking: A Deep Reinforcement Learning based Approach

Cited by: 0
Authors
Xu, Zhiyuan [1 ]
Tang, Jian [1 ]
Meng, Jingsong [1 ]
Zhang, Weiyi [2 ]
Wang, Yanzhi [1 ]
Liu, Chi Harold [3 ]
Yang, Dejun [4 ]
Affiliations
[1] Syracuse Univ, Dept Elect Engn & Comp Sci, Syracuse, NY 13244 USA
[2] AT&T Labs Res, Middletown, NJ 07748 USA
[3] Beijing Inst Technol, Beijing 100081, Peoples R China
[4] Colorado Sch Mines, Dept Elect Engn & Comp Sci, Golden, CO 80401 USA
Source
IEEE Conference on Computer Communications (IEEE INFOCOM 2018) | 2018
Funding
U.S. National Science Foundation;
Keywords
Experience-driven Networking; Deep Reinforcement Learning; Traffic Engineering; Congestion Control; Stability
DOI
Not available
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Modern communication networks have become very complicated and highly dynamic, which makes them hard to model, predict and control. In this paper, we develop a novel experience-driven approach that can learn to control a communication network well from its own experience rather than from an accurate mathematical model, just as a human learns a new skill (such as driving or swimming). Specifically, we, for the first time, propose to leverage emerging Deep Reinforcement Learning (DRL) to enable model-free control in communication networks, and present a novel and highly effective DRL-based control framework, DRL-TE, for a fundamental networking problem: Traffic Engineering (TE). The proposed framework maximizes a widely-used utility function by jointly learning the network environment and its dynamics, and making decisions under the guidance of powerful Deep Neural Networks (DNNs). We propose two new techniques, TE-aware exploration and actor-critic-based prioritized experience replay, to optimize the general DRL framework particularly for TE. To validate and evaluate the proposed framework, we implemented it in ns-3 and tested it comprehensively with both representative and randomly generated network topologies. Extensive packet-level simulation results show that 1) compared to several widely-used baseline methods, DRL-TE significantly reduces end-to-end delay and consistently improves the network utility, while offering better or comparable throughput; 2) DRL-TE is robust to network changes; and 3) DRL-TE consistently outperforms a state-of-the-art DRL method for continuous control, Deep Deterministic Policy Gradient (DDPG), which does not offer satisfactory performance.
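The abstract names actor-critic-based prioritized experience replay as one of the two proposed techniques. The Python sketch below is only an illustration of the general idea behind prioritized replay (sampling stored transitions with probability proportional to the magnitude of the critic's TD error); it is not the authors' DRL-TE implementation, and all names (PrioritizedReplayBuffer, add, sample, update_priorities) and parameter choices (alpha, eps) are assumptions made here for illustration.

import numpy as np

class PrioritizedReplayBuffer:
    """Illustrative replay buffer whose sampling is skewed toward transitions
    with large |TD error|, as reported by a critic (hypothetical interface)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity      # maximum number of stored transitions
        self.alpha = alpha            # how strongly priorities skew sampling
        self.eps = eps                # keeps every priority strictly positive
        self.data = []                # stored (state, action, reward, next_state) tuples
        self.priorities = []          # one priority per stored transition
        self.pos = 0                  # next overwrite position (circular buffer)

    def add(self, transition, td_error):
        # Priority grows with the magnitude of the critic's TD error.
        priority = (abs(td_error) + self.eps) ** self.alpha
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = priority
            self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        # Sample indices with probability proportional to stored priorities.
        probs = np.asarray(self.priorities, dtype=float)
        probs /= probs.sum()
        idx = np.random.choice(len(self.data), size=batch_size, p=probs)
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, idx, td_errors):
        # Refresh priorities after the critic re-evaluates the sampled batch.
        for i, err in zip(idx, td_errors):
            self.priorities[i] = (abs(err) + self.eps) ** self.alpha

In an actor-critic loop, the critic's TD errors for newly observed and re-sampled transitions would drive add() and update_priorities(), while sample() supplies the minibatches used to update both networks; the details of how DRL-TE does this are in the paper itself.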
Pages: 1880-1888
Number of pages: 9