Experience-driven Networking: A Deep Reinforcement Learning based Approach

Cited by: 0
Authors
Xu, Zhiyuan [1 ]
Tang, Jian [1 ]
Meng, Jingsong [1 ]
Zhang, Weiyi [2 ]
Wang, Yanzhi [1 ]
Liu, Chi Harold [3 ]
Yang, Dejun [4 ]
Affiliations
[1] Syracuse Univ, Dept Elect Engn & Comp Sci, Syracuse, NY 13244 USA
[2] AT&T Labs Res, Middletown, NJ 07748 USA
[3] Beijing Inst Technol, Beijing 100081, Peoples R China
[4] Colorado Sch Mines, Dept Elect Engn & Comp Sci, Golden, CO 80401 USA
Source
IEEE Conference on Computer Communications (IEEE INFOCOM 2018) | 2018
Funding
U.S. National Science Foundation;
Keywords
Experience-driven Networking; Deep Reinforcement Learning; Traffic Engineering; Congestion Control; Stability
DOI
Not available
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Modern communication networks have become very complicated and highly dynamic, which makes them hard to model, predict and control. In this paper, we develop a novel experience-driven approach that can learn to control a communication network well from its own experience rather than from an accurate mathematical model, just as a human learns a new skill (such as driving or swimming). Specifically, we, for the first time, propose to leverage emerging Deep Reinforcement Learning (DRL) to enable model-free control in communication networks, and present a novel and highly effective DRL-based control framework, DRL-TE, for a fundamental networking problem: Traffic Engineering (TE). The proposed framework maximizes a widely-used utility function by jointly learning the network environment and its dynamics, and making decisions under the guidance of powerful Deep Neural Networks (DNNs). We propose two new techniques, TE-aware exploration and actor-critic-based prioritized experience replay, to optimize the general DRL framework particularly for TE. To validate and evaluate the proposed framework, we implemented it in ns-3 and tested it comprehensively with both representative and randomly generated network topologies. Extensive packet-level simulation results show that 1) compared to several widely-used baseline methods, DRL-TE significantly reduces end-to-end delay and consistently improves the network utility, while offering better or comparable throughput; 2) DRL-TE is robust to network changes; and 3) DRL-TE consistently outperforms a state-of-the-art DRL method for continuous control, Deep Deterministic Policy Gradient (DDPG), which does not offer satisfactory performance.
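The abstract names actor-critic-based prioritized experience replay as one of the two proposed techniques. The Python sketch below is only an illustration of the general idea behind prioritized replay (sampling stored transitions with probability proportional to the magnitude of the critic's TD error); it is not the authors' DRL-TE implementation, and all names (PrioritizedReplayBuffer, add, sample, update_priorities) and parameter choices (alpha, eps) are assumptions made here for illustration.

import numpy as np

class PrioritizedReplayBuffer:
    """Illustrative replay buffer whose sampling is skewed toward transitions
    with large |TD error|, as reported by a critic (hypothetical interface)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity      # maximum number of stored transitions
        self.alpha = alpha            # how strongly priorities skew sampling
        self.eps = eps                # keeps every priority strictly positive
        self.data = []                # stored (state, action, reward, next_state) tuples
        self.priorities = []          # one priority per stored transition
        self.pos = 0                  # next overwrite position (circular buffer)

    def add(self, transition, td_error):
        # Priority grows with the magnitude of the critic's TD error.
        priority = (abs(td_error) + self.eps) ** self.alpha
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = priority
            self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        # Sample indices with probability proportional to stored priorities.
        probs = np.asarray(self.priorities, dtype=float)
        probs /= probs.sum()
        idx = np.random.choice(len(self.data), size=batch_size, p=probs)
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, idx, td_errors):
        # Refresh priorities after the critic re-evaluates the sampled batch.
        for i, err in zip(idx, td_errors):
            self.priorities[i] = (abs(err) + self.eps) ** self.alpha

In an actor-critic loop, the critic's TD errors for newly observed and re-sampled transitions would drive add() and update_priorities(), while sample() supplies the minibatches used to update both networks; the details of how DRL-TE does this are in the paper itself.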
Pages: 1880-1888
Number of pages: 9