An Actor-Critic-Based Transfer Learning Framework for Experience-Driven Networking

Cited by: 24
Authors
Xu, Zhiyuan [1 ]
Yang, Dejun [2 ]
Tang, Jian [1 ]
Tang, Yinan [3 ]
Yuan, Tongtong [3 ]
Wang, Yanzhi [4 ]
Xue, Guoliang [5 ]
Affiliations
[1] Syracuse Univ, Dept Elect Engn & Comp Sci, Syracuse, NY 13244 USA
[2] Colorado Sch Mines, Dept Comp Sci, Golden, CO 80401 USA
[3] Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing 100876, Peoples R China
[4] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA
[5] Arizona State Univ, Sch Comp Informat & Decis Syst Engn, Tempe, AZ 85287 USA
Funding
US National Science Foundation;
Keywords
Training; Network topology; Topology; Routing; Knowledge engineering; Throughput; Mathematical model; Experience-driven networking; deep reinforcement learning and transfer learning; REINFORCEMENT; ACCESS;
DOI
10.1109/TNET.2020.3037231
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline code
0812;
Abstract
Experience-driven networking has emerged as a new and highly effective approach for resource allocation in complex communication networks. Deep Reinforcement Learning (DRL) has been shown to be a useful technique for enabling experience-driven networking. In this paper, we focus on a practical and fundamental problem for experience-driven networking: when network configurations are changed, how to train a new DRL agent to adapt to the new environment effectively and quickly. We present an Actor-Critic-based Transfer learning framework for the Traffic Engineering (TE) problem using policy distillation, which we call ACT-TE. ACT-TE effectively and quickly trains a new DRL agent to solve the TE problem in a new network environment, using both old knowledge (i.e., distilled from the existing agent) and new experience (i.e., newly collected samples). We implement ACT-TE in ns-3 and compare it with commonly-used baselines using packet-level simulations on three representative network topologies: NSFNET, ARPANET, and a random topology. The extensive simulation results show that 1) existing well-trained DRL agents do not work well in new network environments; and 2) ACT-TE significantly outperforms two straightforward methods (training from scratch and fine-tuning an existing DRL agent) as well as several widely-used traditional methods in terms of network utility, throughput, and delay.
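To make the transfer mechanism described in the abstract more concrete, below is a minimal sketch of an actor-critic update augmented with a policy-distillation term, assuming a discrete action space and a PyTorch implementation. The network architecture, the loss weights (including the mixing coefficient beta), and all function names are illustrative assumptions, not the authors' actual ACT-TE implementation.

# Hypothetical sketch: combining an actor-critic update on newly collected
# samples with a policy-distillation term against a frozen "old" agent.
# Names, shapes, and the weight beta are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyNet(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.actor = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim))
        self.critic = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, state):
        return self.actor(state), self.critic(state)

def transfer_ac_loss(new_agent, old_agent, states, actions, returns, beta=0.5):
    """Actor-critic loss on new experience plus a distillation term that
    keeps the new policy close to the old (teacher) policy."""
    logits, values = new_agent(states)
    log_probs = F.log_softmax(logits, dim=-1)
    advantages = returns - values.squeeze(-1).detach()

    # Standard actor-critic terms computed on newly collected samples.
    policy_loss = -(log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
                    * advantages).mean()
    value_loss = F.mse_loss(values.squeeze(-1), returns)

    # Policy distillation: KL(old || new) transfers knowledge from the
    # agent trained for the previous network configuration.
    with torch.no_grad():
        old_logits, _ = old_agent(states)
    distill_loss = F.kl_div(log_probs, F.softmax(old_logits, dim=-1),
                            reduction='batchmean')

    return policy_loss + 0.5 * value_loss + beta * distill_loss

The intent is only to show how distilled old knowledge (the KL term) and new experience (the policy-gradient and value terms) can be combined in a single loss, as the abstract outlines.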
Pages: 360-371
Page count: 12