Delay-Optimal Traffic Engineering through Multi-agent Reinforcement Learning

Cited by: 7
Authors
Pinyoanuntapong, Pinyarash [1 ]
Lee, Minwoo [1 ]
Wang, Pu [1 ]
Affiliation
[1] Univ North Carolina Charlotte, Dept Comp Sci, Charlotte, NC 28223 USA
Source
IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS (IEEE INFOCOM 2019 WKSHPS) | 2019
Funding
U.S. National Science Foundation
DOI
10.1109/infcomw.2019.8845154
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]
Discipline codes
0808; 0809
Abstract
Traffic engineering (TE) is one of the most important methods of optimizing network performance: it designs forwarding and routing rules that meet the quality-of-service (QoS) requirements of large volumes of traffic flows. End-to-end (E2E) delay is one of the key TE metrics. Optimizing E2E delay, however, is very challenging in large-scale multi-hop networks because of substantial network uncertainties and dynamics. This paper proposes a model-free TE framework that adopts multi-agent reinforcement learning for distributed control to minimize E2E delay. In particular, distributed TE is formulated as a multi-agent extension of the Markov decision process (MA-MDP). To solve this problem, a modular and composable learning framework is proposed, consisting of three interleaved modules: policy evaluation, policy improvement, and policy execution. Each module can be implemented with different algorithms and their extensions. Simulation results show that combining several extensions, such as double learning, expected policy evaluation, and on-policy learning, yields superior E2E delay performance under high traffic loads.
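To make the framework concrete, below is a minimal, hypothetical Python sketch (not the authors' code) of one per-node agent that composes the three modules named in the abstract, using double learning and an expected, on-policy backup as the evaluation rule. It assumes the state is the packet's destination, the action is the choice of next-hop neighbor, and the reward is the negative per-hop delay, so maximizing return minimizes E2E delay; all class and method names are illustrative.

    import random
    from collections import defaultdict

    class NodeAgent:
        """One distributed TE agent per node (hypothetical sketch)."""

        def __init__(self, neighbors, alpha=0.1, gamma=1.0, epsilon=0.05):
            self.neighbors = list(neighbors)  # candidate next hops = actions
            self.alpha, self.gamma, self.eps = alpha, gamma, epsilon
            # Double learning: two independent value tables reduce the
            # maximization bias of a single estimator.
            self.q_a = defaultdict(lambda: defaultdict(float))
            self.q_b = defaultdict(lambda: defaultdict(float))

        def _q(self, dest, nh):
            # Combined estimate used by the behavior policy.
            return self.q_a[dest][nh] + self.q_b[dest][nh]

        def policy(self, dest):
            # Policy improvement: epsilon-greedy distribution over next hops.
            probs = {nh: self.eps / len(self.neighbors) for nh in self.neighbors}
            best = max(self.neighbors, key=lambda nh: self._q(dest, nh))
            probs[best] += 1.0 - self.eps
            return probs

        def act(self, dest):
            # Policy execution: sample a next hop from the current policy.
            probs = self.policy(dest)
            return random.choices(list(probs), weights=list(probs.values()))[0]

        def update(self, dest, nh, hop_delay, next_agent, at_destination):
            # Policy evaluation: expected (on-policy) backup plus double
            # learning. Reward is -hop_delay, so maximizing the return
            # minimizes cumulative E2E delay.
            target = -hop_delay
            if random.random() < 0.5:
                q_upd, q_eval = self.q_a, self.q_b
            else:
                q_upd, q_eval = self.q_b, self.q_a
            if not at_destination:
                # Expected policy evaluation: average the downstream agent's
                # values under that agent's own action distribution.
                probs = next_agent.policy(dest)
                target += self.gamma * sum(p * q_eval[dest][a]
                                           for a, p in probs.items())
            q_upd[dest][nh] += self.alpha * (target - q_upd[dest][nh])

In a simulation, each node would call act() when forwarding a packet toward dest and call update() once the per-hop delay feedback from the chosen neighbor arrives; no global network model is needed, which matches the model-free, distributed setting described above.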
Pages: 435-442 (8 pages)