iRAF: A Deep Reinforcement Learning Approach for Collaborative Mobile Edge Computing IoT Networks

Times Cited: 175
Authors
Chen, Jienan [1 ]
Chen, Siyu [1 ]
Wang, Qi [1 ]
Cao, Bin [2 ]
Feng, Gang [1 ]
Hu, Jianhao [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Natl Key Lab Sci & Technol Commun, Chengdu 611731, Sichuan, Peoples R China
[2] Beijing Univ Posts & Telecommun, Inst Network Technol, Beijing 100876, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Collaborative mobile edge computing (CoMEC); deep reinforcement learning (DRL); intelligent resource allocation framework (iRAF); Internet of Things (IoT); Monte Carlo tree search (MCTS); MACHINE; GAME; GO;
DOI
10.1109/JIOT.2019.2913162
CLC Number
TP [automation technology, computer technology];
Discipline Code
0812;
Abstract
Recently, with the development of artificial intelligence (AI), data-driven AI methods have shown remarkable performance in solving complex problems, supporting an Internet of Things (IoT) world with massive resource-consuming and delay-sensitive services. In this paper, we propose an intelligent resource allocation framework (iRAF) to solve the complex resource allocation problem for the collaborative mobile edge computing (CoMEC) network. The core of iRAF is a multitask deep reinforcement learning algorithm that makes resource allocation decisions based on network states and task characteristics, such as the computing capability of edge servers and devices, communication channel quality, resource utilization, and the latency requirements of the services. The proposed iRAF automatically learns the network environment and generates resource allocation decisions that maximize performance in terms of latency and power consumption through self-play training. iRAF becomes its own teacher: a deep neural network (DNN) is trained to predict iRAF's resource allocation actions in a self-supervised manner, where the training data is generated from the search process of the Monte Carlo tree search (MCTS) algorithm. A major advantage of MCTS is that it simulates trajectories into the future, starting from a root state, to obtain the best action by evaluating the reward values. Numerical results show that our proposed iRAF achieves 59.27% and 51.71% improvements in service latency over the greedy-search and deep Q-learning-based methods, respectively.
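The MCTS decision process described in the abstract, simulating trajectories from a root state and backing up reward values to pick an action, can be illustrated with a minimal, self-contained sketch. This is not the paper's implementation: the two-server/three-task setup, the compute rates, the task loads, and the negative-makespan reward below are all hypothetical, and the DNN policy is replaced by plain random rollouts for brevity.

```python
import math
import random

# Hypothetical CoMEC toy setting: each of three tasks is assigned to one of
# two edge servers; the reward is the negative makespan (completion latency).
SERVER_RATES = [4.0, 2.0]     # assumed compute capacities of the servers
TASK_LOADS = [8.0, 4.0, 6.0]  # assumed workloads of the tasks

def reward(assignment):
    """Negative makespan of a complete task-to-server assignment."""
    loads = [0.0] * len(SERVER_RATES)
    for work, server in zip(TASK_LOADS, assignment):
        loads[server] += work / SERVER_RATES[server]
    return -max(loads)

class Node:
    def __init__(self, assignment):
        self.assignment = assignment  # actions (server choices) taken so far
        self.children = {}            # action -> child Node
        self.visits = 0
        self.value = 0.0              # sum of backed-up rewards

def mcts_best_action(root, simulations=500, c=1.4):
    """Run MCTS from the root state and return the most-visited first action."""
    for _ in range(simulations):
        node, path = root, [root]
        # Selection: descend through fully expanded nodes via the UCB1 rule.
        while (len(node.assignment) < len(TASK_LOADS)
               and len(node.children) == len(SERVER_RATES)):
            node = max(node.children.values(),
                       key=lambda ch: ch.value / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
            path.append(node)
        # Expansion: add one untried child if the node is non-terminal.
        if len(node.assignment) < len(TASK_LOADS):
            untried = [a for a in range(len(SERVER_RATES))
                       if a not in node.children]
            action = random.choice(untried)
            node.children[action] = Node(node.assignment + [action])
            node = node.children[action]
            path.append(node)
        # Rollout: complete the assignment at random and evaluate the reward.
        rollout = node.assignment + [random.randrange(len(SERVER_RATES))
                                     for _ in range(len(TASK_LOADS)
                                                    - len(node.assignment))]
        r = reward(rollout)
        # Backpropagation: update visit counts and values along the path.
        for n in path:
            n.visits += 1
            n.value += r
    return max(root.children, key=lambda a: root.children[a].visits)

random.seed(0)
print(mcts_best_action(Node([])))  # best first task-to-server decision
```

In iRAF, the random rollout above would be replaced by a DNN that evaluates states and guides the search, and the visit statistics produced by MCTS would serve as the self-supervised training targets for that network.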
Pages: 7011-7024
Page Count: 14