Joint Optimization of Caching, Computing, and Radio Resources for Fog-Enabled IoT Using Natural Actor-Critic Deep Reinforcement Learning

Cited by: 251
Authors
Wei, Yifei [1 ]
Yu, F. Richard [2 ]
Song, Mei [1 ]
Han, Zhu [3 ,4 ,5 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Elect Engn, Beijing Key Lab Work Safety Intelligent Monitorin, Beijing 100876, Peoples R China
[2] Carleton Univ, Dept Syst & Comp Engn, Ottawa, ON K1S 5B6, Canada
[3] Univ Houston, Elect & Comp Engn Dept, Houston, TX 77004 USA
[4] Univ Houston, Comp Sci Dept, Houston, TX 77004 USA
[5] Kyung Hee Univ, Dept Comp Sci & Engn, Seoul 12001, South Korea
Funding
National Natural Science Foundation of China;
Keywords
Actor-critic; deep neural network (DNN); edge caching; fog computing; Internet of Things (IoT); reinforcement learning (RL);
DOI
10.1109/JIOT.2018.2878435
CLC number
TP [automation technology, computer technology];
Discipline code
0812 ;
Abstract
The cloud-based Internet of Things (IoT) is developing rapidly but suffers from large latency and high backhaul bandwidth requirements. Fog computing and caching have emerged as a promising paradigm for the IoT to provide proximity services, thereby reducing service latency and saving backhaul bandwidth. However, the performance of the fog-enabled IoT depends on the intelligent and efficient management of various network resources, so the synergy of caching, computing, and communications becomes a major challenge. This paper simultaneously tackles the issues of content caching strategy, computation offloading policy, and radio resource allocation, and proposes a joint optimization solution for the fog-enabled IoT. Since wireless signals and service requests have stochastic properties, we use the actor-critic reinforcement learning framework to solve the joint decision-making problem with the objective of minimizing the average end-to-end delay. A deep neural network (DNN) is employed as the function approximator to estimate the value function in the critic part, owing to the extremely large state and action space of our problem. The actor part uses another DNN to represent a parameterized stochastic policy and improves the policy with the help of the critic. Furthermore, the natural policy gradient method is used to avoid converging to a local maximum. Using numerical simulations, we demonstrate the learning capacity of the proposed algorithm and analyze the end-to-end service latency.
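The actor-critic scheme summarized in the abstract can be sketched in miniature. The toy two-action reward model, learning rates, and seed below are illustrative assumptions, not the paper's fog-IoT formulation, and a scalar baseline and a softmax-preference policy stand in for the paper's critic and actor DNNs:

```python
import numpy as np

def softmax(h):
    """Numerically stable softmax over preference vector h."""
    e = np.exp(h - h.max())
    return e / e.sum()

def actor_critic_toy(steps=2000, alpha=0.1, beta=0.1, seed=0):
    """Minimal actor-critic loop on a 2-action toy problem.

    Critic: a scalar baseline value v, updated by the TD error
            (stand-in for the paper's critic DNN).
    Actor:  a softmax policy over preferences h, updated along the
            policy gradient weighted by the TD error
            (stand-in for the paper's actor DNN).
    Rewards are an illustrative assumption: action 0 pays 1.0,
    action 1 pays 0.0.
    """
    rng = np.random.default_rng(seed)
    rewards = np.array([1.0, 0.0])   # toy reward model, not from the paper
    h = np.zeros(2)                  # actor: policy preferences
    v = 0.0                          # critic: baseline value estimate
    for _ in range(steps):
        pi = softmax(h)
        a = rng.choice(2, p=pi)
        r = rewards[a]
        td_error = r - v             # critic's TD error, used as advantage
        v += beta * td_error         # critic update
        grad_log_pi = -pi            # d log pi(a)/dh = onehot(a) - pi
        grad_log_pi[a] += 1.0
        h += alpha * td_error * grad_log_pi  # vanilla policy-gradient step
    return softmax(h), v

pi, v = actor_critic_toy()
print(pi, v)  # policy concentrates on the higher-reward action 0
```

The paper replaces the vanilla gradient step above with the natural policy gradient, i.e., preconditioning the update by the inverse Fisher information matrix of the policy, which is what helps it avoid converging to a poor local maximum.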
Pages: 2061-2073
Number of pages: 13