A Novel Adaptive Resource Allocation Model Based on SMDP and Reinforcement Learning Algorithm in Vehicular Cloud System

Cited by: 49
Authors
Liang, Hongbin [1, 2]
Zhang, Xiaohui [1, 2, 3]
Zhang, Jin [1, 2]
Li, Qizhen [3]
Zhou, Shuya [1, 2]
Zhao, Lian [4]
Affiliations
[1] Southwest Jiaotong Univ, Natl United Engn Lab Integrated & Intelligent Tra, Sch Transportat & Logist, Chengdu 611756, Sichuan, Peoples R China
[2] Southwest Jiaotong Univ, Natl Engn Lab Integrated Transportat Big Data Ap, Chengdu 611756, Sichuan, Peoples R China
[3] Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu 611756, Sichuan, Peoples R China
[4] Ryerson Univ, Dept Elect Comp & Biomed Engn, Toronto, ON M5B 2K3, Canada
Funding
National Natural Science Foundation of China;
Keywords
Cloud computing; Resource management; Adaptation models; Adaptive systems; Quality of service; Quality of experience; Computational modeling; Semi-Markov Decision Process (SMDP); Reinforcement Learning (RL) Algorithm; Vehicular Cloud System; Neural-Network; Quality of Experience (QoE); Quality of Service (QoS); ASSIGNMENT; COMMUNICATION; OPTIMIZATION;
DOI
10.1109/TVT.2019.2937842
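Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code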
0808 ; 0809 ;
Abstract
In this paper, we propose a novel adaptive cloud resource allocation model for vehicular cloud systems, based on a Semi-Markov Decision Process (SMDP) and a Reinforcement Learning (RL) algorithm. Adaptive resource allocation for vehicular requests is formulated as an SMDP in order to capture the dynamics of request arrivals and departures. The optimized decision guarantees the Quality of Service (QoS) of the vehicular cloud system and the Quality of Experience (QoE) of vehicular users, and maximizes the total system reward by balancing cloud resource expense against system income. Furthermore, to capture the mobility of the vehicular cloud system, we apply a neural-network-based RL algorithm to solve the proposed SMDP-based model. First, a planning algorithm computes the action values for certain state-action pairs, which serve as the initial samples for training the neural network. RL is then used to update the network parameters, train the network, and adaptively improve the decision strategy. As a result, an adaptive vehicular cloud resource allocation scheme that approaches the optimal strategy is obtained without knowledge of the distribution functions of request arrivals and departures during the RL process. Simulation results show that, compared with the commonly used greedy resource allocation method, the proposed model reduces the probability of delay in processing requests and achieves high system rewards. The performance of the RL solution approaches that of the traditional value iteration solution for the proposed model.
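The abstract names two baselines, value iteration and a greedy allocation rule. The following is a minimal sketch of how those two can be compared on a toy model with known, exponential arrival and service rates (a special case of an SMDP, solved by uniformization). This record does not give the paper's state space, reward function, or parameters, so every constant, the state definition `(n1, n2)`, and all function names below are assumptions made for illustration. The neural-network RL step is not shown.

```python
from itertools import product

# All values are illustrative assumptions, not parameters from the paper.
K = 6              # total resource units in the vehicular cloud
LAM = 3.0          # request arrival rate
MU = 1.0           # service rate contributed by each allocated unit
INCOME = 5.0       # lump income for accepting a request
REJECT_COST = 2.0  # lump penalty for rejecting a request
UNIT_COST = 0.5    # cost per occupied unit per unit time
ALPHA = 0.1        # continuous-time discount rate
ACTIONS = (0, 1, 2)  # 0 = reject, otherwise number of units allocated

# State (n1, n2): requests currently served with 1 unit and with 2 units.
STATES = [s for s in product(range(K + 1), repeat=2) if s[0] + 2 * s[1] <= K]
# Uniformization constant: an upper bound on the total event rate.
BIG = LAM + K * MU


def feasible(state):
    """Actions that do not exceed the remaining capacity."""
    used = state[0] + 2 * state[1]
    return [a for a in ACTIONS if used + a <= K]


def arrival_value(state, a, V):
    """Lump reward plus value of the state reached when an arrival gets action a."""
    if a == 0:
        return -REJECT_COST + V[state]
    n1, n2 = state
    return INCOME + V[(n1 + int(a == 1), n2 + int(a == 2))]


def optimal_action(state, V):
    return max(feasible(state), key=lambda a: arrival_value(state, a, V))


def greedy_action(state, V):
    # Greedy rule: always allocate as many units as currently fit.
    return max(feasible(state))


def backup(state, V, choose):
    """One uniformized Bellman backup under the action rule `choose`."""
    n1, n2 = state
    d1, d2 = n1 * MU, n2 * 2 * MU  # departure rates of the two request groups
    total = -UNIT_COST * (n1 + 2 * n2)
    total += LAM * arrival_value(state, choose(state, V), V)
    if n1:
        total += d1 * V[(n1 - 1, n2)]
    if n2:
        total += d2 * V[(n1, n2 - 1)]
    total += (BIG - LAM - d1 - d2) * V[state]  # fictitious self-transition
    return total / (BIG + ALPHA)


def solve(choose, tol=1e-9, max_iter=100_000):
    """Iterate backups until the value function stops changing."""
    V = {s: 0.0 for s in STATES}
    for _ in range(max_iter):
        new_V = {s: backup(s, V, choose) for s in STATES}
        if max(abs(new_V[s] - V[s]) for s in STATES) < tol:
            return new_V
        V = new_V
    return V


V_opt = solve(optimal_action)    # value iteration
V_greedy = solve(greedy_action)  # policy evaluation of the greedy rule
policy = {s: optimal_action(s, V_opt) for s in STATES}
print("value at empty system (optimal vs greedy):", V_opt[(0, 0)], V_greedy[(0, 0)])
```

Because value iteration maximizes over the same action set the greedy rule draws from, `V_opt` is never below `V_greedy` in any state; how large the gap is depends entirely on the assumed rates and rewards. An RL variant in the spirit of the abstract would replace the known `LAM` and `MU` with sampled transitions.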
Pages: 10018-10029
Page count: 12
Related papers
37 in total
[1]   Alam MGR, 2016, 2016 INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING (ICOIN), P285, DOI 10.1109/ICOIN.2016.7427078
[2]   Boukerche A, 2017, INT WIREL COMMUN, P159, DOI 10.1109/IWCMC.2017.7986279
[3]   Coordinated Self-Configuration of Virtual Machines and Appliances Using a Model-Free Learning Approach [J].
Bu, Xiangping ;
Rao, Jia ;
Xu, Cheng-Zhong .
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2013, 24 (04) :681-690
[4]   Efficient Resource Allocation for On-Demand Mobile-Edge Cloud Computing [J].
Chen, Xu ;
Li, Wenzhong ;
Lu, Sanglu ;
Zhou, Zhi ;
Fu, Xiaoming .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2018, 67 (09) :8769-8780
[5]   Cheng JY, 2015, SHOCK VIB, V2015, DOI [10.1155/2015/290293, 10.1038/srep12516]
[6]   Hoang DT, 2014, IEEE ICC, P3764, DOI 10.1109/ICC.2014.6883907
[7]   Contention Intensity Based Distributed Coordination for V2V Safety Message Broadcast [J].
Gao, Jie ;
Li, Mushu ;
Zhao, Lian ;
Shen, Xuemin .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2018, 67 (12) :12288-12301
[8]   Network Utility Maximization Based on an Incentive Mechanism for Truthful Reporting of Local Information [J].
Gao, Jie ;
Zhao, Lian ;
Shen, Xuemin .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2018, 67 (08) :7523-7537
[9]   Performance Analysis and Enhancement of the DSRC for VANET's Safety Applications [J].
Hafeez, Khalid Abdel ;
Zhao, Lian ;
Ma, Bobby ;
Mark, Jon W. .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2013, 62 (07) :3069-3083
[10]   A Continuous-Time Markov decision process-based resource allocation scheme in vehicular cloud for mobile video services [J].
Hou, Lu ;
Zheng, Kan ;
Chatzimisios, Periklis ;
Feng, Yi .
COMPUTER COMMUNICATIONS, 2018, 118 :140-147