Reinforcement Learning Enabled Dynamic Resource Allocation in the Internet of Vehicles

Cited by: 43
Authors
Liang, Hongbin [1, 2]
Zhang, Xiaohui [3, 4]
Hong, Xintao [5, 6]
Zhang, Zongyuan [7]
Li, Mushu [8]
Hu, Guangdi [9]
Hou, Fen [10, 11]
Affiliations
[1] Southwest Jiaotong Univ, Sch Transportat & Logist, Natl United Engn Lab Integrated & Intelligent Tra, Chengdu 611756, Peoples R China
[2] Southwest Jiaotong Univ, Natl Engn Lab Integrated Trans Portat Big Data Ap, Chengdu 611756, Peoples R China
[3] Nanjing NARI Informat & Commun Technol Co Ltd, Nanjing 210000, Peoples R China
[4] Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu 611756, Peoples R China
[5] Chengdu Technol Univ, Sch Econ & Management, Chengdu 611756, Peoples R China
[6] Southwest Jiaotong Univ, Sch Econ & Management, Chengdu 611756, Peoples R China
[7] Beijing Univ Technol, Fac Sci, Beijing 100124, Peoples R China
[8] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
[9] Southwest Jiaotong Univ, Automot Res Inst, Chengdu 611756, Peoples R China
[10] Univ Macau, State Key Lab IoT Smart City, Macau, Peoples R China
[11] Univ Macau, Dept Elect & Comp Engn, Macau, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Resource management; Computational modeling; Cloud computing; Learning (artificial intelligence); Vehicle dynamics; Dynamic scheduling; Hierarchical architecture; Internet of Vehicles (IoV); reinforcement learning; resource allocation; semi-Markov decision process (SMDP); VEHICULAR NETWORKS; CHANNEL ASSIGNMENT; SPECTRUM; FRAMEWORK;
DOI
10.1109/TII.2020.3019386
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
As an important application scenario of the industrial Internet of Things, the Internet of Vehicles (IoV) can significantly improve road safety, traffic management efficiency, and people's travel experience. Because the IoV environment is highly dynamic, traditional resource optimization techniques cannot meet its requirements for the dynamic, optimized management of communication, computing, and storage resources, whereas artificial intelligence algorithms can adaptively obtain dynamic resource allocation schemes through self-learning. This article therefore focuses on adopting artificial intelligence techniques to optimize dynamic resource allocation in the IoV. We first model the IoV resource allocation problem as a semi-Markov decision process (SMDP) that introduces a resource reservation strategy and a secondary resource allocation mechanism, and then use a reinforcement learning algorithm to solve the model. Next, we theoretically analyze the joint optimization of computing and communication resources, model it as a hierarchical architecture, and use hierarchical reinforcement learning to obtain the optimal system resource allocation plan. Finally, simulation results show that, compared with the traditional greedy algorithm, the proposed reinforcement-learning-based dynamic resource allocation scheme greatly improves resource utilization and user quality of experience while guaranteeing system quality of service.
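The abstract's core idea, solving a semi-Markov decision process with reinforcement learning, can be illustrated with a minimal sketch. This is not the authors' model: the state (number of busy resource units), the action set (grant 0, 1, or 2 units to an arriving request), the reward values, the arrival and release rates, and the function names `simulate_smdp_q_learning` and `greedy_policy` are all hypothetical choices made for illustration. The sketch uses the standard tabular SMDP Q-learning update, in which the next state's value is discounted by exp(-beta * tau) for a random sojourn time tau between decisions.

```python
import math
import random


def simulate_smdp_q_learning(capacity=6, arrival_rate=3.0, service_rate=1.0,
                             beta=0.1, alpha=0.1, epsilon=0.1,
                             steps=20000, seed=0):
    """Tabular SMDP Q-learning on a toy allocation problem (illustrative only).

    State:  number of busy resource units when a request arrives.
    Action: units granted to the request: 0 (reject), 1, or 2.
    All rates and rewards below are assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    actions = (0, 1, 2)
    reward = {0: -1.0, 1: 2.0, 2: 3.0}  # hypothetical lump-sum rewards
    q = [[0.0] * len(actions) for _ in range(capacity + 1)]

    def feasible(busy):
        # Only actions that fit in the remaining capacity are allowed.
        return [a for a in actions if busy + a <= capacity]

    busy = 0
    for _ in range(steps):
        # Epsilon-greedy action selection over feasible actions.
        acts = feasible(busy)
        if rng.random() < epsilon:
            action = rng.choice(acts)
        else:
            action = max(acts, key=lambda a: q[busy][a])

        # Simulate until the next arrival; each busy unit is released
        # independently at rate service_rate (a simplifying assumption).
        occupied = busy + action
        tau = 0.0
        while True:
            total_rate = arrival_rate + occupied * service_rate
            tau += rng.expovariate(total_rate)
            if rng.random() < arrival_rate / total_rate:
                break  # next request arrived: new decision epoch
            occupied -= 1  # one unit was released

        # SMDP update: discount depends on the random sojourn time tau.
        next_busy = occupied
        best_next = max(q[next_busy][a] for a in feasible(next_busy))
        target = reward[action] + math.exp(-beta * tau) * best_next
        q[busy][action] += alpha * (target - q[busy][action])
        busy = next_busy
    return q


def greedy_policy(q, capacity):
    """Return the learned action (units to grant) for each occupancy level."""
    return [max((a for a in (0, 1, 2) if busy + a <= capacity),
                key=lambda a: q[busy][a])
            for busy in range(capacity + 1)]


if __name__ == "__main__":
    q_table = simulate_smdp_q_learning()
    print(greedy_policy(q_table, 6))
```

The only difference from ordinary Q-learning is the sojourn-time-dependent discount, which is what makes the update suitable for decision epochs that are irregularly spaced in time. The reservation strategy and secondary allocation mechanism mentioned in the abstract are not modeled here.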
Pages: 4957-4967
Page count: 11