Q-Learning Aided Intelligent Routing With Maximum Utility in Cognitive UAV Swarm for Emergency Communications

Cited by: 23
Authors
Zhang, Long [1, 2]
Ma, Xiaozheng [3]
Zhuang, Zirui [4]
Xu, Haitao [5]
Sharma, Vishal [6]
Han, Zhu [7, 8]
Affiliations
[1] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056038, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Mobile Commun Technol, Chongqing 400065, Peoples R China
[3] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056038, Peoples R China
[4] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
[5] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
[6] Queens Univ Belfast, Sch Elect Elect Engn & Comp Sci, Belfast BT9 5BN, North Ireland
[7] Univ Houston, Dept Elect & Comp Engn, Houston, TX 77004 USA
[8] Kyung Hee Univ, Dept Comp Sci & Engn, Seoul 446701, South Korea
Funding
National Natural Science Foundation of China;
Keywords
Emergency communications; UAV swarm; cognitive radio; intelligent routing; maximum utility; Q-learning; SPECTRUM ACCESS; NETWORKS; OPPORTUNITIES; INTEGRATION; CHALLENGES; DELAY;
DOI
10.1109/TVT.2022.3221538
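Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code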
0808 ; 0809 ;
Abstract
This article studies the routing problem in a cognitive unmanned aerial vehicle (UAV) swarm (CU-SWARM), which integrates cognitive radio into a swarm of UAVs within a three-layer hierarchical aerial-ground integrated network architecture for emergency communications. In particular, the flexibly converged architecture uses a UAV swarm and a high-altitude platform to support aerial sensing and access, respectively, over the disaster-affected areas. We develop a Q-learning framework for intelligent routing that maximizes the utility of CU-SWARM. To characterize the reward function, we take into account both the routing metric design and the candidate UAV selection optimization. The routing metric jointly captures the achievable rate and the residual energy of each UAV. In addition, under location, arc, and direction constraints, a circular sector is modeled by properly choosing the central angle and the acceptable signal-to-noise ratio of a UAV to optimize the candidate UAV selection. With this setup, we further propose a low-complexity iterative algorithm that uses a dynamic learning rate to update Q-values during training and thereby achieve fast convergence. Simulation results are provided to assess the potential of the Q-learning framework for intelligent routing and to verify the overall iterative algorithm with the dynamic learning rate in the training procedure. Our findings reveal that the proposed algorithm converges in a small number of iterations. Furthermore, the proposed algorithm increases the accumulated rewards and achieves significant performance gains compared with the benchmark schemes.
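The abstract names three ingredients: a circular-sector filter for candidate next-hop UAVs, a reward that combines achievable rate and residual energy, and a Q-value update with a dynamic learning rate. The following is a minimal Python sketch of how such pieces could fit together, not the authors' algorithm. The record does not give the paper's formulas, so every name and formula here is an assumption made for illustration: the function names `in_sector`, `reward`, and `q_update`, the 2-D geometry, the use of a maximum range `radius` as a stand-in for the acceptable signal-to-noise ratio, the weighted-sum reward with weight `w`, and the learning rate of one over the visit count.

```python
import math


def in_sector(src, dst, cand, half_angle_rad, radius):
    """Return True if cand lies in a circular sector centred at src.

    Assumption: the sector axis points from src toward dst, and `radius`
    stands in for the range implied by an acceptable SNR threshold.
    """
    vx, vy = cand[0] - src[0], cand[1] - src[1]
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    dist = math.hypot(vx, vy)
    axis = math.hypot(dx, dy)
    if dist == 0 or axis == 0 or dist > radius:
        return False
    # Angle between the candidate direction and the sector axis.
    cos_angle = (vx * dx + vy * dy) / (dist * axis)
    return cos_angle >= math.cos(half_angle_rad)


def reward(rate, energy, w=0.5):
    """Hypothetical routing metric: weighted sum of a normalised achievable
    rate and a normalised residual energy, both assumed to lie in [0, 1]."""
    return w * rate + (1.0 - w) * energy


def q_update(q, visits, node, next_hop, r, next_candidates, gamma=0.9):
    """One tabular Q-learning step for forwarding from node to next_hop.

    Assumption: the dynamic learning rate is 1 / (visit count) for the
    (node, next_hop) pair, so it shrinks as the pair is tried more often.
    """
    key = (node, next_hop)
    visits[key] = visits.get(key, 0) + 1
    alpha = 1.0 / visits[key]
    # Best value available from next_hop over its own candidate set.
    best_next = max((q.get((next_hop, c), 0.0) for c in next_candidates),
                    default=0.0)
    old = q.get(key, 0.0)
    q[key] = old + alpha * (r + gamma * best_next - old)
    return q[key]
```

In this sketch a state is the UAV currently holding the packet and an action is the chosen next hop, so the Q-table is keyed by `(node, next_hop)` pairs. A caller would first filter neighbours with `in_sector`, pick one (for example, epsilon-greedily), compute `reward` from the measured rate and energy, and then call `q_update`.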
Pages: 3707-3723
Page count: 17
Related papers
47 in total
[1]   Forming a Two-Tier Heterogeneous Air-Network via Combination of High and Low Altitude Platforms [J].
Ahmadinejad, Hosein ;
Falahati, Abolfazl .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2022, 71 (02) :1989-2001
[2]   Stochastic Geometry Study on Device-to-Device Communication as a Disaster Relief Solution [J].
Al-Hourani, Akram ;
Kandeepan, Sithamparanathan ;
Jamalipour, Abbas .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2016, 65 (05) :3005-3017
[3]   Localization and Clustering Based on Swarm Intelligence in UAV Networks for Emergency Communications [J].
Arafat, Muhammad Yeasir ;
Moh, Sangman .
IEEE INTERNET OF THINGS JOURNAL, 2019, 6 (05) :8958-8976
[4]   Using Reinforcement Learning to Minimize the Probability of Delay Occurrence in Transportation [J].
Cao, Zhiguang ;
Guo, Hongliang ;
Song, Wen ;
Gao, Kaizhou ;
Chen, Zhenghua ;
Zhang, Le ;
Zhang, Xuexi .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2020, 69 (03) :2424-2436
[5]   Integration of Satellite and LTE for Disaster Recovery [J].
Casoni, Maurizio ;
Grazia, Carlo Augusto ;
Klapez, Martin ;
Patriciello, Natale ;
Amditis, A. ;
Sdongos, E. .
IEEE COMMUNICATIONS MAGAZINE, 2015, 53 (03) :47-53
[6]   Toward Robust and Intelligent Drone Swarm: Challenges and Future Directions [J].
Chen, Wu ;
Liu, Jiajia ;
Guo, Hongzhi ;
Kato, Nei .
IEEE NETWORK, 2020, 34 (04) :278-283
[7]   Air-Ground Integrated Mobile Edge Networks: Architecture, Challenges, and Opportunities [J].
Cheng, Nan ;
Xu, Wenchao ;
Shi, Weisen ;
Zhou, Yi ;
Lu, Ning ;
Zhou, Haibo ;
Shen, Xuemin .
IEEE COMMUNICATIONS MAGAZINE, 2018, 56 (08) :26-32
[8]  
Erdelj M, 2017, IEEE PERVAS COMPUT, V16, P24, DOI 10.1109/MPRV.2017.11
[9]   A Dynamic Priority Packet Scheduling Scheme for Post-disaster UAV-assisted Mobile Ad Hoc network [J].
Gao, Mengdi ;
Zhang, Biling ;
Wang, Li .
2021 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2021,
[10]   A NOMA-Enabled Framework for Relay Deployment and Network Optimization in Double-Layer Airborne Access VANETs [J].
He, Yixin ;
Nie, Laisen ;
Guo, Tan ;
Kaur, Kuljeet ;
Hassan, Mohammad Mehedi ;
Yu, Keping .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (11) :22452-22466