Path Planning for Cellular-Connected UAV: A DRL Solution With Quantum-Inspired Experience Replay

Cited by: 64
Authors
Li, Yuanjian [1 ]
Aghvami, A. Hamid [1 ]
Dong, Daoyi [2 ]
Affiliations
[1] King's College London, Centre for Telecommunications Research (CTR), London WC2R 2LS, England
[2] University of New South Wales, School of Engineering and Information Technology, Canberra, ACT 2600, Australia
Keywords
Autonomous aerial vehicles; Wireless communication; Optimization; Navigation; Trajectory; Antenna radiation patterns; Reinforcement learning; Drone; trajectory design; deep reinforcement learning; quantum-inspired experience replay; INTERFERENCE CANCELLATION; TRAJECTORY OPTIMIZATION; COMMUNICATION; NETWORKS;
DOI
10.1109/TWC.2022.3162749
CLC classification: TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline codes: 0808; 0809
Abstract
In a cellular-connected unmanned aerial vehicle (UAV) network, the minimization of a weighted sum of mission time cost and expected communication outage duration is considered. Exploiting the UAV's controllable mobility, a UAV navigation problem is formulated to achieve this optimization goal. Conventional offline optimization techniques are inefficient for the formulated navigation task because of practical factors such as the local building distribution and the directional antenna radiation pattern. Instead, after casting the navigation task as a Markov decision process (MDP), a deep reinforcement learning (DRL)-aided solution is proposed that lets the UAV select the optimal flying direction in each time slot, thereby generating the designed trajectory towards the destination. To help the DRL agent strike a better trade-off between sampling priority and diversity, a novel quantum-inspired experience replay (QiER) framework is proposed, which relates each experienced transition's importance to an associated quantum bit (qubit) and applies a Grover-iteration-based amplitude amplification technique. Numerical results demonstrate the effectiveness and superiority of the proposed DRL-QiER solution over several representative DRL-based and non-learning baselines.
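The abstract's core idea can be illustrated with a short classical simulation: each stored transition carries an angle theta defining a qubit cos(theta)|0> + sin(theta)|1>, so its sampling weight is sin(theta)^2, and larger TD errors rotate theta toward pi/2 in the spirit of Grover-style amplitude amplification. This is a minimal sketch under assumed parameters (the class name, rotation step `k`, and clipping bounds are all illustrative, not the paper's exact scheme).

```python
import math
import random
from collections import deque

class QiERBuffer:
    """Classical sketch of a quantum-inspired experience replay buffer.

    Each transition is paired with an angle theta; the amplitude of the
    "accept" basis state is sin(theta), so the sampling weight is
    sin(theta)**2. Larger |TD error| rotates theta toward pi/2,
    mimicking Grover-style amplitude amplification of important samples.
    """

    def __init__(self, capacity=10000, k=0.1):
        self.buffer = deque(maxlen=capacity)
        self.k = k  # rotation step per unit of |TD error| (illustrative)

    def _rotate(self, theta, td_error):
        # Rotate toward pi/2 proportionally to |TD error|, then clip so
        # every transition keeps a nonzero chance of being sampled
        # (preserving diversity alongside priority).
        theta += self.k * abs(td_error)
        return min(max(theta, 0.05), math.pi / 2 - 0.05)

    def add(self, transition, td_error):
        theta = math.pi / 4  # start from an equal-superposition qubit
        self.buffer.append([transition, self._rotate(theta, td_error)])

    def sample(self, batch_size):
        # Probability of drawing an entry = probability of "measuring"
        # its accept state, i.e. proportional to sin(theta)**2.
        weights = [math.sin(theta) ** 2 for _, theta in self.buffer]
        return random.choices(self.buffer, weights=weights, k=batch_size)

    def update(self, entries, td_errors):
        # After a learning step, re-rotate the sampled qubits using the
        # freshly computed TD errors.
        for entry, err in zip(entries, td_errors):
            entry[1] = self._rotate(entry[1], err)
```

The clipping in `_rotate` is the mechanism behind the priority/diversity trade-off mentioned in the abstract: no transition's sampling probability ever collapses to zero, yet high-TD-error transitions are drawn markedly more often.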
Pages: 7897-7912 (16 pages)