Multi-UAV Adaptive Cooperative Formation Trajectory Planning Based on an Improved MATD3 Algorithm of Deep Reinforcement Learning

Cited by: 8

Authors:

Xing, Xiaojun [1, 2]

Zhou, Zhiwei [1]

Li, Yan [1]

Xiao, Bing [1]

Xun, Yilin [1]

Affiliations:

[1] Northwestern Polytech Univ, Sch Automat, Xian 710129, Peoples R China

[2] Northwestern Polytech Univ Shenzhen, Res & Dev Inst, Shenzhen 518063, Peoples R China

Source:

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY | 2024 / Vol. 73 / No. 09

Funding:

National Natural Science Foundation of China;

Keywords:

Autonomous aerial vehicles; Trajectory planning; Trajectory; Deep reinforcement learning; Planning; Reinforcement learning; Long short term memory; Multi-unmanned aerial vehicle (multi-UAV) cooperative formation trajectory planning; deep reinforcement learning; potential field-based dense reward; adaptive formation strategy; hierarchical training mechanism; SPACECRAFT; NAVIGATION; AVOIDANCE; DESIGN;

DOI:

10.1109/TVT.2024.3389555

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Multi-unmanned aerial vehicle (multi-UAV) cooperative trajectory planning is an extremely challenging problem in UAV research because of its NP-hard nature, collision-avoidance constraints, close-formation requirements, consensus convergence, and high-dimensional action space. The difficulty increases further when the unknown environment contains complex obstacles and narrow passages. Accordingly, this article proposes a novel multi-UAV adaptive cooperative formation trajectory planning approach for unknown obstacle environments, based on an improved deep reinforcement learning algorithm. The approach introduces a long short-term memory (LSTM) recurrent neural network (RNN) into the environment perception end of the multi-agent twin delayed deep deterministic policy gradient (MATD3) network, and develops an improved potential field-based dense reward function, to strengthen policy learning efficiency and accelerate convergence, respectively. Moreover, a hierarchical deep reinforcement learning training mechanism, comprising an adaptive formation layer, a trajectory planning layer, and an action execution layer, is implemented to explore an optimal trajectory planning policy. Additionally, an adaptive formation maintaining and transformation strategy is presented so that the UAV swarm can adapt to environments with narrow passages. Simulation results show that the proposed approach outperforms multi-agent deep deterministic policy gradient (MADDPG) and MATD3 in policy learning efficiency, optimality of the trajectory planning policy, and adaptability to narrow passages.


Pages: 12484 - 12499

Number of pages: 16
