Multi-UAV Adaptive Cooperative Formation Trajectory Planning Based on an Improved MATD3 Algorithm of Deep Reinforcement Learning

Cited by: 8

Authors:

Xing, Xiaojun [1,2]

Zhou, Zhiwei [1]

Li, Yan [1]

Xiao, Bing [1]

Xun, Yilin [1]

Affiliations:

[1] Northwestern Polytech Univ, Sch Automat, Xian 710129, Peoples R China

[2] Northwestern Polytech Univ Shenzhen, Res & Dev Inst, Shenzhen 518063, Peoples R China

Source:

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY | 2024 / Vol. 73 / Issue 09

Funding:

National Natural Science Foundation of China;

Keywords:

Autonomous aerial vehicles; Trajectory planning; Trajectory; Deep reinforcement learning; Planning; Reinforcement learning; Long short term memory; Multi-unmanned aerial vehicle (multi-UAV) cooperative formation trajectory planning; deep reinforcement learning; potential field-based dense reward; adaptive formation strategy; hierarchical training mechanism; SPACECRAFT; NAVIGATION; AVOIDANCE; DESIGN;

DOI:

10.1109/TVT.2024.3389555

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification code:

0808 ; 0809 ;

Abstract:

Multi-unmanned aerial vehicle (multi-UAV) cooperative trajectory planning is an extremely challenging problem in UAV research because of its NP-hard nature, collision-avoidance constraints, close-formation requirements, consensus convergence, and high-dimensional action space. The difficulty increases considerably when unknown environments contain complex obstacles and narrow passages. Accordingly, this article proposes a novel multi-UAV adaptive cooperative formation trajectory planning approach for unknown obstacle environments, based on an improved deep reinforcement learning algorithm. The approach introduces a long short-term memory (LSTM) recurrent neural network (RNN) into the environment perception end of the multi-agent twin delayed deep deterministic policy gradient (MATD3) network to strengthen policy learning efficiency, and develops an improved potential field-based dense reward function to accelerate convergence. Moreover, a hierarchical deep reinforcement learning training mechanism, comprising an adaptive formation layer, a trajectory planning layer, and an action execution layer, is implemented to explore an optimal trajectory planning policy. Additionally, an adaptive formation maintenance and transformation strategy is presented so that the UAV swarm can adapt to environments with narrow passages. Simulation results show that the proposed approach outperforms multi-agent deep deterministic policy gradient (MADDPG) and MATD3 in policy learning efficiency, optimality of the trajectory planning policy, and adaptability to narrow passages.
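
The abstract mentions a potential field-based dense reward but this record does not give its form. As a rough illustration only, the sketch below uses the textbook artificial potential field (quadratic attraction to a goal, short-range repulsion from circular obstacles) and rewards the decrease in potential over one step. The function names, the gains `k_att` and `k_rep`, the influence distance `d0`, and the circular obstacle model are all assumptions made for this example, not the authors' formulation.

```python
import math

def potential(pos, goal, obstacles, k_att=1.0, k_rep=0.5, d0=2.0):
    """Textbook artificial potential (assumed form, not taken from the paper)."""
    # Attractive term grows with squared distance to the goal.
    u = 0.5 * k_att * math.dist(pos, goal) ** 2
    for centre, radius in obstacles:
        # Clearance to the obstacle surface, floored to avoid division by zero.
        d = max(math.dist(pos, centre) - radius, 1e-6)
        if d < d0:
            # Repulsive term is active only inside the influence distance d0.
            u += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return u

def dense_reward(prev_pos, pos, goal, obstacles):
    """Per-step reward: positive when the move lowers the potential."""
    return potential(prev_pos, goal, obstacles) - potential(pos, goal, obstacles)

# Hypothetical 2-D scene: one goal and one circular obstacle (centre, radius).
goal = (10.0, 0.0)
obstacles = [((5.0, 0.0), 1.0)]

r_toward = dense_reward((0.0, 3.0), (1.0, 3.0), goal, obstacles)
r_away = dense_reward((1.0, 3.0), (0.0, 3.0), goal, obstacles)
print(r_toward, r_away)
```

Because every step yields a nonzero signal, an agent gets feedback long before it reaches the goal, which is the general motivation for dense rewards over sparse terminal rewards.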


Pages: 12484 - 12499

Page count: 16

50 entries in total

[21] Multi-UAV Cooperative Target Assignment Method Based on Reinforcement Learning
Ding, Yunlong
Kuang, Minchi
Shi, Heng
Gao, Jiazhan
DRONES, 2024, 8 (10)
[22] Multi-UAV Formation Transformation Based on Improved Heuristically-Accelerated Reinforcement Learning
Xiao, Yanbing
Zhang, Yingzhou
Sun, Yuxin
Qian, Junyan
2019 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY (CYBERC), 2019, : 341 - 347
[23] Game of Drones: Multi-UAV Pursuit-Evasion Game With Online Motion Planning by Deep Reinforcement Learning
Zhang, Ruilong
Zong, Qun
Zhang, Xiuyun
Dou, Liqian
Tian, Bailing
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (10) : 7900 - 7909
[24] A Deep Reinforcement Learning Method for Collision Avoidance with Dense Speed-Constrained Multi-UAV
Han, Jiale
Zhu, Yi
Yang, Jian
IEEE ROBOTICS AND AUTOMATION LETTERS, 2025, 10 (03) : 2152 - 2159
[25] Deep Reinforcement Learning-Based Distributed 3D UAV Trajectory Design
He, Huasen
Yuan, Wenke
Chen, Shuangwu
Jiang, Xiaofeng
Yang, Feng
Yang, Jian
IEEE TRANSACTIONS ON COMMUNICATIONS, 2024, 72 (06) : 3736 - 3751
[26] A Deep Reinforcement Learning Based UAV Trajectory Planning Method For Integrated Sensing And Communications Networks
Lin, Heyun
Zhang, Zhihai
Wei, Longkun
Zhou, Zihao
Zheng, Tian
2023 IEEE 98TH VEHICULAR TECHNOLOGY CONFERENCE, VTC2023-FALL, 2023,
[27] Deep Reinforcement Learning for Real-Time Trajectory Planning in UAV Networks
Li, Kai
Ni, Wei
Tovar, Eduardo
Guizani, Mohsen
2020 16TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE, IWCMC, 2020, : 958 - 963
[28] Reinforcement Learning based Approach for Multi-UAV Cooperative Searching in Unknown Environments
Yue, Wei
Guan, Xianhe
Xi, Yun
2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 2018 - 2023
[29] Flexible multi-UAV formation control via integrating deep reinforcement learning and affine transformations
Liu, Yunhao
Liu, Zhihong
Wang, Guanzheng
Yan, Chao
Wang, Xiangke
Huang, Zhiping
AEROSPACE SCIENCE AND TECHNOLOGY, 2025, 157
[30] Multi-UAV Mobile Edge Computing and Path Planning Platform Based on Reinforcement Learning
Chang, Huan
Chen, Yicheng
Zhang, Baochang
Doermann, David
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2022, 6 (03) : 489 - 498
