Multi-UAV Adaptive Cooperative Formation Trajectory Planning Based on an Improved MATD3 Algorithm of Deep Reinforcement Learning

Cited by: 7
Authors
Xing, Xiaojun [1 ,2 ]
Zhou, Zhiwei [1 ]
Li, Yan [1 ]
Xiao, Bing [1 ]
Xun, Yilin [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Xian 710129, Peoples R China
[2] Northwestern Polytech Univ Shenzhen, Res & Dev Inst, Shenzhen 518063, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Autonomous aerial vehicles; Trajectory planning; Trajectory; Deep reinforcement learning; Planning; Reinforcement learning; Long short term memory; Multi-unmanned aerial vehicle (multi-UAV) cooperative formation trajectory planning; deep reinforcement learning; potential field-based dense reward; adaptive formation strategy; hierarchical training mechanism; SPACECRAFT; NAVIGATION; AVOIDANCE; DESIGN;
DOI
10.1109/TVT.2024.3389555
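Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code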
0808 ; 0809 ;
Abstract
Multi-unmanned aerial vehicle (multi-UAV) cooperative trajectory planning is an extremely challenging problem in UAV research because of its NP-hard nature, collision-avoidance constraints, close-formation requirements, consensus convergence, and high-dimensional action space. The difficulty increases further when an unknown environment contains complex obstacles and narrow passages. This article therefore proposes a novel multi-UAV adaptive cooperative formation trajectory planning approach for unknown obstacle environments, based on an improved deep reinforcement learning algorithm. The approach introduces a long short-term memory (LSTM) recurrent neural network (RNN) into the environment perception end of the multi-agent twin delayed deep deterministic policy gradient (MATD3) network, and develops an improved potential field-based dense reward function; these changes strengthen policy learning efficiency and accelerate convergence, respectively. Moreover, a hierarchical deep reinforcement learning training mechanism, comprising an adaptive formation layer, a trajectory planning layer, and an action execution layer, is implemented to explore an optimal trajectory planning policy. Additionally, an adaptive formation maintenance and transformation strategy is presented so that the UAV swarm can adapt to environments with narrow passages. Simulation results show that the proposed approach outperforms multi-agent deep deterministic policy gradient (MADDPG) and MATD3 in policy learning efficiency, optimality of the trajectory planning policy, and adaptability to narrow passages.
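The abstract does not give the form of the potential field-based dense reward. As a rough illustration only, the sketch below uses the classic artificial potential field (quadratic attraction to the goal, short-range repulsion from obstacles) and rewards the per-step decrease in potential. The function names, the gains `k_att` and `k_rep`, the influence radius `d0`, and `scale` are all assumptions made for illustration and are not taken from the paper.

```python
import math


def potential(pos, goal, obstacles, k_att=1.0, k_rep=1.0, d0=2.0):
    """Classic artificial potential at `pos` (hypothetical form, not the paper's).

    Attractive term: 0.5 * k_att * distance_to_goal^2.
    Repulsive term: added only for obstacles closer than the influence radius d0.
    """
    u = 0.5 * k_att * math.dist(pos, goal) ** 2
    for obs in obstacles:
        d = max(math.dist(pos, obs), 1e-6)  # guard against division by zero
        if d < d0:
            u += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return u


def dense_reward(prev_pos, next_pos, goal, obstacles, scale=1.0):
    """Dense per-step reward: positive when the step lowers the potential."""
    return scale * (
        potential(prev_pos, goal, obstacles) - potential(next_pos, goal, obstacles)
    )


if __name__ == "__main__":
    goal = (5.0, 0.0)
    # A step toward the goal in free space earns a positive reward.
    print(dense_reward((0.0, 0.0), (1.0, 0.0), goal, obstacles=[]))
    # The same step made toward a nearby obstacle earns less.
    print(dense_reward((0.0, 0.0), (1.0, 0.0), goal, obstacles=[(1.0, 1.0)]))
```

Because the reward is a difference of potentials, every step yields a signal, unlike a sparse reward that is given only on reaching the goal or colliding. This is the general motivation for dense shaping, under the assumptions stated above.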
Pages: 12484 - 12499
Page count: 16
Related papers
50 in total
  • [31] Multi-UAV trajectory optimizer: A sustainable system for wireless data harvesting with deep reinforcement learning
    Seong, Mincheol
    Jo, Ohyun
    Shin, Kyungseop
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 120
  • [32] Three-Dimension Trajectory Design for Multi-UAV Wireless Network With Deep Reinforcement Learning
    Zhang, Wenqi
    Wang, Qiang
    Liu, Xiao
    Liu, Yuanwei
    Chen, Yue
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2021, 70 (01) : 600 - 612
  • [33] Q-Learning-based Multi-UAV Cooperative Path Planning Method
    Yin Y.
    Wang X.
    Zhou J.
    Binggong Xuebao/Acta Armamentarii, 2023, 44 (02) : 484 - 495
  • [34] A deep reinforcement learning based distributed multi-UAV dynamic area coverage algorithm for complex environment
    Xiao, Jian
    Yuan, Guohui
    Xue, Yuxi
    He, Jinhui
    Wang, Yaoting
    Zou, Yuanjiang
    Wang, Zhuoran
    NEUROCOMPUTING, 2024, 595
  • [35] A Self-Adaptive Improved Slime Mold Algorithm for Multi-UAV Path Planning
    Ma, Yuelin
    Zhang, Zeren
    Yao, Meng
    Fan, Guoliang
    DRONES, 2025, 9 (03)
  • [36] Multi-UAV roundup strategy method based on deep reinforcement learning CEL-MADDPG algorithm
    Li, Bo
    Wang, Jianmei
    Song, Chao
    Yang, Zhipeng
    Wan, Kaifang
    Zhang, Qingfu
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 245
  • [37] Cooperative Multiagent Deep Reinforcement Learning for Reliable Surveillance via Autonomous Multi-UAV Control
    Yun, Won Joon
    Park, Soohyun
    Kim, Joongheon
    Shin, MyungJae
    Jung, Soyi
    Mohaisen, David A.
    Kim, Jae-Hyun
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2022, 18 (10) : 7086 - 7096
  • [38] Multi-UAV Dynamic Wireless Networking With Deep Reinforcement Learning
    Wang, Qiang
    Zhang, Wenqi
    Liu, Yuanwei
    Liu, Ying
    IEEE COMMUNICATIONS LETTERS, 2019, 23 (12) : 2243 - 2246
  • [39] A Deep Reinforcement Learning Algorithm for Trajectory Planning of Swarm UAV Fulfilling Wildfire Reconnaissance
    Demir, Kubilay
    Tumen, Vedat
    Kosunalp, Selahattin
    Iliev, Teodor
    ELECTRONICS, 2024, 13 (13)
  • [40] Multi-UAV cooperative path planning based on improved MOFA evolution of interactive strategy
    Lai L.
    Zou K.
    Wu D.
    Li B.
    Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics, 2021, 43 (08) : 2282 - 2289