Multi-UAV Adaptive Cooperative Formation Trajectory Planning Based on an Improved MATD3 Algorithm of Deep Reinforcement Learning

Cited by: 8
Authors
Xing, Xiaojun [1 ,2 ]
Zhou, Zhiwei [1 ]
Li, Yan [1 ]
Xiao, Bing [1 ]
Xun, Yilin [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Xian 710129, Peoples R China
[2] Northwestern Polytech Univ Shenzhen, Res & Dev Inst, Shenzhen 518063, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Autonomous aerial vehicles; Trajectory planning; Trajectory; Deep reinforcement learning; Planning; Reinforcement learning; Long short term memory; Multi-unmanned aerial vehicle (multi-UAV) cooperative formation trajectory planning; deep reinforcement learning; potential field-based dense reward; adaptive formation strategy; hierarchical training mechanism; SPACECRAFT; NAVIGATION; AVOIDANCE; DESIGN;
DOI
10.1109/TVT.2024.3389555
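Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract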
Multi-unmanned aerial vehicle (multi-UAV) cooperative trajectory planning is an extremely challenging problem in UAV research because of its NP-hard nature, collision avoidance constraints, close formation requirements, consensus convergence, and high-dimensional action space. The difficulty increases further when the unknown environment contains complex obstacles and narrow passages. Accordingly, this article proposes a novel multi-UAV adaptive cooperative formation trajectory planning approach for unknown obstacle environments, based on an improved deep reinforcement learning algorithm. The approach introduces a long short-term memory (LSTM) recurrent neural network (RNN) into the environment perception end of the multi-agent twin delayed deep deterministic policy gradient (MATD3) network, and develops an improved potential field-based dense reward function; these changes strengthen policy learning efficiency and accelerate convergence, respectively. Moreover, a hierarchical deep reinforcement learning training mechanism, comprising an adaptive formation layer, a trajectory planning layer, and an action execution layer, is implemented to explore an optimal trajectory planning policy. Additionally, an adaptive formation maintenance and transformation strategy is presented so that the UAV swarm can adapt to environments with narrow passages. Simulation results show that the proposed approach outperforms multi-agent deep deterministic policy gradient (MADDPG) and MATD3 in policy learning efficiency, optimality of the trajectory planning policy, and adaptability to narrow passages.
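To make the idea of a potential field-based dense reward concrete, the following is a minimal Python sketch. It is not the reward function from the paper, whose exact form is not given in this abstract. It assumes a classical artificial potential field (a quadratic attractive term toward the goal plus a repulsive term near circular obstacles) and rewards the decrease in potential at each step. All function names, gains, and the obstacle model are hypothetical choices for illustration.

```python
import math

# Illustrative sketch only: names, gains, and the circular obstacle model
# are assumptions, not taken from the paper.

def attractive_potential(pos, goal, k_att=1.0):
    """Quadratic potential that grows with distance to the goal."""
    return 0.5 * k_att * math.dist(pos, goal) ** 2

def repulsive_potential(pos, obstacles, influence_radius=2.0, k_rep=1.0):
    """Sum of repulsive potentials from obstacles given as (center, radius)."""
    total = 0.0
    for center, radius in obstacles:
        # Distance to the obstacle surface, clamped to avoid division by zero.
        d = max(math.dist(pos, center) - radius, 1e-6)
        if d < influence_radius:
            total += 0.5 * k_rep * (1.0 / d - 1.0 / influence_radius) ** 2
    return total

def total_potential(pos, goal, obstacles):
    return attractive_potential(pos, goal) + repulsive_potential(pos, obstacles)

def dense_reward(prev_pos, next_pos, goal, obstacles):
    """Reward equals the drop in potential caused by one step.

    Positive when the step moves toward the goal and away from obstacles,
    negative otherwise, so the agent gets feedback at every step.
    """
    return (total_potential(prev_pos, goal, obstacles)
            - total_potential(next_pos, goal, obstacles))

if __name__ == "__main__":
    goal = (3.0, 0.0)
    free_space = []
    blocked = [((1.5, 0.0), 0.4)]  # one circular obstacle on the straight path
    print(dense_reward((0.0, 0.0), (1.0, 0.0), goal, free_space))  # toward goal
    print(dense_reward((0.0, 0.0), (1.0, 0.0), goal, blocked))     # near obstacle
```

Under these assumptions, a step toward the goal in free space earns a positive reward, while the same step taken close to an obstacle is penalized. This is the kind of per-step signal a dense reward is meant to supply in place of a sparse goal-only reward.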
Pages: 12484 - 12499
Page count: 16
Related papers
50 in total
  • [1] Bayesian Optimization Enhanced Deep Reinforcement Learning for Trajectory Planning and Network Formation in Multi-UAV Networks
    Gong, Shimin
    Wang, Meng
    Gu, Bo
    Zhang, Wenjie
    Dinh Thai Hoang
    Niyato, Dusit
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (08) : 10933 - 10948
  • [2] Deep Reinforcement Learning Based Computation Offloading and Trajectory Planning for Multi-UAV Cooperative Target Search
    Luo, Quyuan
    Luan, Tom H.
    Shi, Weisong
    Fan, Pingzhi
    IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2023, 41 (02) : 504 - 520
  • [3] Trajectory Planning and Resource Allocation for Multi-UAV Cooperative Computation
    Xu, Wenlong
    Zhang, Tiankui
    Mu, Xidong
    Liu, Yuanwei
    Wang, Yapeng
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2024, 72 (07) : 4305 - 4318
  • [4] Reinforcement Learning Based Trajectory Planning for Multi-UAV Load Transportation
    Estevez, Julian
    Manuel Lopez-Guede, Jose
    del Valle-Echavarri, Javier
    Grana, Manuel
    IEEE ACCESS, 2024, 12 : 144009 - 144016
  • [5] Deep Reinforcement Learning Multi-UAV Trajectory Control for Target Tracking
    Moon, Jiseon
    Papaioannou, Savvas
    Laoudias, Christos
    Kolios, Panayiotis
    Kim, Sunwoo
    IEEE INTERNET OF THINGS JOURNAL, 2021, 8 (20) : 15441 - 15455
  • [6] Trajectory Design and Resource Allocation for Multi-UAV Networks: Deep Reinforcement Learning Approaches
    Chang, Zheng
    Deng, Hengwei
    You, Li
    Min, Geyong
    Garg, Sahil
    Kaddoum, Georges
    IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2023, 10 (05) : 2940 - 2951
  • [7] A Multiagent Deep Reinforcement Learning Approach for Multi-UAV Cooperative Search in Multilayered Aerial Computing Networks
    Wu, Jiaqi
    Luo, Jingjing
    Jiang, Changkun
    Gao, Lin
    IEEE INTERNET OF THINGS JOURNAL, 2025, 12 (05) : 5807 - 5821
  • [8] Collision Detection and Avoidance for Multi-UAV based on Deep Reinforcement Learning
    Wang, Guanzheng
    Liu, Zhihong
    Xiao, Kun
    Xu, Yinbo
    Yang, Lingjie
    Wang, Xiangke
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021 : 7783 - 7789
  • [9] Research on Multi-Robot Formation Control Based on MATD3 Algorithm
    Zhou, Conghang
    Li, Jianxing
    Shi, Yujing
    Lin, Zhirui
    APPLIED SCIENCES-BASEL, 2023, 13 (03)
  • [10] Multi-UAV Dynamic Wireless Networking With Deep Reinforcement Learning
    Wang, Qiang
    Zhang, Wenqi
    Liu, Yuanwei
    Liu, Ying
    IEEE COMMUNICATIONS LETTERS, 2019, 23 (12) : 2243 - 2246