Multi-UAV Adaptive Cooperative Formation Trajectory Planning Based on an Improved MATD3 Algorithm of Deep Reinforcement Learning

Cited by: 7
Authors
Xing, Xiaojun [1, 2]
Zhou, Zhiwei [1]
Li, Yan [1]
Xiao, Bing [1]
Xun, Yilin [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Xian 710129, Peoples R China
[2] Northwestern Polytech Univ Shenzhen, Res & Dev Inst, Shenzhen 518063, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Autonomous aerial vehicles; Trajectory planning; Trajectory; Deep reinforcement learning; Planning; Reinforcement learning; Long short term memory; Multi-unmanned aerial vehicle (multi-UAV) cooperative formation trajectory planning; deep reinforcement learning; potential field-based dense reward; adaptive formation strategy; hierarchical training mechanism; SPACECRAFT; NAVIGATION; AVOIDANCE; DESIGN;
DOI
10.1109/TVT.2024.3389555
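Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code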
0808 ; 0809 ;
Abstract
Multi-unmanned aerial vehicle (multi-UAV) cooperative trajectory planning is an extremely challenging problem in UAV research because of its NP-hard nature, collision-avoidance constraints, close-formation requirements, consensus convergence, high-dimensional action space, and related factors. The difficulty increases considerably when unknown environments contain complex obstacles and narrow passages. Accordingly, this article proposes a novel multi-UAV adaptive cooperative formation trajectory planning approach for unknown obstacle environments, based on an improved deep reinforcement learning algorithm. The approach introduces a long short-term memory (LSTM) recurrent neural network (RNN) into the environment perception end of the multi-agent twin delayed deep deterministic policy gradient (MATD3) network, and develops an improved potential field-based dense reward function, to strengthen policy learning efficiency and accelerate convergence, respectively. Moreover, a hierarchical deep reinforcement learning training mechanism, comprising an adaptive formation layer, a trajectory planning layer, and an action execution layer, is implemented to explore an optimal trajectory planning policy. Additionally, an adaptive formation maintenance and transformation strategy is presented so that the UAV swarm can adapt to environments with narrow passages. Simulation results show that the proposed approach outperforms approaches based on multi-agent deep deterministic policy gradient (MADDPG) and MATD3 in policy learning efficiency, optimality of the trajectory planning policy, and adaptability to narrow passages.
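The abstract does not give the form of the potential field-based dense reward, so the following is only a minimal sketch of the general idea, assuming the classic artificial potential field (quadratic attraction to the goal, repulsion inside an influence radius around each obstacle). The function names `potential` and `dense_reward`, the gains `k_att` and `k_rep`, and the radius `d0` are hypothetical and are not taken from the article; the article's improved reward may differ substantially.

```python
import math


def potential(pos, goal, obstacles, k_att=1.0, k_rep=1.0, d0=2.0):
    """Classic artificial potential field value at `pos` (illustrative only).

    k_att, k_rep and d0 are assumed tuning parameters, not values from the article.
    """
    # Attractive term: grows with the squared distance to the goal.
    u = 0.5 * k_att * math.dist(pos, goal) ** 2
    for obs in obstacles:
        d = max(math.dist(pos, obs), 1e-6)  # guard against division by zero
        if d < d0:
            # Repulsive term: active only inside the influence radius d0.
            u += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return u


def dense_reward(prev_pos, new_pos, goal, obstacles):
    """Per-step reward equal to the decrease in potential over one move."""
    return potential(prev_pos, goal, obstacles) - potential(new_pos, goal, obstacles)


if __name__ == "__main__":
    goal = (5.0, 0.0)
    # Moving toward the goal in free space earns a positive reward.
    print(dense_reward((0.0, 0.0), (1.0, 0.0), goal, []))
    # The same move is rewarded less when it approaches an obstacle.
    print(dense_reward((0.0, 0.0), (1.0, 0.0), goal, [(1.5, 0.0)]))
```

Because every step yields a nonzero signal, a reward of this kind gives the agent feedback long before it reaches the goal or collides, which is the usual motivation for a dense reward over a sparse terminal one.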
Pages: 12484 - 12499
Page count: 16
Related papers
50 in total
  • [1] Deep Reinforcement Learning Based Computation Offloading and Trajectory Planning for Multi-UAV Cooperative Target Search
    Luo, Quyuan
    Luan, Tom H.
    Shi, Weisong
    Fan, Pingzhi
    IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2023, 41 (02) : 504 - 520
  • [2] Multi-UAV Adaptive Path Planning Using Deep Reinforcement Learning
    Westheider, Jonas
    Rueckin, Julius
    Popovic, Marija
    2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, IROS, 2023 : 649 - 656
  • [3] Reinforcement Learning Based Trajectory Planning for Multi-UAV Load Transportation
    Estevez, Julian
    Manuel Lopez-Guede, Jose
    del Valle-Echavarri, Javier
    Grana, Manuel
    IEEE ACCESS, 2024, 12 : 144009 - 144016
  • [4] Bayesian Optimization Enhanced Deep Reinforcement Learning for Trajectory Planning and Network Formation in Multi-UAV Networks
    Gong, Shimin
    Wang, Meng
    Gu, Bo
    Zhang, Wenjie
    Dinh Thai Hoang
    Niyato, Dusit
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (08) : 10933 - 10948
  • [5] Multi-UAV Cooperative Trajectory Planning Based on the Modified Cheetah Optimization Algorithm
    Fu, Yuwen
    Yang, Shuai
    Liu, Bo
    Xia, E.
    Huang, Duan
    ENTROPY, 2023, 25 (09)
  • [6] Multi-UAV Trajectory Design and Power Control Based on Deep Reinforcement Learning
    Zhang C.Y.
    Liang S.Y.
    He C.L.
    Wang K.Z.
    Journal of Communications and Information Networks, 2022, 7 (02) : 192 - 201
  • [7] Track planning of multi-UAV cooperative reconnaissance based on improved genetic algorithm
    Li W.
    Hu Y.
    Pang Q.
    Li Y.
    Jia H.
    Hu, Yongjiang (huyongjiang_jxxy@163.com), 1600, Editorial Department of Journal of Chinese Inertial Technology (28) : 248 - 255
  • [8] Research on Multi-Robot Formation Control Based on MATD3 Algorithm
    Zhou, Conghang
    Li, Jianxing
    Shi, Yujing
    Lin, Zhirui
    APPLIED SCIENCES-BASEL, 2023, 13 (03)
  • [9] Multi-UAV Cooperative Trajectory Planning Based on Many-Objective Evolutionary Algorithm
    Bai H.
    Fan T.
    Niu Y.
    Cui Z.
    Complex System Modeling and Simulation, 2022, 2 (02) : 130 - 141
  • [10] Deep Reinforcement Learning Multi-UAV Trajectory Control for Target Tracking
    Moon, Jiseon
    Papaioannou, Savvas
    Laoudias, Christos
    Kolios, Panayiotis
    Kim, Sunwoo
    IEEE INTERNET OF THINGS JOURNAL, 2021, 8 (20) : 15441 - 15455