Multi-UAV Adaptive Cooperative Formation Trajectory Planning Based on an Improved MATD3 Algorithm of Deep Reinforcement Learning

Cited by: 8
Authors
Xing, Xiaojun [1 ,2 ]
Zhou, Zhiwei [1 ]
Li, Yan [1 ]
Xiao, Bing [1 ]
Xun, Yilin [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Xian 710129, Peoples R China
[2] Northwestern Polytech Univ Shenzhen, Res & Dev Inst, Shenzhen 518063, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Autonomous aerial vehicles; Trajectory planning; Trajectory; Deep reinforcement learning; Planning; Reinforcement learning; Long short term memory; Multi-unmanned aerial vehicle (multi-UAV) cooperative formation trajectory planning; deep reinforcement learning; potential field-based dense reward; adaptive formation strategy; hierarchical training mechanism; SPACECRAFT; NAVIGATION; AVOIDANCE; DESIGN;
DOI
10.1109/TVT.2024.3389555
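Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code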
0808 ; 0809 ;
Abstract
Multi-unmanned aerial vehicle (multi-UAV) cooperative trajectory planning is an extremely challenging problem in UAV research because of its NP-hard nature, collision-avoidance constraints, close-formation requirements, consensus convergence, and high-dimensional action space. The difficulty increases markedly when unknown environments contain complex obstacles and narrow passages. Accordingly, this article proposes a novel multi-UAV adaptive cooperative formation trajectory planning approach for unknown obstacle environments, based on an improved deep reinforcement learning algorithm. The approach introduces a long short-term memory (LSTM) recurrent neural network (RNN) into the environment perception end of the multi-agent twin delayed deep deterministic policy gradient (MATD3) network, and develops an improved potential field-based dense reward function, to strengthen policy learning efficiency and accelerate convergence, respectively. Moreover, a hierarchical deep reinforcement learning training mechanism, comprising an adaptive formation layer, a trajectory planning layer, and an action execution layer, is implemented to explore an optimal trajectory planning policy. Additionally, an adaptive formation maintenance and transformation strategy is presented so that the UAV swarm can adapt to environments with narrow passages. Simulation results show that the proposed approach outperforms multi-agent deep deterministic policy gradient (MADDPG) and MATD3 in policy learning efficiency, optimality of the trajectory planning policy, and adaptability to narrow passages.
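The abstract does not give the form of the improved potential field-based dense reward, so the following is only a minimal sketch of the generic idea, assuming the classic artificial potential field (quadratic attraction to the goal, short-range repulsion from obstacles) and a reward equal to the drop in potential over one step. The function names and the parameters `k_att`, `k_rep`, and `d0` are hypothetical and are not taken from the paper.

```python
import math


def potential(pos, goal, obstacles, k_att=1.0, k_rep=1.0, d0=2.0):
    """Artificial potential at `pos` (hypothetical helper, not the paper's formula).

    Attraction grows quadratically with distance to the goal; each obstacle
    closer than the influence radius d0 adds a repulsive term.
    """
    u = 0.5 * k_att * math.dist(pos, goal) ** 2
    for obs in obstacles:
        d = max(math.dist(pos, obs), 1e-6)  # guard against division by zero
        if d < d0:
            u += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return u


def dense_reward(prev_pos, next_pos, goal, obstacles, **params):
    """Dense per-step reward: positive when the step lowers the potential."""
    before = potential(prev_pos, goal, obstacles, **params)
    after = potential(next_pos, goal, obstacles, **params)
    return before - after


# Same step toward the goal, with and without an obstacle near the path.
r_free = dense_reward((0.0, 0.0), (1.0, 0.0), (4.0, 0.0), [])
r_near = dense_reward((0.0, 0.0), (1.0, 0.0), (4.0, 0.0), [(1.5, 0.0)])
```

Under these assumptions every step yields a learning signal, in contrast to a sparse reward given only on arrival or collision: a step toward the goal is rewarded, and the same step earns less when it brings the UAV closer to an obstacle.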
Pages: 12484 - 12499
Number of pages: 16
Related papers
50 in total
  • [21] Multi-UAV Cooperative Target Assignment Method Based on Reinforcement Learning
    Ding, Yunlong
    Kuang, Minchi
    Shi, Heng
    Gao, Jiazhan
    DRONES, 2024, 8 (10)
  • [22] Multi-UAV Formation Transformation Based on Improved Heuristically-Accelerated Reinforcement Learning
    Xiao, Yanbing
    Zhang, Yingzhou
    Sun, Yuxin
    Qian, Junyan
    2019 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY (CYBERC), 2019, : 341 - 347
  • [23] Game of Drones: Multi-UAV Pursuit-Evasion Game With Online Motion Planning by Deep Reinforcement Learning
    Zhang, Ruilong
    Zong, Qun
    Zhang, Xiuyun
    Dou, Liqian
    Tian, Bailing
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (10) : 7900 - 7909
  • [24] A Deep Reinforcement Learning Method for Collision Avoidance with Dense Speed-Constrained Multi-UAV
    Han, Jiale
    Zhu, Yi
    Yang, Jian
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2025, 10 (03) : 2152 - 2159
  • [25] Deep Reinforcement Learning-Based Distributed 3D UAV Trajectory Design
    He, Huasen
    Yuan, Wenke
    Chen, Shuangwu
    Jiang, Xiaofeng
    Yang, Feng
    Yang, Jian
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2024, 72 (06) : 3736 - 3751
  • [26] A Deep Reinforcement Learning Based UAV Trajectory Planning Method For Integrated Sensing And Communications Networks
    Lin, Heyun
    Zhang, Zhihai
    Wei, Longkun
    Zhou, Zihao
    Zheng, Tian
    2023 IEEE 98TH VEHICULAR TECHNOLOGY CONFERENCE, VTC2023-FALL, 2023,
  • [27] Deep Reinforcement Learning for Real-Time Trajectory Planning in UAV Networks
    Li, Kai
    Ni, Wei
    Tovar, Eduardo
    Guizani, Mohsen
    2020 16TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE, IWCMC, 2020, : 958 - 963
  • [28] Reinforcement Learning based Approach for Multi-UAV Cooperative Searching in Unknown Environments
    Yue, Wei
    Guan, Xianhe
    Xi, Yun
    2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 2018 - 2023
  • [29] Flexible multi-UAV formation control via integrating deep reinforcement learning and affine transformations
    Liu, Yunhao
    Liu, Zhihong
    Wang, Guanzheng
    Yan, Chao
    Wang, Xiangke
    Huang, Zhiping
    AEROSPACE SCIENCE AND TECHNOLOGY, 2025, 157
  • [30] Multi-UAV Mobile Edge Computing and Path Planning Platform Based on Reinforcement Learning
    Chang, Huan
    Chen, Yicheng
    Zhang, Baochang
    Doermann, David
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2022, 6 (03) : 489 - 498