Multi-UAV Adaptive Cooperative Formation Trajectory Planning Based on an Improved MATD3 Algorithm of Deep Reinforcement Learning

Cited by: 8
Authors
Xing, Xiaojun [1, 2]
Zhou, Zhiwei [1]
Li, Yan [1]
Xiao, Bing [1]
Xun, Yilin [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Xian 710129, Peoples R China
[2] Northwestern Polytech Univ Shenzhen, Res & Dev Inst, Shenzhen 518063, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Autonomous aerial vehicles; Trajectory planning; Trajectory; Deep reinforcement learning; Planning; Reinforcement learning; Long short term memory; Multi-unmanned aerial vehicle (multi-UAV) cooperative formation trajectory planning; deep reinforcement learning; potential field-based dense reward; adaptive formation strategy; hierarchical training mechanism; SPACECRAFT; NAVIGATION; AVOIDANCE; DESIGN;
DOI
10.1109/TVT.2024.3389555
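Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes
0808; 0809;
Abstract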
Multi-unmanned aerial vehicle (multi-UAV) cooperative trajectory planning is an extremely challenging problem in UAV research because of its NP-hard nature, collision-avoidance constraints, close-formation requirements, consensus convergence, and high-dimensional action space. The difficulty increases considerably when the unknown environment contains complex obstacles and narrow passages. Accordingly, this article proposes a novel multi-UAV adaptive cooperative formation trajectory planning approach for unknown obstacle environments, based on an improved deep reinforcement learning algorithm. The approach introduces a long short-term memory (LSTM) recurrent neural network (RNN) into the environment perception end of the multi-agent twin delayed deep deterministic policy gradient (MATD3) network, and develops an improved potential field-based dense reward function; these strengthen policy learning efficiency and accelerate convergence, respectively. Moreover, a hierarchical deep reinforcement learning training mechanism, comprising an adaptive formation layer, a trajectory planning layer, and an action execution layer, is implemented to explore an optimal trajectory planning policy. Additionally, an adaptive formation maintenance and transformation strategy is presented so that the UAV swarm can adapt to environments with narrow passages. Simulation results show that the proposed approach outperforms multi-agent deep deterministic policy gradient (MADDPG) and MATD3 in policy learning efficiency, optimality of the trajectory planning policy, and adaptability to narrow passages.
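The abstract does not state the form of the potential field-based dense reward. As a rough illustration of the general idea only, the sketch below assumes a classic artificial potential field (a quadratic attractive term toward the goal plus a repulsive term inside an obstacle's influence radius) and rewards each step by the drop in potential. The function names, the gains `k_att` and `k_rep`, and the influence radius `d0` are all hypothetical and are not taken from the article.

```python
import math

def attractive_potential(pos, goal, k_att=1.0):
    """Quadratic attraction toward the goal (assumed form)."""
    d = math.dist(pos, goal)
    return 0.5 * k_att * d * d

def repulsive_potential(pos, obstacles, k_rep=1.0, d0=2.0):
    """Repulsion from point obstacles closer than the influence radius d0."""
    total = 0.0
    for obs in obstacles:
        d = max(math.dist(pos, obs), 1e-6)  # guard against division by zero
        if d < d0:
            total += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return total

def potential(pos, goal, obstacles):
    """Total potential: low near the goal, high near obstacles."""
    return attractive_potential(pos, goal) + repulsive_potential(pos, obstacles)

def dense_reward(prev_pos, pos, goal, obstacles):
    """Per-step reward: positive when the step lowers the total potential."""
    return potential(prev_pos, goal, obstacles) - potential(pos, goal, obstacles)

if __name__ == "__main__":
    goal = (10.0, 0.0)
    # One step toward the goal in free space, then the same step near an obstacle.
    print(dense_reward((0.0, 0.0), (1.0, 0.0), goal, []))
    print(dense_reward((0.0, 0.0), (1.0, 0.0), goal, [(2.0, 0.0)]))
```

Because every step yields a signal, a reward of this kind is dense, in contrast to a sparse reward given only on reaching the goal or colliding; the step toward the obstacle earns less than the same step in free space.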
Pages: 12484-12499
Page count: 16