UAV Swarm Cooperative Dynamic Target Search: A MAPPO-Based Discrete Optimal Control Method

Cited by: 2
Authors
Wei, Dexing [1]
Zhang, Lun [1]
Liu, Quan [1]
Chen, Hao [1]
Huang, Jian [1]
Affiliations
[1] Natl Univ Def Technol, Coll Intelligence Sci & Technol, Changsha 410073, Peoples R China
Keywords
UAVs; optimal control; dynamic target search; multi-agents; MAPPO
DOI
10.3390/drones8060214
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
Unmanned aerial vehicles (UAVs) are commonly employed in pursuit and rescue missions, where the target's trajectory is unknown. Traditional methods, such as evolutionary algorithms and ant colony optimization, can generate a search route for a given scenario, but the solution must be recalculated whenever the scene changes. In contrast, deep reinforcement learning methods can train an agent that applies directly to similar tasks without recalculation. Nevertheless, learning to search for unknown dynamic targets poses several challenges. The rewards in this search task are random and sparse, which makes learning difficult. In addition, because the agent must adapt to various scenario settings, it requires more interactions with the environment than typical reinforcement learning tasks. Both factors make training harder. To address these issues, we propose the OC-MAPPO method, which combines optimal control (OC) and Multi-Agent Proximal Policy Optimization (MAPPO) with GPU parallelization. The optimal control model provides the agent with continuous and stable rewards, and the parallelized models let the agent interact with the environment and collect data more rapidly. Experimental results show that the proposed method helps the agent learn faster and achieves a 26.97% increase in success rate over genetic algorithms.
Pages: 20
Related papers
38 in total
  • [1] Andrychowicz M., 2017, Advances in Neural Information Processing Systems, V30, P5048
  • [2] Cao L, 2014, 2014 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS IEEE-ROBIO 2014, P2368, DOI 10.1109/ROBIO.2014.7090692
  • [3] Adaptive Search Control Applied to Search and Rescue Operations Using Unmanned Aerial Vehicles (UAVs)
    Chaves, A. N.
    Cugnasca, P. S.
    Neto, J. J.
    [J]. IEEE LATIN AMERICA TRANSACTIONS, 2014, 12 (07) : 1278 - 1283
  • [4] Review of agricultural spraying technologies for plant protection using unmanned aerial vehicle (UAV)
    Chen, Haibo
    Lan, Yubin
    Fritz, Bradley K.
    Hoffmann, W. Clint
    Liu, Shengbo
    [J]. INTERNATIONAL JOURNAL OF AGRICULTURAL AND BIOLOGICAL ENGINEERING, 2021, 14 (01) : 38 - 49
  • [5] Hierarchical Task Assignment Strategy for Heterogeneous Multi-UAV System in Large-Scale Search and Rescue Scenarios
    Chen, Jie
    Xiao, Kai
    You, Kai
    Qing, Xianguo
    Ye, Fang
    Sun, Qian
    [J]. INTERNATIONAL JOURNAL OF AEROSPACE ENGINEERING, 2021, 2021
  • [6] UAV trajectory planning based on bi-directional APF-RRT* algorithm with goal-biased
    Fan, Jiaming
    Chen, Xia
    Liang, Xiao
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2023, 213
  • [7] UAV Swarm Search Path Planning Method Based on Probability of Containment
    Fan, Xiangyu
    Li, Hao
    Chen, You
    Dong, Danna
    [J]. DRONES, 2024, 8 (04)
  • [8] A novel hybrid particle swarm optimization for multi-UAV cooperate path planning
    He, Wenjian
    Qi, Xiaogang
    Liu, Lifang
    [J]. APPLIED INTELLIGENCE, 2021, 51 (10) : 7350 - 7364
  • [9] UAV Swarm Cooperative Target Search: A Multi-Agent Reinforcement Learning Approach
    Hou, Yukai
    Zhao, Jin
    Zhang, Rongqing
    Cheng, Xiang
    Yang, Liuqing
    [J]. IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01) : 568 - 578
  • [10] Cooperative aerial search by an innovative optimized map-sharing algorithm
    Karimi, Samine
    Saghafi, Fariborz
    [J]. DRONE SYSTEMS AND APPLICATIONS, 2024, 12 : 1 - 18