A Long-Term Target Search Method for Unmanned Aerial Vehicles Based on Reinforcement Learning

Cited by: 0
Authors
Wei, Dexing [1]
Zhang, Lun [1]
Yang, Mei [1]
Deng, Hanqiang [1]
Huang, Jian [1]
Affiliations
[1] Natl Univ Def Technol, Coll Intelligence Sci & Technol, Changsha 410073, Peoples R China
Keywords
UAVs; reinforcement learning; long-term search task; multi-agent
DOI
10.3390/drones8100536
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
Unmanned aerial vehicles (UAVs) are increasingly employed in search operations. Deep reinforcement learning (DRL), owing to its robust self-learning and adaptive capabilities, has been extensively applied to drone search tasks. However, traditional DRL approaches often suffer from long training times, especially in long-term UAV search missions, where the interaction cycles between the agent and the environment are extended. This paper addresses this issue by introducing a novel method, temporally asynchronous grouped environment reinforcement learning (TAGRL). The key insight is that, as the number of training environments increases, agents can learn from discontinuous trajectories. This insight leads to the design of grouped environments, in which agents explore only a limited number of steps within each interaction cycle rather than completing full sequences. Consequently, TAGRL learns faster and consumes less memory than existing parallel environment learning methods. The results indicate that this framework enhances the efficiency of UAV search tasks, paving the way for more scalable and effective applications of RL in complex scenarios.
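The abstract gives no implementation details, so the following is only a minimal sketch of one possible reading of "grouped environments" with a limited number of steps per interaction cycle. It is not the authors' TAGRL implementation. `ToySearchEnv`, `make_groups`, `collect_cycle`, the staggered start offsets, and all parameter values are hypothetical names and assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToySearchEnv:
    """Hypothetical stand-in for a long-horizon search environment."""
    horizon: int = 50
    t: int = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action: int):
        # Toy dynamics: the observation is just the time index.
        self.t += 1
        reward = 1.0 if action == self.t % 2 else 0.0
        done = self.t >= self.horizon
        return self.t, reward, done


def make_groups(n_groups, envs_per_group, horizon, steps_per_cycle):
    """Build environment groups whose start times are staggered (assumption)."""
    groups = []
    for g in range(n_groups):
        envs = [ToySearchEnv(horizon) for _ in range(envs_per_group)]
        for env in envs:
            env.reset()
            # Advance group g so that groups sit at different episode phases.
            for _ in range((g * steps_per_cycle) % horizon):
                env.step(0)
        groups.append(envs)
    return groups


def collect_cycle(groups, policy, steps_per_cycle):
    """Step every environment for a few steps only, keeping its state.

    Each call returns short trajectory fragments instead of full episodes.
    """
    batch = []
    for envs in groups:
        for env in envs:
            obs = env.t
            for _ in range(steps_per_cycle):
                action = policy(obs)
                next_obs, reward, done = env.step(action)
                batch.append((obs, action, reward, next_obs, done))
                # Restart only when an episode actually ends.
                obs = env.reset() if done else next_obs
    return batch


groups = make_groups(n_groups=4, envs_per_group=2, horizon=50, steps_per_cycle=5)
batch = collect_cycle(groups, policy=lambda obs: 0, steps_per_cycle=5)
```

In this sketch, one cycle stores `n_groups * envs_per_group * steps_per_cycle` transitions rather than one full episode per environment, and the fragments from different groups cover different phases of the episode. Whether this matches the paper's grouping scheme cannot be determined from the abstract alone.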
Pages: 18
Related papers
27 in total
  • [1] Roadmap-based path planning - Using the Voronoi diagram for a clearance-based shortest path
    Bhattacharya, Priyadarshi
    Gavrilova, Marina L.
    [J]. IEEE ROBOTICS & AUTOMATION MAGAZINE, 2008, 15 (02) : 58 - 66
  • [2] Dalton S., 2020, Adv. Neural Inf. Process. Syst, V33, P19773, DOI 10.48550/ARXIV.1907.08467
  • [3] UAV Swarm Search Path Planning Method Based on Probability of Containment
    Fan, Xiangyu
    Li, Hao
    Chen, You
    Dong, Danna
    [J]. DRONES, 2024, 8 (04)
  • [4] Freeman C.D., 2021, Brax - A Differentiable Physics Engine for Large Scale Rigid Body Simulation, V6
  • [5] Imanberdiyev N, 2016, I C CONT AUTOMAT ROB
  • [6] Cooperative aerial search by an innovative optimized map-sharing algorithm
    Karimi, Samine
    Saghafi, Fariborz
    [J]. DRONE SYSTEMS AND APPLICATIONS, 2024, 12 : 1 - 18
  • [7] Research on Dynamic Target Search for Multi-UAV Based on Cooperative Coevolution Motion-Encoded Particle Swarm Optimization
    Li, Yiyuan
    Chen, Weiyi
    Fu, Bing
    Wu, Zhonghong
    Hao, Lingjun
    Yang, Guang
    [J]. APPLIED SCIENCES-BASEL, 2024, 14 (04)
  • [8] Liang E, 2018, PR MACH LEARN RES, V80
  • [9] FishGym: A High-Performance Physics-based Simulation Framework for Underwater Robot Learning
    Liu, Wenji
    Bai, Kai
    He, Xuming
    Song, Shuran
    Zheng, Changxi
    Liu, Xiaopei
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2022, 2022 : 6268 - 6275
  • [10] Liu Xiang., 2011, International Conference on Electric Information and Control Engineering, P24, DOI 10.1109/ICEICE.2011.5777723