UAV Maneuvering Target Tracking in Uncertain Environments Based on Deep Reinforcement Learning and Meta-Learning

Cited by: 56
Authors
Li, Bo [1 ]
Gan, Zhigang [1 ]
Chen, Daqing [2 ]
Sergey Aleksandrovich, Dyachenko [3 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Elect & Informat, Xian 710072, Peoples R China
[2] London South Bank Univ, Sch Engn, London SE1 0AA, England
[3] Moscow Inst Aviat Technol, Sch Robot & Intelligent Syst, Moscow 125993, Russia
Keywords
UAV; maneuvering target tracking; deep reinforcement learning; meta-learning; multi-tasks; SYSTEM;
DOI
10.3390/rs12223789
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline Codes
08; 0830
Abstract
This paper combines deep reinforcement learning (DRL) with meta-learning and proposes a novel approach, named meta twin delayed deep deterministic policy gradient (Meta-TD3), to realize the control of an unmanned aerial vehicle (UAV), allowing the UAV to quickly track a target in an environment where the target's motion is uncertain. This approach can be applied to a variety of scenarios, such as wildlife protection, emergency aid, and remote sensing. We consider a multi-task experience replay buffer to provide data for the multi-task learning of the DRL algorithm, and we combine meta-learning to develop a multi-task reinforcement learning update method that ensures the generalization capability of reinforcement learning. Compared with the state-of-the-art algorithms, namely the deep deterministic policy gradient (DDPG) and twin delayed deep deterministic policy gradient (TD3), experimental results show that the Meta-TD3 algorithm achieves a great improvement in terms of both convergence value and convergence rate. In a UAV target tracking problem, Meta-TD3 requires only a few training steps to enable a UAV to adapt quickly to a new target movement mode and maintain better tracking effectiveness.
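The abstract names two mechanisms, a multi-task experience replay buffer and a meta-learning update over task-adapted parameters, without implementation details. The following is a minimal Python sketch of both ideas under stated assumptions: one FIFO buffer per task (one per target movement mode), and a Reptile-style outer update that moves the meta-parameters toward the mean of the task-adapted parameters. The class and function names are illustrative, and the paper's exact meta update rule may differ.

```python
import random
from collections import deque

class MultiTaskReplayBuffer:
    """One FIFO buffer per task; each task could be a target movement mode."""
    def __init__(self, num_tasks, capacity=100_000):
        self.buffers = [deque(maxlen=capacity) for _ in range(num_tasks)]

    def add(self, task_id, transition):
        # transition: (state, action, reward, next_state, done)
        self.buffers[task_id].append(transition)

    def sample(self, task_id, batch_size):
        # Uniform sample from the chosen task's buffer.
        buf = self.buffers[task_id]
        return random.sample(buf, min(batch_size, len(buf)))

def meta_update(meta_params, task_params_list, meta_lr=0.1):
    """Reptile-style outer step: move meta-parameters toward the
    average of the parameters adapted on each task."""
    n = len(task_params_list)
    return [
        mp + meta_lr * (sum(tp[i] for tp in task_params_list) / n - mp)
        for i, mp in enumerate(meta_params)
    ]
```

In use, an inner loop would adapt a copy of the meta-parameters on each task with ordinary TD3 updates (sampling from that task's buffer), after which `meta_update` aggregates the adapted copies back into the meta-parameters.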
Pages: 1-20 (20 pages)
Related Papers (50 total)
  • [21] Dynamic Target Tracking of Autonomous Underwater Vehicle Based on Deep Reinforcement Learning
    Shi, Jiaxiang
    Fang, Jianer
    Zhang, Qizhong
    Wu, Qiuxuan
    Zhang, Botao
    Gao, Farong
    JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2022, 10 (10)
  • [22] UAV Autonomous Target Search Based on Deep Reinforcement Learning in Complex Disaster Scene
    Wu, Chunxue
    Ju, Bobo
    Wu, Yan
    Lin, Xiao
    Xiong, Naixue
    Xu, Guangquan
    Li, Hongyan
    Liang, Xuefeng
    IEEE ACCESS, 2019, 7 : 117227 - 117245
  • [23] SAR Target Recognition Based on Probabilistic Meta-Learning
    Wang, Ke
    Zhang, Gong
    Xu, Yanbing
    Leung, Henry
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2021, 18 (04) : 682 - 686
  • [24] DeepMTT: A deep learning maneuvering target-tracking algorithm based on bidirectional LSTM network
    Liu, Jingxian
    Wang, Zulin
    Xu, Mai
    INFORMATION FUSION, 2020, 53 : 289 - 304
  • [25] Energy Saving Strategy of UAV in MEC Based on Deep Reinforcement Learning
    Dai, Zhiqiang
    Xu, Gaochao
    Liu, Ziqi
    Ge, Jiaqi
    Wang, Wei
    FUTURE INTERNET, 2022, 14 (08)
  • [26] Multi-UAV simultaneous target assignment and path planning based on deep reinforcement learning in dynamic multiple obstacles environments
    Kong, Xiaoran
    Zhou, Yatong
    Li, Zhe
    Wang, Shaohai
    FRONTIERS IN NEUROROBOTICS, 2024, 17
  • [27] Deep reinforcement learning based trajectory optimization for UAV-enabled IoT with SWIPT
    Yang, Yuwen
    Liu, Xin
    AD HOC NETWORKS, 2024, 159
  • [28] Towards Continual Reinforcement Learning through Evolutionary Meta-Learning
    Grbic, Djordje
    Risi, Sebastian
    PROCEEDINGS OF THE 2019 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION (GECCO '19 COMPANION), 2019, : 119 - 120
  • [29] UAV Maneuvering Decision-Making Algorithm Based on Deep Reinforcement Learning Under the Guidance of Expert Experience
    Zhan, Guang
    Zhang, Kun
    Li, Ke
    Piao, Haiyin
    JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS, 2024, 35 (03) : 644 - 665
  • [30] Autonomous navigation of UAV in multi-obstacle environments based on a Deep Reinforcement Learning approach
    Zhang, Sitong
    Li, Yibing
    Dong, Qianhui
    APPLIED SOFT COMPUTING, 2022, 115