Autonomous Obstacle Avoidance and Target Tracking of UAV Based on Meta-Reinforcement Learning

Cited by: 0
Authors
Jiang W. [1, 2]
Wu J. [1, 2]
Wang Y. [1, 2]
Affiliations
[1] College of Electrical and Information Engineering, Hunan University, Changsha
[2] National Engineering Research Center of Robot Visual Perception & Control Technology, Hunan University, Changsha
Source
Hunan Daxue Xuebao/Journal of Hunan University Natural Sciences | 2022, Vol. 49, No. 06
Funding
National Natural Science Foundation of China
Keywords
autonomous obstacle avoidance; meta-reinforcement learning; path planning; target tracking; unmanned aerial vehicle (UAV)
DOI
10.16339/j.cnki.hdxbzkb.2022290
CLC number
Subject classification code
Abstract
Traditional deep reinforcement learning suffers from low training efficiency and weak adaptability to variable environments when applied to autonomous obstacle avoidance and target tracking for unmanned aerial vehicles (UAVs). To overcome these problems, this paper designs an internal and external meta-parameter update rule by incorporating Model-Agnostic Meta-Learning (MAML) into the Deep Deterministic Policy Gradient (DDPG) algorithm, and proposes a Meta-Deep Deterministic Policy Gradient (Meta-DDPG) algorithm in order to improve the convergence speed and generalization ability of the model. Furthermore, basic meta-task sets are constructed in the model's pre-training stage to improve the efficiency of pre-training in practical engineering. Finally, the proposed algorithm is simulated and verified in various testing environments. The results show that introducing the basic meta-task sets makes the model's pre-training more efficient, and that the Meta-DDPG algorithm has better convergence characteristics and environmental adaptability than the DDPG algorithm. Moreover, meta-learning and the basic meta-task sets are universally applicable to deterministic-policy reinforcement learning. © 2022 Hunan University. All rights reserved.
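The abstract does not state the internal and external update rule itself. As a rough illustration only, the sketch below shows the generic MAML pattern of an inner (task-level) step followed by an outer (meta-level) step on a toy one-dimensional problem. The quadratic task loss, the function names, the task optima in `centers`, and the step sizes `alpha` and `beta` are all assumptions made for this example; none of it is taken from the paper, and it is not the Meta-DDPG algorithm.

```python
# Toy MAML inner/outer loop on 1-D quadratic tasks (illustrative only;
# this is NOT the Meta-DDPG update from the paper).

def task_loss_grad(theta, center):
    """Gradient of the assumed task loss 0.5 * (theta - center)**2."""
    return theta - center


def inner_update(theta, center, alpha):
    """One inner (task-level) gradient step from the shared meta-parameter."""
    return theta - alpha * task_loss_grad(theta, center)


def meta_gradient(theta, center, alpha):
    """Gradient of the post-adaptation loss w.r.t. the meta-parameter.

    By the chain rule, d(adapted)/d(theta) = 1 - alpha for this quadratic loss.
    """
    adapted = inner_update(theta, center, alpha)
    return (1.0 - alpha) * task_loss_grad(adapted, center)


def maml_train(centers, theta=0.0, alpha=0.1, beta=0.5, steps=200):
    """Outer (meta-level) loop: average the meta-gradients over all tasks."""
    for _ in range(steps):
        grad = sum(meta_gradient(theta, c, alpha) for c in centers) / len(centers)
        theta -= beta * grad
    return theta


centers = [-1.0, 2.0, 5.0]  # hypothetical task optima
theta_star = maml_train(centers)
```

In this toy setting the meta-parameter settles at the mean of the task optima, the starting point from which a single inner step does best on average across the tasks. In an actor-critic method such as DDPG the same two-level structure would presumably act on network weights rather than a scalar.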
Pages: 101-109
Number of pages: 8