Path Planning for UAV Ground Target Tracking via Deep Reinforcement Learning

Cited by: 135
Authors
Li, Bohao
Wu, Yunjie [1]
Affiliations
[1] Beihang Univ, Sch Automat Sci & Elect Engn, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DDPG; deep reinforcement learning; obstacle avoidance; target tracking; UAV
DOI
10.1109/ACCESS.2020.2971780
Chinese Library Classification (CLC) number
TP [Automation and computer technology]
Discipline classification code
0812
Abstract
This paper studies UAV ground target tracking in environments with obstacles using deep reinforcement learning and presents an improved deep deterministic policy gradient (DDPG) algorithm. A reward function based on line of sight and the artificial potential field is constructed to guide the UAV toward target tracking, and a penalty term on the action smooths the trajectory. To improve exploration, multiple UAVs, all controlled by the same policy network, perform tasks in each episode. Because the history of observations is strongly correlated with the policy, long short-term memory networks are used to approximate the state of the environment, which improves both the approximation accuracy and the efficiency of data utilization. Simulation results show that the proposed method enables the UAV to maintain target tracking and avoid obstacles effectively.
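The abstract names three reward ingredients: a line-of-sight term, an artificial potential field term, and an action penalty. The record does not give the paper's actual formulas, so the following is only a minimal sketch of how such a reward could be composed. The function name, the gains (k_los, k_rep, k_act), desired_range, influence_radius, and the classic repulsive-potential form are all assumptions made for illustration, not the authors' definitions.

```python
import math


def tracking_reward(uav_xy, target_xy, obstacles, action,
                    desired_range=10.0, influence_radius=5.0,
                    k_los=1.0, k_rep=1.0, k_act=0.1):
    """Illustrative shaped reward (hypothetical form, not the paper's).

    uav_xy, target_xy: (x, y) positions.
    obstacles: iterable of (x, y, radius) circles.
    action: iterable of control inputs, e.g. (acceleration, turn rate).
    """
    # Line-of-sight term: penalize deviation of the UAV-target
    # distance from an assumed desired tracking range.
    los = math.dist(uav_xy, target_xy)
    r_los = -k_los * abs(los - desired_range)

    # Repulsive potential-field term: each obstacle contributes a
    # penalty that grows as the UAV approaches its surface and is
    # zero outside the assumed influence radius.
    r_rep = 0.0
    for ox, oy, radius in obstacles:
        d = max(math.dist(uav_xy, (ox, oy)) - radius, 1e-6)
        if d < influence_radius:
            r_rep -= 0.5 * k_rep * (1.0 / d - 1.0 / influence_radius) ** 2

    # Action penalty: a quadratic cost on control effort, which
    # discourages abrupt maneuvers and so favors smoother paths.
    r_act = -k_act * sum(a * a for a in action)

    return r_los + r_rep + r_act
```

In a DDPG training loop, a function like this would be evaluated once per UAV per step, and the relative sizes of the three gains would decide how tracking accuracy is traded against obstacle clearance and smoothness.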
Pages: 29064-29074
Page count: 11