End-to-End Active Object Tracking and Its Real-World Deployment via Reinforcement Learning

Cited by: 84
Authors
Luo, Wenhan [1]
Sun, Peng [1]
Zhong, Fangwei [2, 3]
Liu, Wei [1]
Zhang, Tong [1]
Wang, Yizhou [2, 3]
Affiliations
[1] Tencent AI Lab, Shenzhen 518057, Peoples R China
[2] Peking Univ, Natl Engn Lab Video Technol, Key Lab Machine Percept MoE, Comp Sci Dept, Beijing 100871, Peoples R China
[3] Peng Cheng Lab, Cooperat Medianet Innovat Ctr, Shenzhen, Peoples R China
Keywords
Object tracking; Cameras; Target tracking; Reinforcement learning; Robot vision systems; Active object tracking; reinforcement learning; environment augmentation; VISUAL TRACKING; NETWORKS;
DOI
10.1109/TPAMI.2019.2899570
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We study active object tracking, where a tracker takes visual observations (i.e., frame sequences) as input and produces the corresponding camera control signals as output (e.g., move forward, turn left, etc.). Conventional methods tackle tracking and camera control tasks separately, and the resulting system is difficult to tune jointly. These methods also require significant human effort for image labeling and expensive trial-and-error system tuning in the real world. To address these issues, we propose, in this paper, an end-to-end solution via deep reinforcement learning. A ConvNet-LSTM function approximator is adopted for the direct frame-to-action prediction. We further propose an environment augmentation technique and a customized reward function, which are crucial for successful training. The tracker trained in simulators (ViZDoom and Unreal Engine) demonstrates good generalization behaviors in the case of unseen object moving paths, unseen object appearances, unseen backgrounds, and distracting objects. The system is robust and can restore tracking after occasional loss of the target being tracked. We also find that the tracking ability, obtained solely from simulators, can potentially transfer to real-world scenarios. We demonstrate successful examples of such transfer, via experiments over the VOT dataset and the deployment of a real-world robot using the proposed active tracker trained in simulation.
Pages: 1317-1332
Number of pages: 16