Learning transform-aware attentive network for object tracking

Cited by: 28

Authors:

Lu, Xiankai [1]

Ni, Bingbing [1]

Ma, Chao [1]

Yang, Xiaokang [1]

Affiliations:

[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200240, Peoples R China

Source:

NEUROCOMPUTING | 2019 / Vol. 349

Keywords:

Transform-aware; Visual attention; Spatial Transformer Networks; Object tracking; VISUAL TRACKING; FILTER;

DOI:

10.1016/j.neucom.2019.02.021

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Existing trackers often decompose the task of visual tracking into multiple independent components, such as target appearance sampling, classifier learning, and target state inference. In this paper, we present a transform-aware attentive tracking framework, which uses a deep attentive network to directly predict the target states via spatial transform parameters. During off-line training, the proposed network learns generic motion patterns of target objects from auxiliary large-scale videos. These learned motion patterns are then applied to track target objects on test sequences. Built on the Spatial Transformer Network (STN), the proposed attentive network is fully differentiable and can be trained in an end-to-end manner. Notably, we only fine-tune the pre-trained network in the initial frame. The proposed tracker requires neither online model update nor appearance sampling during the tracking process. Extensive experiments on OTB-2013, OTB-2015, VOT-2014 and UAV-123 datasets demonstrate the competitive performance of our method against state-of-the-art attentive tracking methods. © 2019 Published by Elsevier B.V.
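
The abstract states that the target state is predicted via spatial transform parameters. As an illustrative sketch only (not the authors' implementation), the snippet below shows one way a 2x3 affine matrix in the style of a Spatial Transformer Network could be read as a target bounding box. The function name `attended_box`, the `[-1, 1]` normalized grid, and the normalized-to-pixel mapping are all assumptions made for this example; the record does not specify them.

```python
def attended_box(theta, width, height):
    """Return the axis-aligned pixel box (x_min, y_min, x_max, y_max)
    covered by the affine-transformed sampling grid.

    theta  : 2x3 affine matrix [[a, b, tx], [c, d, ty]] (hypothetical layout)
    width  : frame width in pixels
    height : frame height in pixels
    """
    (a, b, tx), (c, d, ty) = theta
    # Corners of the normalized output grid, assumed to span [-1, 1] x [-1, 1].
    corners = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
    xs, ys = [], []
    for gx, gy in corners:
        # Affine map from output grid coordinates to normalized frame coordinates.
        nx = a * gx + b * gy + tx
        ny = c * gx + d * gy + ty
        # Assumed convention: -1 maps to pixel 0 and +1 maps to the frame size.
        xs.append((nx + 1.0) * 0.5 * width)
        ys.append((ny + 1.0) * 0.5 * height)
    return min(xs), min(ys), max(xs), max(ys)


# A half-scale, centered transform attends to the central region of the frame.
box = attended_box([[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]], width=200, height=100)
print(box)
```

Under these assumptions, the scale terms control the box size and the translation terms shift its center, so a network that regresses `theta` for each frame describes the target location without sampling candidate boxes.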

Pages: 133-144

Number of pages: 12

References: 65 in total
