Learning transform-aware attentive network for object tracking

Cited by: 28
Authors
Lu, Xiankai [1]
Ni, Bingbing [1]
Ma, Chao [1]
Yang, Xiaokang [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200240, Peoples R China
Keywords
Transform-aware; Visual attention; Spatial Transformer Networks; Object tracking; VISUAL TRACKING; FILTER;
DOI
10.1016/j.neucom.2019.02.021
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing trackers often decompose the task of visual tracking into multiple independent components, such as target appearance sampling, classifier learning, and target state inference. In this paper, we present a transform-aware attentive tracking framework, which uses a deep attentive network to directly predict the target states via spatial transform parameters. During off-line training, the proposed network learns generic motion patterns of target objects from auxiliary large-scale videos. These learned motion patterns are then applied to track target objects on test sequences. Built on the Spatial Transformer Network (STN), the proposed attentive network is fully differentiable and can be trained in an end-to-end manner. Notably, we only fine-tune the pre-trained network in the initial frame. The proposed tracker requires neither online model update nor appearance sampling during the tracking process. Extensive experiments on the OTB-2013, OTB-2015, VOT-2014 and UAV-123 datasets demonstrate the competitive performance of our method against state-of-the-art attentive tracking methods. © 2019 Published by Elsevier B.V.
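The abstract describes the target state as a set of spatial transform parameters consumed by an STN-style sampler. The following is a minimal NumPy sketch of that sampling step only, under stated assumptions: a 2x3 affine matrix `theta` in normalized [-1, 1] coordinates, a single-channel image, and border clamping. The function names `affine_grid` and `bilinear_sample` are illustrative and are not taken from the paper; the network that predicts `theta` is not shown.

```python
import numpy as np

def affine_grid(theta, out_h, out_w):
    """Map a regular output grid through a 2x3 affine matrix.

    Coordinates are normalized to [-1, 1] (an assumption of this sketch).
    Returns an (out_h, out_w, 2) array of source (x, y) locations.
    """
    ys = np.linspace(-1.0, 1.0, out_h)
    xs = np.linspace(-1.0, 1.0, out_w)
    gx, gy = np.meshgrid(xs, ys)                  # each (out_h, out_w)
    coords = np.stack([gx, gy, np.ones_like(gx)], axis=-1)
    return coords @ theta.T                       # (out_h, out_w, 2)

def bilinear_sample(image, grid):
    """Bilinearly sample a 2-D image at normalized grid locations."""
    h, w = image.shape
    # Convert normalized coordinates to pixel coordinates, clamped to the border.
    x = np.clip((grid[..., 0] + 1.0) * (w - 1) / 2.0, 0, w - 1)
    y = np.clip((grid[..., 1] + 1.0) * (h - 1) / 2.0, 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

# Hypothetical usage: a scale of 0.5 with zero translation attends to the
# central region of a 5x5 frame and resamples it to 3x3.
image = np.arange(25, dtype=float).reshape(5, 5)
theta = np.array([[0.5, 0.0, 0.0],
                  [0.0, 0.5, 0.0]])
crop = bilinear_sample(image, affine_grid(theta, 3, 3))
```

Because both steps are compositions of differentiable operations with respect to `theta`, a sampler of this kind can sit inside a network trained end to end, which is the property the abstract attributes to the STN-based design.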
Pages: 133-144
Number of pages: 12