Adversarial Feature Sampling Learning for Efficient Visual Tracking

Cited by: 9
Authors
Yin, Yingjie [1 ,2 ,3 ]
Xu, De [1 ,3 ]
Wang, Xingang [1 ,3 ]
Zhang, Lei [2 ]
Affiliations
[1] Chinese Acad Sci, Res Ctr Precis Sensing & Control, Inst Automat, Beijing 100190, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adversarial learning; deep convolution neural network; feature sampling; visual tracking; OBJECT TRACKING;
DOI
10.1109/TASE.2019.2948402
CLC Classification Code
TP [Automation Technology; Computer Technology];
Discipline Code
0812;
Abstract
The tracking-by-detection framework usually consists of two stages: drawing samples around the target object and classifying each sample as either the target object or background. Current popular trackers under this framework typically draw many samples from the raw image and feed them into deep neural networks, resulting in a high computational burden and low tracking speed. In this article, we propose an adversarial feature sampling learning (AFSL) method to address this problem. A convolutional neural network is designed that takes only one cropped image around the target object as input, and samples are collected from the feature maps with spatial bilinear resampling. To enrich the appearance variations of positive samples in the feature space, which has limited spatial resolution, we fuse high-level and low-level features to better describe the target by using a generative adversarial network. Extensive experiments on benchmark data sets demonstrate that the proposed AFSL achieves leading tracking accuracy while significantly accelerating the speed of tracking-by-detection trackers. Note to Practitioners-Visual tracking can be applied in many intelligent automation systems, such as robotic navigation and intelligent human-computer interaction systems. In a robotic navigation system, visual tracking can generate the target's motion trajectory from image sequences; in a human-computer interaction system, it can automatically obtain body-movement information during the interactive process. Accuracy and speed are two key indicators for visual tracking, and intelligent automation systems usually need a tracker with both higher accuracy and faster speed. This article aims to develop a fast and accurate tracking method by adversarial feature sampling learning (AFSL).
In the concrete implementation, AFSL draws samples in the feature space rather than from raw images, which reduces computation. An adversarial learning mechanism is then adopted to boost the sampled features and enrich the target appearances in the feature space, improving tracking accuracy. The proposed tracker is shown to maintain leading tracking accuracy while significantly accelerating the tracking speed.
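The abstract's key efficiency idea is to crop the search region once, run it through the network once, and then extract each candidate's features by bilinearly resampling the shared feature map instead of re-running the network per sample. The sketch below illustrates that resampling step in plain NumPy; the function names, the 3x3 sampling grid, and the box format are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def bilinear_sample(feat, ys, xs):
    # Bilinearly interpolate feat (C, H, W) at fractional (y, x) locations.
    C, H, W = feat.shape
    y0 = np.clip(np.floor(ys).astype(int), 0, H - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, W - 2)
    dy, dx = ys - y0, xs - x0
    # Weighted sum of the four neighbouring feature vectors.
    return (feat[:, y0, x0] * (1 - dy) * (1 - dx)
            + feat[:, y0 + 1, x0] * dy * (1 - dx)
            + feat[:, y0, x0 + 1] * (1 - dy) * dx
            + feat[:, y0 + 1, x0 + 1] * dy * dx)

def sample_box_features(feat, box, grid=3):
    # Pool a grid x grid patch of bilinearly resampled features inside `box`,
    # given as (y1, x1, y2, x2) in feature-map coordinates (assumed format).
    y1, x1, y2, x2 = box
    yy, xx = np.meshgrid(np.linspace(y1, y2, grid),
                         np.linspace(x1, x2, grid), indexing="ij")
    out = bilinear_sample(feat, yy.ravel(), xx.ravel())
    return out.reshape(feat.shape[0], grid, grid)

# One shared feature map for the whole search region; many cheap samples.
feat = np.random.rand(64, 32, 32).astype(np.float32)
boxes = [(4.5, 4.5, 12.5, 12.5), (5.0, 6.0, 13.0, 14.0)]
samples = [sample_box_features(feat, b) for b in boxes]
print(samples[0].shape)  # (64, 3, 3)
```

Because each candidate box costs only a handful of interpolations rather than a full forward pass, adding more samples is nearly free, which is the source of the speedup the abstract claims.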
Pages: 847-857
Page count: 11