Real-Time Correlation Tracking Via Joint Model Compression and Transfer

Cited by: 39
Authors
Wang, Ning [1]
Zhou, Wengang [1]
Song, Yibing [2]
Ma, Chao [3]
Li, Houqiang [1]
Affiliations
[1] Univ Sci & Technol China, Dept Elect Engn & Informat Sci, CAS Key Lab Technol Geospatial Informat Proc & Ap, Hefei 230026, Anhui, Peoples R China
[2] Tencent AI Lab, Shenzhen 518000, Peoples R China
[3] Shanghai Jiao Tong Univ, MoE Key Lab Artificial Intelligence, AI Inst, Shanghai 200240, Peoples R China
Keywords
Correlation tracking; model transfer; knowledge distillation; real-time tracking; object tracking
DOI
10.1109/TIP.2020.2989544
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Correlation filters (CF) have received considerable attention in visual tracking because of their computational efficiency. By leveraging deep features from off-the-shelf CNN models (e.g., VGG), CF trackers achieve state-of-the-art performance, but at the cost of substantial computing resources. This prevents deep CF trackers from being deployed on many mobile platforms where only a single-core CPU is available. In this paper, we propose to jointly compress and transfer off-the-shelf CNN models within a knowledge distillation framework. We treat a CNN model pretrained on the image classification task as a teacher network, and distill this teacher network into a lightweight student network that serves as the feature extractor to speed up CF trackers. In the distillation process, we propose a fidelity loss that enables the student network to maintain the representation capability of the teacher network. Meanwhile, we design a tracking loss to adapt the objective of the student network from object recognition to visual tracking. The distillation process is performed offline on multiple layers, and the student network is then adaptively updated using a background-aware online learning scheme. This online adaptation stage exploits the background contents to improve the feature discrimination of the student network. Extensive experiments on six standard datasets demonstrate that the lightweight student network accelerates state-of-the-art deep CF trackers to real-time speed on a single-core CPU while maintaining almost the same tracking accuracy.
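The abstract names two distillation terms, a fidelity loss and a tracking loss, but this record does not give their formulas. The following is only a minimal sketch of how such a combined objective could look, under loudly stated assumptions: both terms are taken to be mean squared errors, features are 1-D lists of equal length, and the tracking term compares a circular-correlation response against a Gaussian-shaped label. All function names and the `weight` parameter are hypothetical and are not taken from the paper.

```python
import math


def fidelity_loss(student_feat, teacher_feat):
    """Assumed form: mean squared error between student and teacher features."""
    assert len(student_feat) == len(teacher_feat)
    n = len(student_feat)
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / n


def circular_correlation(feat, filt):
    """Filter response over every cyclic shift of a 1-D feature."""
    n = len(feat)
    return [sum(feat[(i + k) % n] * filt[k] for k in range(n)) for i in range(n)]


def gaussian_label(n, center, sigma=1.0):
    """Gaussian-shaped regression target peaked at `center` (cyclic distance)."""
    label = []
    for i in range(n):
        d = min(abs(i - center), n - abs(i - center))
        label.append(math.exp(-(d * d) / (2.0 * sigma ** 2)))
    return label


def tracking_loss(student_feat, filt, label):
    """Assumed form: mean squared error between the response and the label."""
    response = circular_correlation(student_feat, filt)
    return sum((r - y) ** 2 for r, y in zip(response, label)) / len(label)


def distillation_loss(student_feat, teacher_feat, filt, label, weight=1.0):
    """Hypothetical joint objective: fidelity term plus weighted tracking term."""
    return fidelity_loss(student_feat, teacher_feat) + weight * tracking_loss(
        student_feat, filt, label
    )
```

In this sketch the fidelity term alone would only ask the student to mimic the teacher, while the tracking term pushes the student features toward ones that a correlation filter can regress to a peaked response. A real system would operate on multi-channel 2-D feature maps and would typically need to reconcile differing teacher and student channel counts.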
Pages: 6123-6135
Number of pages: 13
Related papers
71 in total
  • [1] [Anonymous], 2014, P BRIT MACH VIS C BM
  • [2] [Anonymous], 2016, INT C LEARNING REPRE
  • [3] [Anonymous], 2018, P ECCV
  • [4] Ba LJ, 2014, ADV NEUR IN, V27
  • [5] Staple: Complementary Learners for Real-Time Tracking
    Bertinetto, Luca
    Valmadre, Jack
    Golodetz, Stuart
    Miksik, Ondrej
    Torr, Philip H. S.
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 1401 - 1409
  • [6] Fully-Convolutional Siamese Networks for Object Tracking
    Bertinetto, Luca
    Valmadre, Jack
    Henriques, Joao F.
    Vedaldi, Andrea
    Torr, Philip H. S.
    [J]. COMPUTER VISION - ECCV 2016 WORKSHOPS, PT II, 2016, 9914 : 850 - 865
  • [7] Unveiling the Power of Deep Tracking
    Bhat, Goutam
    Johnander, Joakim
    Danelljan, Martin
    Khan, Fahad Shahbaz
    Felsberg, Michael
    [J]. COMPUTER VISION - ECCV 2018, PT II, 2018, 11206 : 493 - 509
  • [8] Bolme DS, 2010, PROC CVPR IEEE, P2544, DOI 10.1109/CVPR.2010.5539960
  • [9] The devil is in the details: an evaluation of recent feature encoding methods
    Chatfield, Ken
    Lempitsky, Victor
    Vedaldi, Andrea
    Zisserman, Andrew
    [J]. PROCEEDINGS OF THE BRITISH MACHINE VISION CONFERENCE 2011, 2011,
  • [10] Context-aware Deep Feature Compression for High-speed Visual Tracking
    Choi, Jongwon
    Chang, Hyung Jin
    Fischer, Tobias
    Yun, Sangdoo
    Lee, Kyuewang
    Jeong, Jiyeoup
    Demiris, Yiannis
    Choi, Jin Young
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 479 - 488