Siamese Box Adaptive Network for Visual Tracking

Cited by: 676
Authors
Chen, Zedu [1]
Zhong, Bineng [1,6]
Li, Guorong [2]
Zhang, Shengping [3,4]
Ji, Rongrong [4,5]
Affiliations
[1] Huaqiao Univ, Dept Comp Sci & Technol, Quanzhou, Peoples R China
[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing, Peoples R China
[3] Harbin Inst Technol, Harbin, Heilongjiang, Peoples R China
[4] Peng Cheng Lab, Shenzhen, Guangdong, Peoples R China
[5] Xiamen Univ, Sch Informat, Dept Artificial Intelligence, Xiamen, Fujian, Peoples R China
[6] Nanjing Univ Sci & Technol, Key Lab Intelligent Percept & Syst High Dimens In, Minist Educ, Nanjing, Jiangsu, Peoples R China
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2020
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR42600.2020.00670
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Most existing trackers rely on either a multi-scale searching scheme or pre-defined anchor boxes to accurately estimate the scale and aspect ratio of a target. Unfortunately, they typically call for tedious and heuristic configurations. To address this issue, we propose a simple yet effective visual tracking framework (named Siamese Box Adaptive Network, SiamBAN) by exploiting the expressive power of the fully convolutional network (FCN). SiamBAN views the visual tracking problem as a parallel classification and regression problem, and thus directly classifies objects and regresses their bounding boxes in a unified FCN. The no-prior box design avoids hyper-parameters associated with the candidate boxes, making SiamBAN more flexible and general. Extensive experiments on visual tracking benchmarks including VOT2018, VOT2019, OTB100, NFS, UAV123, and LaSOT demonstrate that SiamBAN achieves state-of-the-art performance and runs at 40 FPS, confirming its effectiveness and efficiency. The code will be available at https://github.com/hqucv/siamban.
Pages: 6667 - 6676
Number of pages: 10
Related papers
52 entries in total
  • [1] Fully-Convolutional Siamese Networks for Object Tracking
    Bertinetto, Luca
    Valmadre, Jack
    Henriques, Joao F.
    Vedaldi, Andrea
    Torr, Philip H. S.
    [J]. COMPUTER VISION - ECCV 2016 WORKSHOPS, PT II, 2016, 9914 : 850 - 865
  • [2] Learning Discriminative Model Prediction for Tracking
    Bhat, Goutam
    Danelljan, Martin
    Van Gool, Luc
    Timofte, Radu
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019 : 6181 - 6190
  • [3] Unveiling the Power of Deep Tracking
    Bhat, Goutam
    Johnander, Joakim
    Danelljan, Martin
    Khan, Fahad Shahbaz
    Felsberg, Michael
    [J]. COMPUTER VISION - ECCV 2018, PT II, 2018, 11206 : 493 - 509
  • [4] DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
    Chen, Liang-Chieh
    Papandreou, George
    Kokkinos, Iasonas
    Murphy, Kevin
    Yuille, Alan L.
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) : 834 - 848
  • [5] ATOM: Accurate Tracking by Overlap Maximization
    Danelljan, Martin
    Bhat, Goutam
    Khan, Fahad Shahbaz
    Felsberg, Michael
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019 : 4655 - 4664
  • [6] ECO: Efficient Convolution Operators for Tracking
    Danelljan, Martin
    Bhat, Goutam
    Khan, Fahad Shahbaz
    Felsberg, Michael
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017 : 6931 - 6939
  • [7] Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking
    Danelljan, Martin
    Robinson, Andreas
    Khan, Fahad Shahbaz
    Felsberg, Michael
    [J]. COMPUTER VISION - ECCV 2016, PT V, 2016, 9909 : 472 - 488
  • [8] Learning Spatially Regularized Correlation Filters for Visual Tracking
    Danelljan, Martin
    Hager, Gustav
    Khan, Fahad Shahbaz
    Felsberg, Michael
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015 : 4310 - 4318
  • [9] Siamese Cascaded Region Proposal Networks for Real-Time Visual Tracking
    Fan, Heng
    Ling, Haibin
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019 : 7944 - 7953
  • [10] LaSOT: A High-quality Benchmark for Large-scale Single Object Tracking
    Fan, Heng
    Lin, Liting
    Yang, Fan
    Chu, Peng
    Deng, Ge
    Yu, Sijia
    Bai, Hexin
    Xu, Yong
    Liao, Chunyuan
    Ling, Haibin
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019 : 5369 - 5378