SiamCPN: Visual tracking with the Siamese center-prediction network

Cited by: 7
Authors
Chen, Dong [1, 2, 4]
Tang, Fan [3]
Dong, Weiming [1, 2, 4]
Yao, Hanxing [4, 5]
Xu, Changsheng [1, 2, 4]
Affiliations
[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100040, Peoples R China
[2] Chinese Acad Sci, Inst Automat, NLPR, Beijing 100190, Peoples R China
[3] Jilin Univ, Sch Artificial Intelligence, Changchun 130012, Peoples R China
[4] CASIA LLVISION Joint Lab, Beijing 100190, Peoples R China
[5] LLVISION Technol Co LTD, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Siamese network; single object tracking; anchor-free; center point detection; object tracking
DOI
10.1007/s41095-021-0212-1
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Object detection is widely used in object tracking, and anchor-free object tracking provides an end-to-end approach to single-object tracking. In this study, we propose a new anchor-free network, the Siamese center-prediction network (SiamCPN). Given the features of the reference object in the initial frame, we directly predict the center point and size of the object in subsequent frames with a Siamese-structure network, without the need for per-frame post-processing operations. Unlike other anchor-free tracking approaches, which are based on semantic segmentation and achieve anchor-free tracking by pixel-level prediction, SiamCPN directly obtains all the information required for tracking, which greatly simplifies the model. A center-prediction sub-network is applied to multiple stages of the backbone to adaptively learn from the experience of different branches of the Siamese net. The model can accurately predict the object location, apply appropriate corrections, and regress the size of the target bounding box. Compared to other leading Siamese networks, SiamCPN is simpler, faster, and more efficient, as it uses fewer hyperparameters. Experiments demonstrate that our method outperforms other leading Siamese networks on the GOT-10K and UAV123 benchmarks, and is comparable to other excellent trackers on LaSOT, VOT2016, and OTB-100 while improving inference speed by a factor of 1.5 to 2.
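The abstract describes three outputs: an object location, a correction to that location, and a regressed bounding-box size. The sketch below illustrates how such outputs could be decoded into a box, in the style of generic center-point detectors. It is not the authors' implementation: the function name `decode_center_prediction`, the separate offset and size maps, and the stride of 8 are all assumptions made for illustration, and the multi-stage design mentioned in the abstract is not modeled.

```python
from typing import List, Tuple

Grid = List[List[float]]  # a 2D map indexed as grid[row][col]


def decode_center_prediction(
    heatmap: Grid,
    offset_x: Grid,
    offset_y: Grid,
    width: Grid,
    height: Grid,
    stride: int = 8,  # assumed down-sampling factor, not taken from the paper
) -> Tuple[float, float, float, float, float]:
    """Return (x1, y1, x2, y2, score) in search-image pixels."""
    # 1. Locate the peak of the center heatmap.
    best_score, best_row, best_col = float("-inf"), 0, 0
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best_score:
                best_score, best_row, best_col = score, r, c

    # 2. Correct the coarse grid location with the predicted sub-cell
    #    offset, then map it back to pixel coordinates.
    cx = (best_col + offset_x[best_row][best_col]) * stride
    cy = (best_row + offset_y[best_row][best_col]) * stride

    # 3. Read the regressed box size at the peak location.
    w = width[best_row][best_col]
    h = height[best_row][best_col]

    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, best_score)
```

Because the box comes from a single peak lookup, a decoder of this kind needs no anchor boxes and no per-frame candidate ranking, which is consistent with the abstract's claim of fewer hyperparameters.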
Pages: 253-265
Number of pages: 13