SiamCPN: Visual tracking with the Siamese center-prediction network

Cited by: 7

Authors:

Chen, Dong [1,2,4]

Tang, Fan [3]

Dong, Weiming [1,2,4]

Yao, Hanxing [4,5]

Xu, Changsheng [1,2,4]

Affiliations:

[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100040, Peoples R China

[2] Chinese Acad Sci, Inst Automat, NLPR, Beijing 100190, Peoples R China

[3] Jilin Univ, Sch Artificial Intelligence, Changchun 130012, Peoples R China

[4] CASIA LLVISION Joint Lab, Beijing 100190, Peoples R China

[5] LLVISION Technol Co LTD, Beijing 100190, Peoples R China

Source:

Computational Visual Media | 2021, Vol. 7, No. 2

Funding:

National Natural Science Foundation of China

Keywords:

Siamese network; single object tracking; anchor-free; center point detection; object tracking

DOI:

10.1007/s41095-021-0212-1

Chinese Library Classification (CLC) number:

TP31 [Computer software]

Discipline classification codes:

081202; 0835

Abstract:

Object detection is widely used in object tracking; anchor-free object tracking provides an end-to-end single-object-tracking approach. In this study, we propose a new anchor-free network, the Siamese center-prediction network (SiamCPN). Given the presence of referenced object features in the initial frame, we directly predict the center point and size of the object in subsequent frames in a Siamese-structure network without the need for per-frame post-processing operations. Unlike other anchor-free tracking approaches that are based on semantic segmentation and achieve anchor-free tracking by pixel-level prediction, SiamCPN directly obtains all information required for tracking, greatly simplifying the model. A center-prediction sub-network is applied to multiple stages of the backbone to adaptively learn from the experience of different branches of the Siamese net. The model can accurately predict object location, implement appropriate corrections, and regress the size of the target bounding box. Compared to other leading Siamese networks, SiamCPN is simpler, faster, and more efficient as it uses fewer hyperparameters. Experiments demonstrate that our method outperforms other leading Siamese networks on the GOT-10K and UAV123 benchmarks, and is comparable to other excellent trackers on LaSOT, VOT2016, and OTB-100 while improving inference speed by a factor of 1.5 to 2.
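The abstract describes three predicted quantities: the object's center location, a correction to that location, and the bounding-box size. As a rough illustration only, the sketch below shows how such outputs could be decoded into a box in a center-point style tracker. It is not the authors' implementation: the function name `decode_center_prediction`, the map layout (a center heatmap plus per-cell offset and size maps), and the `stride` value are all assumptions made for this example.

```python
from typing import List, Tuple

Grid = List[List[float]]  # hypothetical 2-D map, indexed as grid[row][col]


def decode_center_prediction(
    heatmap: Grid,
    offset_x: Grid,
    offset_y: Grid,
    width: Grid,
    height: Grid,
    stride: float = 8.0,  # assumed feature-map stride, not taken from the paper
) -> Tuple[Tuple[float, float, float, float], float]:
    """Turn center, offset, and size maps into one (x1, y1, x2, y2) box."""
    # 1. Pick the cell with the highest center score.
    best_row, best_col, best_score = 0, 0, float("-inf")
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best_score:
                best_row, best_col, best_score = r, c, score

    # 2. Refine the coarse cell index with the predicted sub-cell offset,
    #    then map back to search-image pixels.
    cx = (best_col + offset_x[best_row][best_col]) * stride
    cy = (best_row + offset_y[best_row][best_col]) * stride

    # 3. Read the regressed box size at the same cell (assumed to be in pixels).
    w = width[best_row][best_col]
    h = height[best_row][best_col]

    box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    return box, best_score


if __name__ == "__main__":
    hm = [[0.1, 0.2, 0.1], [0.0, 0.3, 0.9], [0.1, 0.1, 0.2]]
    const = lambda v: [[v] * 3 for _ in range(3)]
    print(decode_center_prediction(hm, const(0.5), const(0.25), const(40.0), const(20.0)))
```

Because the box comes from a single peak lookup, a decoder of this shape needs no anchor boxes, which is consistent with the abstract's claim of fewer hyperparameters.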

Pages: 253-265

Page count: 13
