A Twofold Siamese Network for Real-Time Object Tracking

被引:567
作者
He, Anfeng [1 ,3 ]
Luo, Chong [2 ]
Tian, Xinmei [1 ]
Zeng, Wenjun [2 ]
机构
[1] Univ Sci & Technol China, CAS Key Lab Technol Geospatial Informat Proc & Ap, Hefei, Anhui, Peoples R China
[2] Microsoft Res, Beijing, Peoples R China
[3] MSRA, Beijing, Peoples R China
来源
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018年
关键词
D O I
10.1109/CVPR.2018.00508
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Observing that Semantic features learned in an image classification task and Appearance features learned in a similarity matching task complement each other, we build a twofold Siamese network, named SA-Siam, for real-time object tracking. SA-Siam is composed of a semantic branch and an appearance branch. Each branch is a similarity-learning Siamese network. An important design choice in SA-Siam is to separately train the two branches to keep the heterogeneity of the two types of features. In addition, we propose a channel attention mechanism for the semantic branch. Channel-wise weights are computed according to the channel activations around the target position. While the inherited architecture from SiamFC [3] allows our tracker to operate beyond real-time, the twofold design and the attention mechanism significantly improve the tracking performance. The proposed SA-Siam outperforms all other real-time trackers by a large margin on OTB-2013/50/100 benchmarks.
引用
收藏
页码:4834 / 4843
页数:10
相关论文
共 41 条
[1]  
Abadi M., 2016, TENSORFLOW LARGESCAL
[2]  
[Anonymous], 2017, arXiv preprint arXiv:1704.04057
[3]  
[Anonymous], 2016, PROC CVPR IEEE, DOI DOI 10.1109/CVPR.2016.465
[4]  
[Anonymous], 2015, ARXIV151008660
[5]   Staple: Complementary Learners for Real-Time Tracking [J].
Bertinetto, Luca ;
Valmadre, Jack ;
Golodetz, Stuart ;
Miksik, Ondrej ;
Torr, Philip H. S. .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :1401-1409
[6]   Fully-Convolutional Siamese Networks for Object Tracking [J].
Bertinetto, Luca ;
Valmadre, Jack ;
Henriques, Joao F. ;
Vedaldi, Andrea ;
Torr, Philip H. S. .
COMPUTER VISION - ECCV 2016 WORKSHOPS, PT II, 2016, 9914 :850-865
[7]   Attentional Correlation Filter Network for Adaptive Visual Tracking [J].
Choi, Jongwon ;
Chang, Hyung Jin ;
Yun, Sangdoo ;
Fischer, Tobias ;
Demiris, Yiannis ;
Choi, Jin Young .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :4828-4837
[8]   Visual Tracking Using Attention-Modulated Disintegration and Integration [J].
Choi, Jongwon ;
Chang, Hyung Jin ;
Jeong, Jiyeoup ;
Demiris, Yiannis ;
Choi, Jin Young .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :4321-4330
[9]   ECO: Efficient Convolution Operators for Tracking [J].
Danelljan, Martin ;
Bhat, Goutam ;
Khan, Fahad Shahbaz ;
Felsberg, Michael .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :6931-6939
[10]   Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking [J].
Danelljan, Martin ;
Robinson, Andreas ;
Khan, Fahad Shahbaz ;
Felsberg, Michael .
COMPUTER VISION - ECCV 2016, PT V, 2016, 9909 :472-488