A New Siamese Heterogeneous Convolutional Neural Networks Based on Attention Mechanism and Feature Pyramid

Citations: 9
Authors
Lu, Zhenyu [1]
Bian, Yuelou [2]
Yang, Tingya [3]
Ge, Quanbo [4]
Wang, Yuanliang [5]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch AI, Nanjing 210044, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, Sch Elect & Informat Engn, Nanjing 210044, Peoples R China
[3] Jiangsu Meteorol Observ, Nanjing 210008, Peoples R China
[4] Nanjing Univ Informat Sci & Technol, Sch Automat, Nanjing 210044, Peoples R China
[5] Shanghai Maritime Univ, Sch Logist Engn, Shanghai 201306, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Convolution; Object tracking; Deep learning; Radio frequency; Neural networks; Kernel; Attention mechanism; Feature fusion; Heterogeneous convolution kernel (HetConv); Siamese network; Correlation filter; Visual tracking
DOI
10.1109/TCYB.2022.3207431
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Accuracy and speed are the most important metrics for evaluating object tracking algorithms. However, in a deep fully convolutional neural network (CNN), tracking with deep features tends to drift because of convolution padding, the receptive field (RF), and the overall network stride, and the tracker also becomes slower. This article proposes a fully convolutional siamese network tracking algorithm that combines an attention mechanism with a feature pyramid network (FPN) and uses heterogeneous convolution kernels to reduce the amount of computation (FLOPs) and the number of parameters. The tracker first extracts image features with a new fully convolutional network and introduces a channel attention mechanism during feature extraction to improve the representation ability of the convolutional features. It then uses the FPN to fuse high-layer and low-layer convolutional features, learns the similarity of the fused features, and trains the fully convolutional network. Finally, heterogeneous convolution kernels replace the standard convolution kernels to speed up the algorithm, making up for the efficiency loss caused by the feature pyramid. The tracker is experimentally verified and analyzed on the VOT-2017, VOT-2018, OTB-2013, and OTB-2015 datasets. The results show that it achieves better results than state-of-the-art trackers.
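The saving from heterogeneous convolution kernels can be illustrated with a small cost count. The sketch below assumes the usual HetConv layout, in which each filter applies K x K kernels to 1/P of the input channels and 1 x 1 kernels to the rest; the abstract does not state the part value P or any layer sizes, so the function names and the numbers used with them are hypothetical and chosen only for illustration.

```python
def standard_conv_flops(out_size, in_channels, out_channels, kernel):
    """Multiply-accumulate count of a standard convolution layer.

    Assumes a square out_size x out_size output map and a square kernel.
    """
    return out_size * out_size * in_channels * out_channels * kernel * kernel


def hetconv_flops(out_size, in_channels, out_channels, kernel, part):
    """Multiply-accumulate count under the assumed HetConv layout.

    Each filter keeps in_channels / part kernels of size kernel x kernel
    and uses 1 x 1 kernels for the remaining input channels.
    """
    if in_channels % part != 0:
        raise ValueError("in_channels must be divisible by part")
    large = in_channels // part      # channels seen through K x K kernels
    small = in_channels - large      # channels seen through 1 x 1 kernels
    per_filter = large * kernel * kernel + small
    return out_size * out_size * out_channels * per_filter


def reduction_ratio(kernel, part):
    """Closed-form ratio of HetConv cost to standard convolution cost."""
    return 1.0 / part + (1.0 - 1.0 / part) / (kernel * kernel)


if __name__ == "__main__":
    # Hypothetical layer: 16 x 16 output, 64 -> 128 channels, 3 x 3 kernels, P = 4.
    std = standard_conv_flops(16, 64, 128, 3)
    het = hetconv_flops(16, 64, 128, 3, 4)
    print(std, het, het / std, reduction_ratio(3, 4))
```

With P = 1 the count reduces to the standard convolution, and a larger P moves more channels to the cheaper 1 x 1 kernels, which is the sense in which the heterogeneous kernels can offset the extra cost of the feature pyramid.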
Pages: 13-24
Page count: 12