A fast long-term visual tracking algorithm based on deep learning

Cited by: 0
Authors
Hou, Zhiqiang [1 ,2 ]
Ma, Jingyuan [1 ,2 ]
Han, Ruoxue [1 ,2 ]
Ma, Sugang [1 ,2 ]
Yu, Wangsheng [3 ]
Fan, Jiulun [1 ]
Affiliations
[1] School of Computer Science and Technology, Xi’an University of Posts and Telecommunications, Xi’an
[2] Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi’an University of Posts and Telecommunications, Xi’an
[3] College of Information and Navigation, Air Force Engineering University, Xi’an
Source
Beijing Hangkong Hangtian Daxue Xuebao / Journal of Beijing University of Aeronautics and Astronautics | 2024, Vol. 50, No. 08
Funding
National Natural Science Foundation of China;
Keywords
deep learning; global re-detection; long-term visual tracking; regional spatial attention; second-order channel attention;
DOI
10.13700/j.bh.1001-5965.2022.0645
Abstract
Current deep learning-based visual tracking algorithms struggle to track a target accurately and in real time in complex long-term monitoring scenarios involving target size changes, occlusion, and out-of-view situations. To address this problem, a fast long-term visual tracking algorithm is proposed, consisting of a fast short-term tracking algorithm and a fast global re-detection module. First, an attention module that fuses second-order channel attention with regional spatial attention is added to the base algorithm SiamRPN to form the short-term tracker. Then, to give the improved short-term tracker fast long-term tracking capability, the template-matching-based global re-detection module proposed in this paper is added; it uses a lightweight network and a fast similarity-judgment method to accelerate re-detection. The proposed algorithm is evaluated on five datasets (OTB100, LaSOT, UAV20L, VOT2018-LT, and VOT2020-LT). The experimental results demonstrate the algorithm's excellent long-term tracking performance at an average tracking speed of 104 frames per second. © 2024 Beijing University of Aeronautics and Astronautics (BUAA). All rights reserved.
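To make the control flow described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation: the short_term_tracker callable is a placeholder standing in for the improved SiamRPN tracker with fused attention, a plain coarse-stride normalized cross-correlation search stands in for the paper's lightweight template-matching re-detection network, CONF_THRESHOLD is an assumed value, and frames are assumed to be 2-D grayscale NumPy arrays. When the short-term tracker's confidence falls below the threshold (e.g. under occlusion or out-of-view), the loop falls back to a cheap global search over the frame, reflecting the abstract's emphasis on keeping re-detection fast.

import numpy as np

CONF_THRESHOLD = 0.5  # assumed failure threshold; not a value taken from the paper

def ncc_redetect(frame, template, stride=8):
    """Global re-detection by coarse-stride normalized cross-correlation.
    A simple stand-in for the paper's lightweight template-matching network."""
    th, tw = template.shape
    fh, fw = frame.shape
    t = (template - template.mean()) / (template.std() + 1e-8)
    best_score, best_box = -1.0, None
    for y in range(0, fh - th + 1, stride):      # coarse grid keeps the search fast
        for x in range(0, fw - tw + 1, stride):
            patch = frame[y:y + th, x:x + tw]
            p = (patch - patch.mean()) / (patch.std() + 1e-8)
            score = float((t * p).mean())        # NCC-style similarity in [-1, 1]
            if score > best_score:
                best_score, best_box = score, (x, y, tw, th)
    return best_box, best_score

def long_term_track(frames, init_box, short_term_tracker):
    """Long-term loop: run the short-term tracker on each frame and fall back to
    global re-detection when its confidence signals tracking failure."""
    x, y, w, h = init_box
    template = frames[0][y:y + h, x:x + w]       # target template from the first frame
    boxes = [init_box]
    for frame in frames[1:]:
        box, conf = short_term_tracker(frame, boxes[-1])
        if conf < CONF_THRESHOLD:                # likely occlusion or out-of-view
            box, conf = ncc_redetect(frame, template)
        boxes.append(box if box is not None else boxes[-1])
    return boxes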
Pages: 2391-2403
Page count: 12