CTIFTrack: Continuous Temporal Information Fusion for object track

Cited by: 1

Authors:

Zhang, Zhiguo [1]

Guo, Zhiqing [1]

Wang, Liejun [1]

Li, Yongming [1]

Affiliations:

[1] Xinjiang Univ, Sch Comp Sci & Technol, Huarui Rd, Urumqi 830017, Xinjiang, Peoples R China

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2025 / Vol. 262

Keywords:

Object tracking; Temporal information fusion; Multi-level feature extraction; Feature refinement

DOI:

10.1016/j.eswa.2024.125654

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In visual tracking tasks, researchers usually focus on increasing the complexity of the model, or attend only discretely to changes in the object itself, to achieve accurate recognition and tracking of the moving object. However, they often overlook the significant contribution of video-level linear temporal information fusion and continuous spatiotemporal mapping to tracking tasks. This oversight may lead to poor tracking performance or insufficient real-time ability of the model in complex scenes. Therefore, this paper proposes a real-time tracker, the Continuous Temporal Information Fusion Tracker (CTIFTrack). The key to CTIFTrack is its Temporal Information Fusion (TIF) module, which performs a linear fusion of the temporal information between the (t-1)-th and t-th frames and completes the spatiotemporal mapping. This enables the tracker to better understand the overall spatiotemporal information and contextual spatiotemporal correlations within the video, thereby having a positive impact on the tracking task. In addition, this paper proposes the Object Template Feature Refinement (OTFR) module, which captures both the global information and the local details of the object and further improves the tracker's understanding of the object features. Extensive experiments are conducted on seven benchmarks: LaSOT, GOT-10K, UAV123, NFS, TrackingNet, VOT2018 and OTB-100. The experimental results validate the significant contribution of the TIF and OTFR modules to the tracking task, as well as the effectiveness of CTIFTrack. Notably, while maintaining excellent tracking performance, CTIFTrack also achieves real-time tracking speed: on an Nvidia Tesla T4-16GB GPU, CTIFTrack reaches 71.98 FPS. The code and demo materials will be available at https://github.com/vpsg-research/CTIFTrack.

Pages: 13
