HIPTrack: Visual Tracking with Historical Prompts

Cited by: 19

Authors:

Cai, Wenrui ^{[1]}

Liu, Qingjie ^{[1,2,3]}

Wang, Yunhong ^{[1,3]}

Affiliations:

[1] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing, Peoples R China

[2] Zhongguancun Lab, Beijing, Peoples R China

[3] Beihang Univ, Hangzhou Innovat Inst, Hangzhou, Peoples R China

Source:

2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

DOI:

10.1109/CVPR52733.2024.01822

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Trackers that follow the Siamese paradigm perform tracking by similarity matching between template features and search-region features. Many methods have sought to improve tracking performance by incorporating tracking history, so as to better handle target appearance variations such as deformation and occlusion. However, existing methods use historical information in an insufficient and incomplete way, and they typically require repeated training and introduce a large amount of computation. In this paper, we show that providing a tracker that follows the Siamese paradigm with precise and updated historical information yields a significant performance improvement with completely unchanged parameters. Based on this, we propose a historical prompt network that uses refined historical foreground masks and historical visual features of the target to provide comprehensive and precise prompts for the tracker. We build a novel tracker called HIPTrack based on the historical prompt network, which achieves considerable performance improvements without the need to retrain the entire model. We conduct experiments on seven datasets, and the results demonstrate that our method surpasses the current state-of-the-art trackers on LaSOT, LaSOT_{ext}, GOT-10k and NfS. Furthermore, the historical prompt network can be integrated seamlessly as a plug-and-play module into existing trackers, improving their performance. The source code is available at https://github.com/WenRuiCai/HIPTrack.

Citation

Pages: 19258-19267

Page count: 10

