HIPTrack: Visual Tracking with Historical Prompts

Cited by: 22
Authors
Cai, Wenrui [1]
Liu, Qingjie [1, 2, 3]
Wang, Yunhong [1, 3]
Affiliations
[1] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing, Peoples R China
[2] Zhongguancun Lab, Beijing, Peoples R China
[3] Beihang Univ, Hangzhou Innovat Inst, Hangzhou, Peoples R China
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2024
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1109/CVPR52733.2024.01822
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Trackers that follow the Siamese paradigm track a target by similarity matching between template and search-region features. Many methods incorporate tracking history to better handle target appearance variations such as deformation and occlusion. However, existing methods use historical information insufficiently and incompletely, and they typically require repetitive training and introduce a large amount of computation. In this paper, we show that providing a tracker that follows the Siamese paradigm with precise and updated historical information yields a significant performance improvement with completely unchanged parameters. Based on this, we propose a historical prompt network that uses refined historical foreground masks and historical visual features of the target to provide comprehensive and precise prompts for the tracker. We build a novel tracker called HIPTrack based on the historical prompt network, which achieves considerable performance improvements without the need to retrain the entire model. We conduct experiments on seven datasets, and the results demonstrate that our method surpasses the current state-of-the-art trackers on LaSOT, LaSOText, GOT-10k and NfS. Furthermore, the historical prompt network can be seamlessly integrated into existing trackers as a plug-and-play module, providing performance enhancements. The source code is available at https://github.com/WenRuiCai/HIPTrack.
Pages: 19258-19267
Page count: 10