RPformer: A Robust Parallel Transformer for Visual Tracking in Complex Scenes

Cited by: 44
|
Authors
Gu, Fengwei [1 ,2 ]
Lu, Jun [1 ,2 ]
Cai, Chengtao [1 ,2 ]
Affiliations
[1] Harbin Engn Univ, Coll Intelligent Syst Sci & Engn, Harbin 150001, Peoples R China
[2] Harbin Engn Univ, Minist Educ, Key Lab Intelligent Technol & Applicat Marine Equ, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China; Natural Science Foundation of Heilongjiang Province;
Keywords
Target tracking; Transformers; Correlation; Visualization; Feature extraction; Information filters; Kernel; Attention mechanism; complex scenes; feature fusion head (FFH); parallel Transformer network; visual tracking; NETWORK;
DOI
10.1109/TIM.2022.3170972
CLC Number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Subject Classification Codes
0808; 0809
Abstract
The Siamese architecture has shown remarkable performance in the field of visual tracking. Although existing Siamese-based tracking methods have achieved a relative balance between accuracy and speed, the performance of many trackers in complex scenes is often unsatisfactory, mainly due to interference factors such as target scale changes, occlusion, and fast movement. In these cases, many trackers cannot sufficiently exploit the target feature information and suffer from information loss. In this work, we propose a novel parallel Transformer network architecture to achieve robust visual tracking. The proposed method introduces the Transformer-1 module, the Transformer-2 module, and the feature fusion head (FFH), all based on the attention mechanism. The Transformer-1 module and the Transformer-2 module serve as complementary branches in the parallel architecture. The FFH integrates the feature information of the two parallel branches, efficiently exploiting the feature dependence between the template and the search region and comprehensively mining rich contextual information. Finally, by combining the core ideas of Siamese and Transformer, we present a simple and robust tracking framework called RPformer, which does not require any prior knowledge and avoids the trouble of adjusting hyperparameters. Extensive experiments on seven tracking benchmarks show that the proposed method outperforms state-of-the-art trackers while meeting real-time requirements at a running speed exceeding 50.0 frames/s.
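The parallel-branch idea in the abstract can be illustrated with a minimal numpy sketch: two branches (each reduced here to plain self-attention) process the template and the search region, and a fusion head cross-attends from search tokens to template tokens. All function names, shapes, and the single-attention-layer simplification are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: the core operation shared by
    # both branches and the fusion head in this sketch.
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d)) @ v

def branch(x):
    # Stand-in for one Transformer branch: bare self-attention.
    # (A real branch would add projections, FFN, residuals, norms.)
    return attention(x, x, x)

def feature_fusion_head(template_feat, search_feat):
    # FFH sketch: each search-region token aggregates template
    # information via cross-attention, modelling their dependence.
    return attention(search_feat, template_feat, template_feat)

rng = np.random.default_rng(0)
template = rng.standard_normal((16, 64))  # template tokens
search = rng.standard_normal((64, 64))    # search-region tokens

# Two parallel, complementary branches.
z1 = branch(template)  # Transformer-1 path
z2 = branch(search)    # Transformer-2 path
fused = feature_fusion_head(z1, z2)
print(fused.shape)  # (64, 64): one fused vector per search token
```

In the actual tracker, the fused features would then feed classification and regression heads to localize the target in the search region.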
Pages: 14
Related Papers
50 records in total
  • [1] A Robust Visual Tracking Method Based on Reconstruction Patch Transformer Tracking
    Chen, Hui
    Wang, Zhenhai
    Tian, Hongyu
    Yuan, Lutao
    Wang, Xing
    Leng, Peng
    SENSORS, 2022, 22 (17)
  • [2] A Fusion Approach for Robust Visual Object Tracking in Crowd Scenes
    Oh, Tae-Hyun
    Joo, Kyungdon
    Kim, Junsik
    Park, Jaesik
    Kweon, In So
    2014 11TH INTERNATIONAL CONFERENCE ON UBIQUITOUS ROBOTS AND AMBIENT INTELLIGENCE (URAI), 2014, : 558 - 560
  • [3] Robust Detection and Tracking Algorithm of Multiple Objects in Complex Scenes
    Hu, Hong-Yu
    Qu, Zhao-Wei
    Li, Zhi-Hui
    Wang, Qing-Nian
    APPLIED MATHEMATICS & INFORMATION SCIENCES, 2014, 8 (05): : 2485 - 2490
  • [4] Robust Object Tracking Algorithm for Autonomous Vehicles in Complex Scenes
    Cao, Jingwei
    Song, Chuanxue
    Song, Shixin
    Xiao, Feng
    Zhang, Xu
    Liu, Zhiyang
    Ang, Marcelo H., Jr.
    REMOTE SENSING, 2021, 13 (16)
  • [5] Propagating prior information with transformer for robust visual object tracking
    Wu, Yue
    Cai, Chengtao
    Yeo, Chai Kiat
    MULTIMEDIA SYSTEMS, 2024, 30 (05)
  • [6] A robust attention-enhanced network with transformer for visual tracking
    Gu, Fengwei
    Lu, Jun
    Cai, Chengtao
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (26) : 40761 - 40782
  • [7] Robust Visual Tracking based on Deep Spatial Transformer Features
    Zhang, Ximing
    Wang, Mingang
    Wei, Jinkang
    Cui, Can
    PROCEEDINGS OF THE 30TH CHINESE CONTROL AND DECISION CONFERENCE (2018 CCDC), 2018, : 5036 - 5041
  • [8] RTSformer: A Robust Toroidal Transformer With Spatiotemporal Features for Visual Tracking
    Gu, Fengwei
    Lu, Jun
    Cai, Chengtao
    Zhu, Qidan
    Ju, Zhaojie
    IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, 2024, 54 (02) : 214 - 225
  • [9] Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking
    Wang, Ning
    Zhou, Wengang
    Wang, Jie
    Li, Houqiang
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 1571 - 1580