Efficient transformer tracking with adaptive attention

Cited by: 0

Authors:

Xiao, Dingkun [1]

Wei, Zhenzhong [1]

Zhang, Guangjun [1]

Affiliations:

[1] Beihang Univ, Sch Instrumentat & Optoelect Engn, Key Lab Precis Optomechatron Technol, Minist Educ, Beijing, Peoples R China

Source:

IET COMPUTER VISION | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

computer vision; convolution; convolutional neural nets; object tracking; target tracking; tracking;

DOI:

10.1049/cvi2.12315

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recently, several trackers utilising the Transformer architecture have shown significant performance improvements. However, the high computational cost of multi-head attention, a core component of the Transformer, has limited real-time running speed, which is crucial for tracking tasks. Additionally, the global mechanism of multi-head attention makes it susceptible to distractors with semantic information similar to that of the target. To address these issues, the authors propose a novel adaptive attention that enhances features through a spatially sparse attention mechanism with less than 1/4 of the computational complexity of multi-head attention. The adaptive attention sets a perception range around each element in the feature map based on the target scale in the previous tracking result and adaptively searches for the information of interest. This allows the module to focus on the target region rather than background distractors. Based on adaptive attention, the authors build an efficient transformer tracking framework. It performs deep interaction between search and template features to activate target information and aggregates multi-level interaction features to enhance the representation ability. The evaluation results on seven benchmarks show that the authors' tracker achieves outstanding performance at a speed of 43 fps, with significant advantages in hard circumstances.
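
Illustrative note: the idea of restricting each feature-map element to a perception range can be sketched as plain windowed attention. This is not the authors' implementation. The abstract does not say how the range is derived from the target scale or how the adaptive search works, so the function names, the `stride` and `scale` parameters, and the square-window rule below are all assumptions made for illustration only.

```python
import math


def perception_radius(target_w, target_h, stride=16, scale=0.5):
    """Hypothetical rule (an assumption, not from the paper): turn the
    previous target size in pixels into a half-window in feature cells."""
    cells = max(target_w, target_h) / stride
    return max(1, math.ceil(scale * cells))


def local_attention(feat, radius):
    """Single-head dot-product attention in which every position attends
    only to positions inside a square window of half-width `radius`.

    feat is an H x W grid of equal-length feature vectors (lists of floats).
    Returns the attended grid and the number of query-key pairs scored.
    """
    height, width = len(feat), len(feat[0])
    dim = len(feat[0][0])
    out = [[None] * width for _ in range(height)]
    pairs = 0
    for y in range(height):
        for x in range(width):
            query = feat[y][x]
            # Keys and values: neighbours inside the window, clipped at borders.
            neigh = [
                feat[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            scores = [
                sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
                for key in neigh
            ]
            # Numerically stable softmax over the window only.
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            out[y][x] = [
                sum(w * v[d] for w, v in zip(weights, neigh)) / total
                for d in range(dim)
            ]
            pairs += len(neigh)
    return out, pairs
```

On an 8 x 8 feature map with a radius of 1, this sketch scores 484 query-key pairs, against 4096 for global attention over the same 64 positions. The ratio is a property of this toy window only and should not be read as a reproduction of the complexity figure reported in the abstract.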

Pages: 1338-1350

Number of pages: 13
