Efficient transformer tracking with adaptive attention

Authors
Xiao, Dingkun [1]
Wei, Zhenzhong [1]
Zhang, Guangjun [1]
Affiliations
[1] Beihang Univ, Sch Instrumentat & Optoelect Engn, Key Lab Precis Optomechatron Technol, Minist Educ, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
computer vision; convolution; convolutional neural nets; object tracking; target tracking; tracking;
DOI
10.1049/cvi2.12315
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, several trackers utilising the Transformer architecture have shown significant performance improvements. However, the high computational cost of multi-head attention, a core component of the Transformer, has limited real-time running speed, which is crucial for tracking tasks. Additionally, the global mechanism of multi-head attention makes it susceptible to distractors that carry semantic information similar to the target's. To address these issues, the authors propose a novel adaptive attention that enhances features through a spatially sparse attention mechanism with less than 1/4 of the computational complexity of multi-head attention. The adaptive attention sets a perception range around each element in the feature map, based on the target scale in the previous tracking result, and adaptively searches for the information of interest. This allows the module to focus on the target region rather than on background distractors. Based on adaptive attention, the authors build an efficient transformer tracking framework. It performs deep interaction between search and template features to activate target information, and aggregates multi-level interaction features to enhance the representation ability. Evaluation results on seven benchmarks show that the authors' tracker achieves outstanding performance at a speed of 43 fps, with significant advantages in hard circumstances.
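The abstract does not spell out how the adaptive attention is computed, so the following is only a rough sketch of the general idea of a perception range: each feature-map element attends to a local window whose size is derived from the previous target size, instead of attending to every position. The function name `local_window_attention`, the `scale` parameter, and the rule mapping target size to window size are all assumptions made for illustration. The sketch uses a single head with no learned projections and omits the adaptive search for information of interest, so it should not be read as the authors' module.

```python
import numpy as np

def local_window_attention(feat, target_wh, scale=1.0):
    """Single-head attention restricted to a window around each element.

    feat:      (H, W, C) feature map.
    target_wh: (w, h) size of the previous target, in feature-map cells.
    scale:     hypothetical factor relating target size to window size.
    """
    feat = np.asarray(feat, dtype=np.float64)
    H, W, C = feat.shape
    # Half-extent of the perception range (assumed rule, not from the paper).
    rx = max(1, int(round(scale * target_wh[0] / 2)))
    ry = max(1, int(round(scale * target_wh[1] / 2)))
    out = np.empty_like(feat)
    for y in range(H):
        y0, y1 = max(0, y - ry), min(H, y + ry + 1)
        for x in range(W):
            x0, x1 = max(0, x - rx), min(W, x + rx + 1)
            # Keys and values are the features inside the window only.
            keys = feat[y0:y1, x0:x1].reshape(-1, C)
            logits = keys @ feat[y, x] / np.sqrt(C)
            weights = np.exp(logits - logits.max())  # numerically stable softmax
            weights /= weights.sum()
            out[y, x] = weights @ keys
    return out
```

With a window much smaller than the feature map, each element is compared against far fewer positions than in global attention, which is the intuition behind the reduced cost; the exact savings reported in the abstract refer to the authors' own design, not to this sketch.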
Pages: 1338-1350
Number of pages: 13