Sparse Transformer-Based Sequence Generation for Visual Object Tracking

Cited: 0
Authors
Tian, Dan [1]
Liu, Dong-Xin [2]
Wang, Xiao [2]
Hao, Ying [2]
Affiliations
[1] Shenyang Univ, Sch Intelligent Syst Sci & Engn, Shenyang 110044, Liaoning, Peoples R China
[2] Shenyang Univ, Sch Informat Engn, Shenyang 110044, Liaoning, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Transformers; Visualization; Target tracking; Decoding; Feature extraction; Attention mechanisms; Object tracking; Training; Interference; Attention mechanism; sequence generation; sparse attention; visual object tracking; vision transformer;
DOI
10.1109/ACCESS.2024.3482468
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
In visual object tracking, attention mechanisms can flexibly and efficiently model complex dependencies and global information, which improves tracking accuracy. However, in scenes dominated by background or other distracting information, global attention can dilute the weight of important information and allocate unnecessary attention to the background, reducing tracking performance. To alleviate this problem, this paper proposes a visual object tracking framework based on a sparse transformer. The framework is a simple encoder-decoder structure that predicts the target in an autoregressive manner, eliminating the additional head network and simplifying the tracking architecture. Furthermore, we introduce a Sparse Attention Mechanism (SMA) in the cross-attention layer of the decoder. Unlike traditional attention, SMA attends only to the top-K pixel values most relevant to the current pixel when computing attention weights. This lets the model concentrate on key information and better discriminate foreground from background, yielding more accurate and robust tracking. We evaluate our method on six tracking benchmarks, and the experimental results demonstrate its effectiveness.
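The top-K sparsification described in the abstract can be sketched in a few lines. The snippet below is an illustrative example, not the authors' implementation: it computes scaled dot-product cross-attention in which, for each decoder query, only the K largest attention scores are kept and the rest are masked out before the softmax. The function name, tensor shapes, and the value top_k=32 are assumptions made for illustration.

```python
# Minimal sketch (assumed, not the authors' code): top-K sparse cross-attention.
# For each query, only the K largest scores survive the softmax, so attention
# concentrates on the most relevant encoder (template/search) pixels.
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=32):
    """q: (B, Lq, D) decoder queries; k, v: (B, Lk, D) encoder memory."""
    d = q.size(-1)
    scores = torch.matmul(q, k.transpose(-2, -1)) / d ** 0.5   # (B, Lq, Lk)

    # Sparsification: find the K-th largest score per query and mask
    # everything below it to -inf so softmax assigns it zero weight.
    top_k = min(top_k, scores.size(-1))
    kth = scores.topk(top_k, dim=-1).values[..., -1:]          # K-th largest per query
    scores = scores.masked_fill(scores < kth, float('-inf'))

    weights = F.softmax(scores, dim=-1)                        # sparse attention weights
    return torch.matmul(weights, v)                            # (B, Lq, D)

# Example: 4 target-coordinate query tokens attending to a 16x16 = 256-token
# encoder memory, keeping only 32 keys per query (all sizes are illustrative).
q = torch.randn(1, 4, 64)
mem = torch.randn(1, 256, 64)
out = topk_sparse_attention(q, mem, mem, top_k=32)
print(out.shape)  # torch.Size([1, 4, 64])
```

Masking scores below the K-th largest value before the softmax leaves exactly K (or, with ties, a few more) nonzero weights per query, which is what shifts attention away from background pixels and onto the most relevant ones.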
Pages: 154418 - 154425
Page count: 8
Related Papers
50 records in total
  • [41] Wei, Haoran; Fu, Yanyun; Wang, Deyong; Guo, Rui; Zhao, Xueyi; Fang, Jian. Unsupervised Nighttime Object Tracking Based on Transformer and Domain Adaptation Fusion Network. IEEE ACCESS, 2024, 12: 130896-130913.
  • [42] Pei, Jialun; Cheng, Tianyang; Tang, He; Chen, Chuanbo. Transformer-Based Efficient Salient Instance Segmentation Networks With Orientative Query. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25: 1964-1978.
  • [43] Li, Zheng; Wu, Yonghao; Peng, Bin; Chen, Xiang; Sun, Zeyu; Liu, Yong; Paul, Doyle. SeTransformer: A Transformer-Based Code Semantic Parser for Code Comment Generation. IEEE TRANSACTIONS ON RELIABILITY, 2023, 72 (01): 258-273.
  • [44] Chen, Yucheng; Wang, Lin. eMoE-Tracker: Environmental MoE-Based Transformer for Robust Event-Guided Object Tracking. IEEE ROBOTICS AND AUTOMATION LETTERS, 2025, 10 (02): 1393-1400.
  • [45] Wang, Ershen; Wang, Donglei; Huang, Yufeng; Tong, Gang; Xu, Song; Pang, Tao. Siamese Attentional Cascade Keypoints Network for Visual Object Tracking. IEEE ACCESS, 2021, 9: 7243-7254.
  • [46] Javed, Sajid; Mahmood, Arif; Ullah, Ihsan; Bouwmans, Thierry; Khonji, Majid; Dias, Jorge Manuel Miranda; Werghi, Naoufel. A Novel Algorithm Based on a Common Subspace Fusion for Visual Object Tracking. IEEE ACCESS, 2022, 10: 24690-24703.
  • [47] Sun, Ziwen; Yang, Chuandong; Ling, Chong. A Lightweight Object Tracking Method Based on Transformer. PROCEEDINGS OF 2023 7TH INTERNATIONAL CONFERENCE ON ELECTRONIC INFORMATION TECHNOLOGY AND COMPUTER ENGINEERING, EITCE 2023, 2023: 796-801.
  • [48] Batool, Humaira; Mukhtar, Asmat; Gul Khawaja, Sajid; Alghamdi, Norah Saleh; Mansoor Khan, Asad; Qayyum, Adil; Adil, Ruqqayia; Khan, Zawar; Usman Akram, Muhammad; Usman Akbar, Muhammad; Eklund, Anders. Knowledge Distillation and Transformer-Based Framework for Automatic Spine CT Report Generation. IEEE ACCESS, 2025, 13: 42949-42964.
  • [49] Qiu, Yu; Liu, Yun; Zhang, Le; Lu, Haotian; Xu, Jing. Boosting Salient Object Detection With Transformer-Based Asymmetric Bilateral U-Net. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (04): 2332-2345.
  • [50] Soto, Mauricio; Regazzoni, Carlo S. A General Bayesian Algorithm for Visual Object Tracking Based on Sparse Features. 2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011: 1181-1184.