Evota: an enhanced visual object tracking network with attention mechanism

Authors
An Zhao
Yi Zhang
Affiliation
[1] Sichuan University, Department of Computer Science
Keywords
Attention mechanism; Visual tracking; Transformer
DOI
Not available
Abstract
The Transformer architecture has achieved breakthroughs in various downstream computer vision tasks and has shown great potential in visual object tracking. However, existing transformer-based approaches adopt a pixel-to-pixel attention strategy to integrate domain knowledge but fail to explore the channel and location information in object features, which limits the expressivity of the tracker. To address these problems, we propose a novel tracking framework, dubbed EVOTA, in which two attention blocks are fused with a Transformer. It has four modules: a feature extraction module, an enhanced attention module, a transformer module, and a model predictor. Specifically, a channel-wise attention module adaptively recalibrates channel-wise feature responses by explicitly modelling the interdependencies between channels. A local cross-channel interaction scheme learns strong channel context information. Meanwhile, an energy function is developed to analyze the importance of each neuron and infer its 3D weight. Extensive experiments on five prevalent tracking benchmarks verify the effectiveness of our model, with EVOTA outperforming several state-of-the-art methods.
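The abstract gives no formulas for the two attention blocks, so the following is only a rough NumPy sketch of the general ideas it describes, not the authors' implementation. It assumes a channel gate built from global average pooling followed by a small 1D filter over neighbouring channels (local cross-channel interaction), and a parameter-free, SimAM-style energy term for per-neuron 3D weights. The function names, the fixed `kernel` values, and the regulariser `lam` are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, kernel):
    """Re-weight channels of a (C, H, W) feature map (illustrative only).

    `kernel` is an odd-length 1D filter that mixes each channel's pooled
    response with its neighbours; in a trained model it would be learned.
    """
    squeezed = feat.mean(axis=(1, 2))          # global average pool -> (C,)
    pad = len(kernel) // 2
    padded = np.pad(squeezed, pad)             # zero-pad so output stays length C
    mixed = np.correlate(padded, kernel, mode="valid")
    gates = sigmoid(mixed)                     # one gate in (0, 1) per channel
    return feat * gates[:, None, None]


def energy_attention(feat, lam=1e-4):
    """Re-weight every neuron of a (C, H, W) map with an energy-based score.

    Assumed SimAM-style form: neurons that deviate more from their channel
    mean get a larger inverse energy and therefore a larger weight.
    """
    n = feat.shape[1] * feat.shape[2] - 1
    mu = feat.mean(axis=(1, 2), keepdims=True)
    dev = (feat - mu) ** 2
    var = dev.sum(axis=(1, 2), keepdims=True) / n
    inv_energy = dev / (4.0 * (var + lam)) + 0.5
    return feat * sigmoid(inv_energy)          # weights have full (C, H, W) shape


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 5, 5))      # toy feature map, not real data
    kernel = np.array([0.25, 0.5, 0.25])       # hypothetical fixed filter
    enhanced = energy_attention(channel_attention(feat, kernel))
    print(enhanced.shape)
```

Applying the channel gate first and the energy weights second is an arbitrary ordering chosen here; the abstract does not say how the two blocks are combined or where they sit relative to the transformer module.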
Pages: 24939-24960
Page count: 21