EANTrack: An Efficient Attention Network for Visual Tracking

Cited by: 27
Authors
Gu, Fengwei [1 ,2 ]
Lu, Jun [1 ,2 ]
Cai, Chengtao [1 ,2 ]
Zhu, Qidan [1 ,2 ]
Ju, Zhaojie [3 ]
Affiliations
[1] Harbin Engineering University, Key Laboratory of Intelligent Technology and Application of Marine Equipment, Ministry of Education, Harbin 150001, China
[2] Harbin Engineering University, College of Intelligent Systems Science and Engineering, Harbin 150001, China
[3] University of Portsmouth, School of Computing, Portsmouth PO1 3HE, England
Funding
Natural Science Foundation of Heilongjiang Province; National Natural Science Foundation of China
Keywords
Video sequences; visual tracking; challenging scenarios; target features; attention network; object tracking; correlation filters
DOI
10.1109/TASE.2023.3319676
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812 (Computer Science and Technology)
Abstract
Recently, Siamese trackers have gained widespread attention in visual tracking due to their exceptional performance. However, many trackers still suffer from limitations in challenging scenarios such as fast motion and scale variation, which hinder the full exploitation of target features and thus limit tracking accuracy and efficiency. Therefore, this paper proposes an efficient attention network, called EAN, to improve tracking performance. The EAN comprises three primary components: a Transformer-s subnetwork, a Transformer-t subnetwork, and a Feature-Fused Attention Module (FFAM). The Transformer-s and Transformer-t subnetworks adopt complementary structures and functions to fully integrate and emphasize the relevant feature information, including channel and spatial features. The FFAM fuses the multi-level features from both subnetworks, establishing global dependencies between the templates and search regions and enhancing the discriminative power of the model. To further improve tracking accuracy, a novel Feature-Aware Attention Module (FAAM) is introduced into the tracking prediction head to strengthen the feature representation capability of the model. Finally, we propose an efficient EAN-based tracker, EANTrack, for robust tracking in complex scenarios, which exhibits significant advantages on challenging attributes. Experimental results on multiple benchmarks indicate that our approach achieves remarkable tracking performance at a real-time speed of 55.6 FPS.
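
The abstract describes the FFAM as fusing features from the two subnetworks while establishing global dependencies between templates and search regions. As a rough illustrative sketch of that general idea only (not the authors' implementation; the class name, dimensions, and layer layout below are assumptions), a cross-attention fusion step in PyTorch might look like this:

import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Hypothetical fusion block: search-region tokens attend to template
    tokens, a standard Transformer-style cross-attention pattern."""

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.ReLU(inplace=True),
            nn.Linear(4 * dim, dim),
        )
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, search: torch.Tensor, template: torch.Tensor) -> torch.Tensor:
        # search: (B, N_s, C) flattened search-region features
        # template: (B, N_t, C) flattened template features
        attn_out, _ = self.cross_attn(query=search, key=template, value=template)
        x = self.norm1(search + attn_out)   # residual connection + norm
        x = self.norm2(x + self.ffn(x))     # position-wise feed-forward
        return x                            # search features fused with template cues

# Example: fuse a 20x20 search map with an 8x8 template map (channels = 256).
fusion = CrossAttentionFusion(dim=256, num_heads=8)
search_tokens = torch.randn(2, 400, 256)
template_tokens = torch.randn(2, 64, 256)
fused = fusion(search_tokens, template_tokens)  # shape: (2, 400, 256)

This mirrors the common template-to-search attention pattern in Transformer trackers; the paper's actual FFAM and FAAM designs may differ in structure and detail.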
Pages: 5911 - 5928
Number of pages: 18