EANTrack: An Efficient Attention Network for Visual Tracking

Cited by: 27
Authors
Gu, Fengwei [1 ,2 ]
Lu, Jun [1 ,2 ]
Cai, Chengtao [1 ,2 ]
Zhu, Qidan [1 ,2 ]
Ju, Zhaojie [3 ]
Affiliations
[1] Harbin Engineering University, Key Laboratory of Intelligent Technology and Application of Marine Equipment, Ministry of Education, Harbin 150001, China
[2] Harbin Engineering University, College of Intelligent Systems Science and Engineering, Harbin 150001, China
[3] University of Portsmouth, School of Computing, Portsmouth PO1 3HE, England
Funding
Natural Science Foundation of Heilongjiang Province; National Natural Science Foundation of China
Keywords
Video sequences; visual tracking; challenging scenarios; target features; attention network; object tracking; correlation filters
DOI
10.1109/TASE.2023.3319676
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812 (Computer Science and Technology)
Abstract
Recently, Siamese trackers have gained widespread attention in visual tracking due to their exceptional performance. However, many trackers still suffer from limitations in challenging scenarios such as fast motion and scale variation, which hinder the full exploitation of target features and thus limit tracking accuracy and efficiency. Therefore, this paper proposes an efficient attention network, called EAN, to improve tracking performance. The EAN comprises three primary components: a Transformer-s subnetwork, a Transformer-t subnetwork, and a Feature-Fused Attention Module (FFAM). The Transformer-s and Transformer-t subnetworks adopt complementary structures and functions to fully integrate and emphasize the relevant feature information, including channel and spatial features. The FFAM fuses the multi-level features from both subnetworks, establishing global dependencies between the templates and search regions and enhancing the discriminative power of the model. To further improve tracking accuracy, a novel Feature-Aware Attention Module (FAAM) is introduced into the tracking prediction head to strengthen the feature representation capability of the model. Finally, we propose an efficient EAN-based tracker, EANTrack, for robust tracking in complex scenarios, which exhibits significant advantages on challenging attributes. Experimental results on multiple benchmarks indicate that our approach achieves remarkable tracking performance at a real-time speed of 55.6 FPS.
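
The abstract describes the FFAM as fusing features from the two subnetworks while establishing global dependencies between templates and search regions. As a rough illustrative sketch of that general idea only (not the authors' implementation; the class name, dimensions, and layer layout below are assumptions), a cross-attention fusion step in PyTorch might look like this:

import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Hypothetical fusion block: search-region tokens attend to template
    tokens, a standard Transformer-style cross-attention pattern."""

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.ReLU(inplace=True),
            nn.Linear(4 * dim, dim),
        )
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, search: torch.Tensor, template: torch.Tensor) -> torch.Tensor:
        # search: (B, N_s, C) flattened search-region features
        # template: (B, N_t, C) flattened template features
        attn_out, _ = self.cross_attn(query=search, key=template, value=template)
        x = self.norm1(search + attn_out)   # residual connection + norm
        x = self.norm2(x + self.ffn(x))     # position-wise feed-forward
        return x                            # search features fused with template cues

# Example: fuse a 20x20 search map with an 8x8 template map (channels = 256).
fusion = CrossAttentionFusion(dim=256, num_heads=8)
search_tokens = torch.randn(2, 400, 256)
template_tokens = torch.randn(2, 64, 256)
fused = fusion(search_tokens, template_tokens)  # shape: (2, 400, 256)

This mirrors the common template-to-search attention pattern in Transformer trackers; the paper's actual FFAM and FAAM designs may differ in structure and detail.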
Pages: 5911 - 5928
Number of pages: 18