Multi-granularity Feature Fusion for Transformer-Based Single Object Tracking

Cited by: 0

Authors:

Wang, Ziye [1]

Miao, Duoqian [1]

Affiliations:

[1] Tongji Univ, Dept Comp Sci & Technol, 4800 Caoan Highway, Shanghai 201804, Peoples R China

Source:

ROUGH SETS, IJCRS 2023 | 2023 / Vol. 14481

Funding:

National Natural Science Foundation of China; National Science Foundation (United States);

Keywords:

Computer vision; Single object tracking; Multi granularity; Rough set; Transformer; VISUAL TRACKING;

DOI:

10.1007/978-3-031-50959-9_22

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The recently developed transformer has been widely explored in computer vision and has notably improved the performance of single object tracking. However, most current efforts concentrate on combining and enhancing features generated by convolutional neural networks (CNNs) and cannot fully exploit the potential of the transformer. Motivated by this, we introduce multi-granularity theory into a pure transformer-based single object tracker and design a multi-granularity feature fusion module. To fuse features of different granularities and enhance the feature representation, we design a double-branch transformer feature extractor and use a cross-attention mechanism to fuse the features. Extensive experiments on multiple tracking benchmarks, including OTB2015, VOT2020, TrackingNet, GOT-10k, and LaSOT, demonstrate that our proposed tracker, named MGTT, achieves better performance than multiple state-of-the-art trackers.

Citation

Pages: 311-323

Page count: 13

