CATrack: Convolution and Attention Feature Fusion for Visual Object Tracking

Cited by: 0

Authors:

Zhang, Longkun [1]

Wen, Jiajun [1]

Dai, Zichen [1]

Zhou, Rouyi [1]

Lai, Zhihui [1]

Affiliation:

[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China

Source:

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT IX | 2024 / Vol. 14433

Keywords:

Visual object tracking; Attention learning; Feature fusion; Siamese networks

DOI:

10.1007/978-981-99-8546-3_38

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In visual object tracking, information embedding and feature fusion between the target template and the search area have been active research topics over the past decades. Linear convolution is a common way to perform the correlation operation. Convolution is good at processing local information but ignores global information. By contrast, the attention mechanism is inherently suited to modeling global information. To model the local information of the target template and the global information of the search area, we propose a convolution and attention feature fusion module (CAM), so that efficient information embedding and feature fusion can be achieved in parallel. Moreover, a bi-directional information flow bridge is constructed to realize information embedding and feature fusion between the target template and the search area. Specifically, it includes a convolution-to-attention bridge module (CABM) and an attention-to-convolutional bridge module (ACBM). Finally, we present a novel tracker based on convolution and attention (CATrack), which combines the advantages of the convolution and attention operators and has an enhanced ability to position the target accurately. Comprehensive experiments have been conducted on four tracking benchmarks: LaSOT, TrackingNet, GOT-10k and UAV123. Experiments show that CATrack is competitive with state-of-the-art trackers.
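
The contrast the abstract draws between the two operators can be illustrated with a toy sketch. This is not the CAM, CABM, or ACBM from the paper, whose details this record does not give. It only assumes that features are short sequences of token vectors, and the names `cross_correlation` and `cross_attention` are hypothetical. Sliding correlation scores each local window of the search sequence against the template, while cross-attention lets every search token weigh all template tokens at once.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_correlation(search, template):
    """Slide the template over the search tokens: one score per valid offset (local matching)."""
    n, m = len(search), len(template)
    return [sum(dot(search[i + j], template[j]) for j in range(m))
            for i in range(n - m + 1)]

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(search, template):
    """Each search token attends to every template token (global matching).

    Assumption: identity query/key/value projections, single head.
    """
    scale = math.sqrt(len(template[0]))
    fused = []
    for q in search:
        weights = softmax([dot(q, k) / scale for k in template])
        fused.append([sum(w * v[d] for w, v in zip(weights, template))
                      for d in range(len(q))])
    return fused

# Toy features: the template pattern appears in the search sequence at offset 1.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

response = cross_correlation(search, template)  # one score per window offset
fused = cross_attention(search, template)       # one fused vector per search token
best_offset = response.index(max(response))     # offset of the strongest local match
```

The correlation response has one entry per window position and peaks where the template aligns with the search sequence, whereas the attention output keeps one fused vector per search token regardless of position.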


Pages: 469-480

Page count: 12
