CATrack: Convolution and Attention Feature Fusion for Visual Object Tracking

Cited by: 0
Authors
Zhang, Longkun [1]
Wen, Jiajun [1]
Dai, Zichen [1]
Zhou, Rouyi [1]
Lai, Zhihui [1]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT IX | 2024 / Vol. 14433
Keywords
Visual object tracking; Attention learning; Feature fusion; Siamese networks
DOI
10.1007/978-981-99-8546-3_38
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In visual object tracking, information embedding and feature fusion between the target template and the search area have been active research topics over the past decades. Linear convolution is a common way to perform the correlation operation. Convolution is good at processing local information but ignores global information. By contrast, the attention mechanism is inherently suited to modeling global information. To model the local information of the target template and the global information of the search area, we propose a convolution and attention feature fusion module (CAM), so that efficient information embedding and feature fusion can be achieved in parallel. Moreover, a bi-directional information flow bridge is constructed to realize information embedding and feature fusion between the target template and the search area. Specifically, it includes a convolution-to-attention bridge module (CABM) and an attention-to-convolutional bridge module (ACBM). Finally, we present a novel tracker based on convolution and attention (CATrack), which combines the advantages of the convolution and attention operators and has an enhanced ability to position the target accurately. Comprehensive experiments have been conducted on four tracking benchmarks: LaSOT, TrackingNet, GOT-10k and UAV123. Experiments show that CATrack performs competitively against state-of-the-art trackers.
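The contrast the abstract draws between local correlation and global attention can be illustrated with a minimal sketch. This is not the authors' CAM, CABM, or ACBM: the function names `cross_correlation` and `cross_attention`, the toy inputs, and the omission of learned projections are all assumptions made for illustration only.

```python
import math

def cross_correlation(template, search):
    """Slide a 2-D template over a 2-D search map (single channel, no padding).

    Each output value depends only on the search window under the template,
    which is why this operator captures local information.
    """
    h, w = len(template), len(template[0])
    H, W = len(search), len(search[0])
    return [[sum(template[a][b] * search[i + a][j + b]
                 for a in range(h) for b in range(w))
             for j in range(W - w + 1)]
            for i in range(H - h + 1)]

def cross_attention(queries, keys):
    """Scaled dot-product attention without learned projections (assumption).

    Every query token (search area) is replaced by a softmax-weighted mix of
    all key tokens (template), so each output sees the whole template.
    """
    d = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        fused.append([sum(wt * k[c] for wt, k in zip(weights, keys)) for c in range(d)])
    return fused

# Toy data (hypothetical): a diagonal pattern hidden in a 4x4 search map.
template = [[1.0, 0.0],
            [0.0, 1.0]]
search = [[0.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0, 0.0]]
response = cross_correlation(template, search)

# Toy tokens (hypothetical): two template tokens, three search tokens, 2 channels.
template_tokens = [[1.0, 0.0], [0.0, 1.0]]
search_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(search_tokens, template_tokens)
```

In this sketch the correlation response peaks where the search window matches the template, while each attention output mixes all template tokens according to similarity. A tracker along the lines the abstract describes would run learned versions of both branches and exchange features between them.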
Pages: 469-480
Page count: 12