Online Multi-Scale Classification and Global Feature Modulation for Robust Visual Tracking

Cited by: 3
Authors
Gao, Qi [1]
Yin, Mingfeng [2]
Wu, Xiang [3]
Liu, Di [4]
Bo, Yuming [3]
Affiliations
[1] Jiangsu Univ Technol, Coll Mech Engn, Changzhou 213001, Peoples R China
[2] Jiangsu Univ Technol, Sch Automobile & Traff Engn, Changzhou 213001, Peoples R China
[3] Nanjing Univ Sci & Technol, Sch Automat, Nanjing 210094, Peoples R China
[4] Nanjing Inst Technol, Sch Automat, Nanjing 211167, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Visualization; Target tracking; Accuracy; Fuses; Modulation; Transformers; Real-time systems; Visual object tracking; coordinate attention; online multi-scale classification; global feature modulation; OBJECT TRACKING;
DOI
10.1109/TCSVT.2023.3343949
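Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract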
Recent advanced trackers, which combine discriminative classification with dedicated bounding box estimation, have achieved remarkable gains in visual object tracking performance. However, existing methods cannot satisfy the demands of tracking in complex scenes that involve challenges such as occlusion and scale variation. To this end, we propose a novel online multi-scale classification and global feature modulation method for robust visual tracking, named ATOM+, which builds on accurate tracking by overlap maximization (ATOM). First, coordinate attention (CA) is applied to enhance the target features in the channel and spatial dimensions, which strengthens the feature representation ability of the backbone network. Second, an online multi-scale classification (OMC) module is designed. During the online tracking phase, it generates more reliable matching responses by aggregating information from different scales related to the target. This operation enables the tracker to perceive the target stably, particularly when the target's appearance and posture change severely. Third, a global feature modulation (GFM) mechanism, which requires only a small amount of computational resources, is constructed to fuse the spatial contextual information of the template image into the search region. This integration refines the bounding box to obtain an accurate estimate of the target state. Finally, comprehensive experiments on the conventional tracking benchmarks OTB100, LaSOT, and VOT2018 show that our tracker handles a range of challenging scenarios and achieves state-of-the-art performance. Our tracker runs in real time at an average speed of 37 FPS.
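As a rough illustration of the coordinate attention step mentioned in the abstract, the following is a minimal pure-Python sketch of the general coordinate attention idea: pool a feature map along each spatial axis, pass the pooled descriptors through a shared channel-reduction layer, and gate the input with per-row and per-column weights. It is not the ATOM+ implementation. The function names, the weight matrices `w_reduce`, `w_h`, and `w_w`, and the use of a plain ReLU with no normalization layer are all assumptions made for this example.

```python
import math


def sigmoid(v):
    """Logistic function mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def pointwise_linear(weight, cols):
    """Apply a (out x in) weight matrix at every position of an (in x L) array.

    This plays the role of a 1x1 convolution over a 1-D sequence.
    """
    length = len(cols[0])
    return [
        [sum(row[k] * cols[k][p] for k in range(len(cols))) for p in range(length)]
        for row in weight
    ]


def coordinate_attention(x, w_reduce, w_h, w_w):
    """Simplified coordinate attention on a feature map x of shape C x H x W.

    Hypothetical parameters (illustration only):
      w_reduce: M x C shared channel-reduction weights
      w_h, w_w: C x M weights producing the height and width gates
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])

    # Directional average pooling: one descriptor per row and per column.
    pooled_h = [[sum(x[ch][i]) / w for i in range(h)] for ch in range(c)]
    pooled_w = [
        [sum(x[ch][i][j] for i in range(h)) / h for j in range(w)]
        for ch in range(c)
    ]

    # Concatenate along the spatial axis and apply the shared reduction + ReLU.
    joint = [ph + pw for ph, pw in zip(pooled_h, pooled_w)]  # C x (H + W)
    mid = [[max(v, 0.0) for v in row] for row in pointwise_linear(w_reduce, joint)]

    # Split back into the two directions and compute sigmoid gates.
    mid_h = [row[:h] for row in mid]
    mid_w = [row[h:] for row in mid]
    a_h = [[sigmoid(v) for v in row] for row in pointwise_linear(w_h, mid_h)]
    a_w = [[sigmoid(v) for v in row] for row in pointwise_linear(w_w, mid_w)]

    # Re-weight every location by its row gate and its column gate.
    return [
        [[x[ch][i][j] * a_h[ch][i] * a_w[ch][j] for j in range(w)] for i in range(h)]
        for ch in range(c)
    ]
```

Because each gate lies strictly between 0 and 1, the output keeps the shape of the input while scaling down every location according to its row and column position. In a trained network the three weight matrices would be learned jointly with the backbone.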
Pages: 5321-5334
Number of pages: 14
Related papers
50 in total
  • [1] Robust visual tracking via identifying multi-scale patches
    Liang, Yun
    Li, Ke
    Zhang, Jian
    Wang, Meihua
    Lin, Chen
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (11) : 14195 - 14230
  • [2] Robust Visual Tracking Using Multi-Frame Multi-Feature Joint Modeling
    Zhang, Peng
    Yu, Shujian
    Xu, Jiamiao
    You, Xinge
    Jiang, Xiubao
    Jing, Xiao-Yuan
    Tao, Dacheng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (12) : 3673 - 3686
  • [3] MSTrack: Visual Tracking with Multi-scale Attention
    Song, Chunlin
    Yao, Yu
    Guo, Jianhui
    Li, Lunbo
    PROCEEDINGS OF 2024 INTERNATIONAL CONFERENCE ON COMPUTER AND MULTIMEDIA TECHNOLOGY, ICCMT 2024, 2024, : 337 - 344
  • [4] Multi-Scale Feature Fusion and Distribution Similarity Network for Few-Shot Automatic Modulation Classification
    Tan, Haoyue
    Zhang, Zhenxi
    Li, Yu
    Shi, Xiaoran
    Zhou, Feng
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 2890 - 2894
  • [5] Robust Visual Tracking via Multi-Scale Spatio-Temporal Context Learning
    Xue, Wanli
    Xu, Chao
    Feng, Zhiyong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2018, 28 (10) : 2849 - 2860
  • [6] Using Segmentation With Multi-Scale Selective Kernel for Visual Object Tracking
    Bao, Feng
    Cao, Yifei
    Zhang, Shunli
    Lin, Beibei
    Zhao, Sicong
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 553 - 557
  • [7] Multi-Scale Kernelized Least Squares for Visual Tracking
    Liu, Junbin
    Xie, Weixin
    Li, Liangqun
    PROCEEDINGS OF 2016 IEEE 13TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP 2016), 2016, : 914 - 918
  • [8] MSST-ResNet: Deep multi-scale spatiotemporal features for robust visual object tracking
    Liu, Bing
    Liu, Qiao
    Zhu, Zhengyu
    Zhang, Taiping
    Yang, Yong
    KNOWLEDGE-BASED SYSTEMS, 2019, 164 : 235 - 252
  • [9] Adaptive multi-scale color feature target tracking algorithm
    Li Xiao-yun
    He Qiu-sheng
    Zhang Wei-feng
    Liang Hui-hui
    Chen Wei
    CHINESE JOURNAL OF LIQUID CRYSTALS AND DISPLAYS, 2019, 34 (03) : 291 - 301
  • [10] DeepTrack: Learning Discriminative Feature Representations Online for Robust Visual Tracking
    Li, Hanxi
    Li, Yi
    Porikli, Fatih
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (04) : 1834 - 1848