Online Multi-Scale Classification and Global Feature Modulation for Robust Visual Tracking

Cited by: 3
Authors
Gao, Qi [1]
Yin, Mingfeng [2]
Wu, Xiang [3]
Liu, Di [4]
Bo, Yuming [3]
Affiliations
[1] Jiangsu Univ Technol, Coll Mech Engn, Changzhou 213001, Peoples R China
[2] Jiangsu Univ Technol, Sch Automobile & Traff Engn, Changzhou 213001, Peoples R China
[3] Nanjing Univ Sci & Technol, Sch Automat, Nanjing 210094, Peoples R China
[4] Nanjing Inst Technol, Sch Automat, Nanjing 211167, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Visualization; Target tracking; Accuracy; Fuses; Modulation; Transformers; Real-time systems; Visual object tracking; coordinate attention; online multi-scale classification; global feature modulation; OBJECT TRACKING;
DOI
10.1109/TCSVT.2023.3343949
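Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract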
Recent advanced trackers, which combine discriminative classification with dedicated bounding box estimation, have markedly improved the performance of visual object tracking. However, existing methods cannot satisfy the demands of tracking in complex scenes involving challenges such as occlusion and scale variation. To this end, we propose a novel tracker with online multi-scale classification and global feature modulation for robust visual tracking. It is built on accurate tracking by overlap maximization (ATOM) and is named ATOM+. First, coordinate attention (CA) is applied to enhance the target features in the channel and spatial dimensions, which improves the feature representation ability of the backbone network. Second, an online multi-scale classification (OMC) module is designed. During online tracking, it generates more reliable matching responses by aggregating target-related information from different scales. This operation enables the tracker to perceive the target stably, particularly when the appearance and posture of the target change severely. Third, a global feature modulation (GFM) mechanism, which requires only a small amount of computation, is constructed to fuse the spatial contextual information of the template image into the search region. This integration refines the bounding box to obtain an accurate estimate of the target state. Finally, comprehensive experiments on the OTB100, LaSOT, and VOT2018 tracking benchmarks show that our tracker handles a range of challenging scenarios and achieves state-of-the-art performance. Our tracker runs at an average speed of 37 FPS, which is real time.
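The abstract does not say how the OMC module aggregates responses across scales. As a rough illustration of the general idea only, the sketch below fuses equally sized response maps with a weighted average and takes the peak as the target location. The fusion rule, the weights, and the function names `fuse_responses` and `peak` are assumptions made for this example, not the authors' method.

```python
# Hypothetical sketch: fuse classification response maps from several scales.
# The weighted-average rule is an assumption, not the paper's OMC module.

def fuse_responses(responses, weights=None):
    """Return the weighted average of equally sized 2-D response maps."""
    if not responses:
        raise ValueError("need at least one response map")
    if weights is None:
        weights = [1.0] * len(responses)
    if len(weights) != len(responses):
        raise ValueError("one weight per response map is required")
    total = sum(weights)
    rows, cols = len(responses[0]), len(responses[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for resp, w in zip(responses, weights):
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += w * resp[r][c] / total
    return fused


def peak(response):
    """Return the (row, col) position of the largest response value."""
    cells = [(r, c) for r in range(len(response)) for c in range(len(response[0]))]
    return max(cells, key=lambda rc: response[rc[0]][rc[1]])


# Toy maps: the small scale is fooled by a distractor at (1, 0),
# while the medium and large scales agree on (0, 1).
small = [[0.1, 0.2], [0.9, 0.1]]
medium = [[0.1, 0.8], [0.3, 0.1]]
large = [[0.2, 0.7], [0.2, 0.1]]

fused = fuse_responses([small, medium, large])
print(peak(small), peak(fused))
```

In this toy case a single scale on its own picks the distractor, while the fused map follows the location on which most scales agree, which is the kind of stability the abstract attributes to aggregating multi-scale information.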
Pages: 5321-5334
Page count: 14