Spatio-temporal interactive fusion based visual object tracking method

Cited: 0
Authors
Huang, Dandan [1 ]
Yu, Siyu [1 ]
Duan, Jin [1 ]
Wang, Yingzhi [1 ]
Yao, Anni [1 ]
Wang, Yiwen [1 ]
Xi, Junhan [1 ]
Affiliations
[1] Changchun Univ Sci & Technol, Coll Elect Informat Engn, Changchun, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
object tracking; spatio-temporal context; feature enhancement; feature fusion; attention mechanism;
DOI
10.3389/fphy.2023.1269638
CLC number
O4 [Physics];
Discipline code
0702;
Abstract
Visual object tracking methods often struggle to exploit inter-frame correlation and to handle challenges such as local occlusion, deformation, and background interference. To address these issues, this paper proposes a spatio-temporal interactive fusion (STIF) based visual object tracking method. The goal is to fully exploit spatio-temporal background information, strengthen feature representations for object recognition, improve tracking accuracy, adapt to object appearance changes, and reduce model drift. The proposed method incorporates feature-enhancement networks in both the temporal and spatial dimensions, using spatio-temporal background information to extract salient features that improve object recognition and tracking accuracy. A spatio-temporal interactive fusion network then learns a similarity metric between the memory frame and the query frame from the enhanced features, filtering out stronger feature representations through the interactive fusion of information. The proposed tracker is evaluated on four challenging public datasets, where it achieves state-of-the-art (SOTA) performance and markedly improves tracking accuracy in complex scenarios affected by local occlusion, deformation, and background interference. Notably, it achieves a success rate of 78.8% on the large-scale TrackingNet dataset.
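The memory/query similarity step described in the abstract can be sketched roughly as follows. This is a minimal illustration only: the function name, tensor shapes, and the plain dot-product attention used here are assumptions for the sketch, not the paper's exact STIF formulation. Each query-frame location attends over all memory-frame locations and aggregates memory features weighted by similarity.

```python
import numpy as np

def spatio_temporal_fusion(memory_feats, query_feat):
    """Sketch of a memory/query similarity-fusion step (assumed form).

    memory_feats: (T, C, H, W) features from T memory frames
    query_feat:   (C, H, W) features from the query frame
    Returns a fused (C, H, W) map in which memory content is weighted
    by its dot-product similarity to each query location.
    """
    T, C, H, W = memory_feats.shape
    mem = memory_feats.transpose(1, 0, 2, 3).reshape(C, T * H * W)  # memory keys/values
    qry = query_feat.reshape(C, H * W)                              # query vectors
    # similarity between every query location and every memory location
    sim = qry.T @ mem / np.sqrt(C)                 # (H*W, T*H*W)
    attn = np.exp(sim - sim.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)        # softmax over memory positions
    fused = (mem @ attn.T).reshape(C, H, W)        # aggregate memory features
    return fused
```

In a full tracker, a map like `fused` would typically be combined with the query features and passed to a classification/regression head to localize the target; those stages are omitted here.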
Pages: 14