Joint spatio-temporal modeling for visual tracking

Cited by: 5
Authors
Sun, Yumei [1 ,2 ,3 ,4 ,5 ]
Tang, Chuanming [1 ,2 ,3 ,4 ,5 ]
Luo, Hui [1 ,2 ,3 ,4 ,5 ]
Li, Qingqing [1 ,2 ,3 ,5 ]
Peng, Xiaoming [5 ]
Zhang, Jianlin [1 ,2 ,3 ,4 ,5 ]
Li, Meihui [1 ,2 ,3 ,5 ]
Wei, Yuxing [1 ,2 ,3 ,5 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 108408, Peoples R China
[2] Chinese Acad Sci, Key Lab Opt Engn, Chengdu 610209, Peoples R China
[3] Chinese Acad Sci, Inst Opt & Elect, Chengdu 610209, Peoples R China
[4] Chinese Acad Sci, Natl Key Lab Opt Field Manipulat Sci & Technol, Chengdu 610209, Peoples R China
[5] Univ Elect Sci & Technol China, Sch Automat Engn, Chengdu 611731, Peoples R China
Keywords
Visual tracking; Siamese trackers; Sequence prediction; Spatio-temporal model;
DOI
10.1016/j.knosys.2023.111206
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Similarity-based approaches have made significant progress in visual object tracking (VOT). Although these methods work well in simple scenes, they ignore the continuous spatio-temporal connection of the object in the video sequence. As a result, tracking by spatial matching alone can fail in the presence of distractors and occlusion. In this paper, we propose a joint spatio-temporal modeling tracker named STTrack, which implicitly builds continuous connections between the temporal and spatial aspects of the sequence. Specifically, we first design a time-sequence iteration strategy (TSIS) to concentrate on the temporal connection of the object in the video sequence. Then, we propose a novel spatial-temporal interaction Transformer network (STIN) to capture the spatio-temporal correlation of the object between frames. The proposed STIN module is robust to object occlusion because it explores the dynamic state-change dependencies of the object. Finally, we introduce a spatio-temporal query to suppress distractors by iteratively propagating the target prior. Extensive experiments on six tracking benchmark datasets demonstrate that the proposed STTrack achieves excellent performance while operating in real time. The code is publicly available at https://github.com/nubsym/STTrack.
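The idea of iteratively propagating a target prior through a query can be illustrated with a small NumPy sketch. This is not the STTrack implementation: the single-head attention, the momentum update, the function names (`cross_attention`, `propagate_query`), and the synthetic features are all assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, tokens):
    # Single-head scaled dot-product attention (an assumed stand-in):
    # query has shape (1, d), tokens has shape (n, d).
    d = query.shape[-1]
    weights = softmax(query @ tokens.T / np.sqrt(d))  # (1, n)
    return weights @ tokens, weights

def propagate_query(frames, init_query, momentum=0.5):
    # Hypothetical update rule: blend the previous query with the features
    # it attended to, so the target prior is carried from frame to frame.
    query = init_query
    peaks = []
    for tokens in frames:
        attended, weights = cross_attention(query, tokens)
        peaks.append(int(weights.argmax()))  # most attended token index
        query = momentum * query + (1.0 - momentum) * attended
    return query, peaks

# Synthetic sequence: weak background tokens plus one strong target token
# that moves by one position per frame.
rng = np.random.default_rng(0)
d, n = 16, 25
target = rng.normal(size=(1, d))
frames = []
for t in range(5):
    tokens = 0.1 * rng.normal(size=(n, d))
    tokens[t] = 3.0 * target[0]
    frames.append(tokens)

final_query, peaks = propagate_query(frames, target)
print(peaks)
```

In this toy setting the query keeps attending to the moving target token because each update reinforces the target direction; a real tracker would use learned projections and a prediction head rather than a raw argmax.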
Pages: 10
Related papers
50 in total
  • [1] Joint Spatio-Temporal Similarity and Discrimination Learning for Visual Tracking
    Liang, Yanjie
    Chen, Haosheng
    Wu, Qiangqiang
    Xia, Changqun
    Li, Jia
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (08) : 7284 - 7300
  • [2] Spatio-temporal joint aberrance suppressed correlation filter for visual tracking
    Libin Xu
    Pyoungwon Kim
    Mengjie Wang
    Jinfeng Pan
    Xiaomin Yang
    Mingliang Gao
    Complex & Intelligent Systems, 2022, 8 : 3765 - 3777
  • [3] Spatio-temporal joint aberrance suppressed correlation filter for visual tracking
    Xu, Libin
    Kim, Pyoungwon
    Wang, Mengjie
    Pan, Jinfeng
    Yang, Xiaomin
    Gao, Mingliang
    COMPLEX & INTELLIGENT SYSTEMS, 2022, 8 (05) : 3765 - 3777
  • [4] Hypothesis Testing Based Tracking With Spatio-Temporal Joint Interaction Modeling
    Sheng, Hao
    Zhang, Yang
    Wu, Yubin
    Wang, Shuai
    Lyu, Weifeng
    Ke, Wei
    Xiong, Zhang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (09) : 2971 - 2983
  • [5] Spatio-temporal Active Learning for Visual Tracking
    Liu, Chenfeng
    Zhu, Pengfei
    Hu, Qinghua
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [6] Learning Spatio-Temporal Transformer for Visual Tracking
    Yan, Bin
    Peng, Houwen
    Fu, Jianlong
    Wang, Dong
    Lu, Huchuan
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 10428 - 10437
  • [7] Spatio-temporal matching for siamese visual tracking
    Zhang, Jinpu
    Dai, Kaiheng
    Li, Ziwen
    Wei, Ruonan
    Wang, Yuehuan
    NEUROCOMPUTING, 2023, 522 : 73 - 88
  • [8] Online visual tracking by integrating spatio-temporal cues
    He, Yang
    Pei, Mingtao
    Yang, Min
    Wu, Yuwei
    Jia, Yunde
    IET COMPUTER VISION, 2015, 9 (01) : 124 - 137
  • [9] Learning spatio-temporal correlation filter for visual tracking
    Yan, Youmin
    Guo, Xixian
    Tang, Jin
    Li, Chenglong
    Wang, Xin
    NEUROCOMPUTING, 2021, 436 : 273 - 282
  • [10] Deep learning of spatio-temporal information for visual tracking
    Gwangmin Choe
    Ilmyong Son
    Chunhwa Choe
    Hyoson So
    Hyokchol Kim
    Gyongnam Choe
    Multimedia Tools and Applications, 2022, 81 : 17283 - 17302