TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking

Citations: 104
Authors
Chu, Peng [1]
Wang, Jiang [1]
You, Quanzeng [1]
Ling, Haibin [2]
Liu, Zicheng [1]
Affiliations
[1] Microsoft Corp, Redmond, WA 98052 USA
[2] SUNY Stony Brook, Stony Brook, NY 11794 USA
Source
2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) | 2023
Funding
National Science Foundation (USA)
DOI
10.1109/WACV56688.2023.00485
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Tracking multiple objects in videos relies on modeling the spatial-temporal interactions of the objects. In this paper, we propose TransMOT, which leverages powerful graph transformers to efficiently model the spatial and temporal interactions among the objects. TransMOT is capable of effectively modeling the interactions of a large number of objects by arranging the trajectories of the tracked targets and detection candidates as a set of sparse weighted graphs, and constructing a spatial graph transformer encoder layer, a temporal transformer encoder layer, and a spatial graph transformer decoder layer based on the graphs. Through end-to-end learning, TransMOT can exploit the spatial-temporal clues to directly estimate association from a large number of loosely filtered detection predictions for robust MOT in complex scenes. The proposed method is evaluated on multiple benchmark datasets, including MOT15, MOT16, MOT17, and MOT20, and it achieves state-of-the-art performance on all the datasets.
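The abstract's idea of arranging objects as a sparse weighted graph and attending only along its edges can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`iou`, `build_weighted_graph`, `graph_attention`) are hypothetical, the use of bounding-box IoU as the edge weight is an assumption that the abstract does not confirm, and the attention below has no learned projections or multiple heads.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def build_weighted_graph(boxes):
    """Weighted adjacency matrix for one frame.

    Assumption: edge weight = IoU, so non-overlapping objects are not
    connected and the graph stays sparse. Every node gets a self-loop.
    """
    n = len(boxes)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = iou(boxes[i], boxes[j])
    return adj


def graph_attention(features, adj):
    """Single-head dot-product attention restricted to graph edges.

    Each node attends only to its neighbours; the softmax terms are
    scaled by the edge weight and then renormalised.
    """
    n, d = len(features), len(features[0])
    out = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j] > 0]  # always contains i
        scores = [
            sum(features[i][k] * features[j][k] for k in range(d)) / math.sqrt(d)
            for j in nbrs
        ]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) * adj[i][j] for s, j in zip(scores, nbrs)]
        total = sum(weights)
        out.append([
            sum(w * features[j][k] for w, j in zip(weights, nbrs)) / total
            for k in range(d)
        ])
    return out
```

In this sketch an object that overlaps no other box attends only to itself, so its feature vector passes through unchanged, while overlapping objects exchange information in proportion to their overlap.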
Pages: 4859-4869
Page count: 11