TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking

Cited by: 104

Authors:

Chu, Peng [1]

Wang, Jiang [1]

You, Quanzeng [1]

Ling, Haibin [2]

Liu, Zicheng [1]

Affiliations:

[1] Microsoft Corp, Redmond, WA 98052 USA

[2] SUNY Stony Brook, Stony Brook, NY 11794 USA

Source:

2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV) | 2023

Funding:

National Science Foundation (USA)

DOI:

10.1109/WACV56688.2023.00485

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Tracking multiple objects in videos relies on modeling the spatial-temporal interactions of the objects. In this paper, we propose TransMOT, which leverages powerful graph transformers to efficiently model the spatial and temporal interactions among the objects. TransMOT is capable of effectively modeling the interactions of a large number of objects by arranging the trajectories of the tracked targets and detection candidates as a set of sparse weighted graphs, and constructing a spatial graph transformer encoder layer, a temporal transformer encoder layer, and a spatial graph transformer decoder layer based on the graphs. Through end-to-end learning, TransMOT can exploit the spatial-temporal clues to directly estimate association from a large number of loosely filtered detection predictions for robust MOT in complex scenes. The proposed method is evaluated on multiple benchmark datasets, including MOT15, MOT16, MOT17, and MOT20, and it achieves state-of-the-art performance on all the datasets.
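The abstract's idea of arranging objects as sparse weighted graphs and restricting attention to those graphs can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes, purely for illustration, that edge weights are the IoU between bounding boxes and that attention is masked to graph neighbours. The names `iou`, `build_sparse_graph`, and `graph_masked_softmax` are hypothetical.

```python
import math


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def build_sparse_graph(boxes):
    """Map (i, j) -> weight for every pair of distinct, overlapping boxes.

    Assumption: IoU is used as the edge weight; non-overlapping pairs
    get no edge, which is what keeps the graph sparse.
    """
    edges = {}
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            w = iou(boxes[i], boxes[j])
            if w > 0.0:
                edges[(i, j)] = edges[(j, i)] = w
    return edges


def graph_masked_softmax(scores, edges):
    """Row-wise softmax where node i attends only to itself and its neighbours."""
    n = len(scores)
    out = []
    for i in range(n):
        allowed = [j for j in range(n) if j == i or (i, j) in edges]
        m = max(scores[i][j] for j in allowed)  # subtract max for stability
        exps = {j: math.exp(scores[i][j] - m) for j in allowed}
        total = sum(exps.values())
        out.append([exps[j] / total if j in exps else 0.0 for j in range(n)])
    return out
```

With two overlapping boxes and one isolated box, only the overlapping pair gets an edge, so the isolated object attends to itself alone. The cost of attention then scales with the number of edges rather than with all object pairs.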

Pages: 4859 - 4869

Page count: 11

50 items in total

[1] STDFormer: Spatial-Temporal Motion Transformer for Multiple Object Tracking
Hu, Mengjie
Zhu, Xiaotong
Wang, Haotian
Cao, Shixiang
Liu, Chun
Song, Qing
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (11) : 6571 - 6594
[2] Spatial-temporal graph Transformer for object tracking against noise interference
Li, Ning
Sang, Haiwei
Zheng, Jiamin
Ma, Huawei
Wang, Xiaoying
Xiao, Fu'an
INFORMATION SCIENCES, 2024, 678
[3] Spatial-temporal Graph Transformer Network for Spatial-temporal Forecasting
Dao, Minh-Son
Zetsu, Koji
Hoang, Duy-Tang
Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024, 2024 : 1276 - 1281
[4] Concurrent Transformer for Spatial-Temporal Graph Modeling
Xie, Yi
Xiong, Yun
Zhu, Yangyong
Yu, Philip S.
Jin, Cheng
Wang, Qiang
Li, Haihong
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2022, PT III, 2022 : 314 - 321
[5] Object tracking in surveillance videos using spatial-temporal correlation graph model
Zhang, Cheng
Ma, Huadong
Fu, Huiyuan
Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics, 2015, 41 (04) : 713 - 720
[6] Modeling of Multiple Spatial-Temporal Relations for Robust Visual Object Tracking
Wang, Shilei
Wang, Zhenhua
Sun, Qianqian
Cheng, Gong
Ning, Jifeng
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 5073 - 5085
[7] SPRTracker: Learning Spatial-Temporal Pixel Aggregations for Multiple Object Tracking
Liu, Jialin
Kong, Jun
Jiang, Min
Liu, Tianshan
IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 2732 - 2736
[8] MASK GUIDED SPATIAL-TEMPORAL FUSION NETWORK FOR MULTIPLE OBJECT TRACKING
Zhao, Shuangye
Wu, Yubin
Wang, Shuai
Ke, Wei
Sheng, Hao
2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022 : 3231 - 3235
[9] A spatial-temporal contexts network for object tracking
Huang, Kai
Xiao, Kai
Chu, Jun
Leng, Lu
Dong, Xingbo
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 127
[10] A spatial-temporal graph gated transformer for traffic forecasting
Bouchemoukha, Haroun
Zennir, Mohamed Nadjib
Alioua, Ahmed
TRANSACTIONS ON EMERGING TELECOMMUNICATIONS TECHNOLOGIES, 2024, 35 (07)
