Joint Object Detection and Multi-Object Tracking with Graph Neural Networks

Citations: 160
Authors
Wang, Yongxin [1]
Kitani, Kris [1]
Weng, Xinshuo [1]
Affiliations
[1] Carnegie Mellon Univ, Robot Inst, Pittsburgh, PA 15213 USA
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021) | 2021
Keywords
MULTITARGET
DOI
10.1109/ICRA48506.2021.9561110
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Object detection and data association are critical components in multi-object tracking (MOT) systems. Although the two components depend on each other, prior works often design the detection and data association modules separately and train them with separate objectives. As a result, one cannot back-propagate the gradients and optimize the entire MOT system, which leads to sub-optimal performance. To address this issue, recent works simultaneously optimize the detection and data association modules under a joint MOT framework, which has shown improved performance in both modules. In this work, we propose a new instance of a joint MOT approach based on Graph Neural Networks (GNNs). The key idea is that GNNs can model relations between variable-sized objects in both the spatial and temporal domains, which is essential for learning discriminative features for detection and data association. Through extensive experiments on the MOT15/16/17/20 datasets, we demonstrate the effectiveness of our GNN-based joint MOT approach and show state-of-the-art performance for both detection and MOT tasks.
Pages: 13708-13715
Page count: 8