Joint Object Detection and Multi-Object Tracking with Graph Neural Networks

Cited by: 160
Authors
Wang, Yongxin [1]
Kitani, Kris [1]
Weng, Xinshuo [1]
Affiliations
[1] Carnegie Mellon Univ, Robot Inst, Pittsburgh, PA 15213 USA
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021) | 2021
Keywords
MULTITARGET;
DOI
10.1109/ICRA48506.2021.9561110
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Object detection and data association are critical components in multi-object tracking (MOT) systems. Although the two components depend on each other, prior works often design the detection and data association modules separately and train them with separate objectives. As a result, one cannot back-propagate the gradients and optimize the entire MOT system, which leads to sub-optimal performance. To address this issue, recent works simultaneously optimize the detection and data association modules under a joint MOT framework, which has shown improved performance in both modules. In this work, we propose a new instance of the joint MOT approach based on Graph Neural Networks (GNNs). The key idea is that GNNs can model relations between variable-sized objects in both the spatial and temporal domains, which is essential for learning discriminative features for detection and data association. Through extensive experiments on the MOT15/16/17/20 datasets, we demonstrate the effectiveness of our GNN-based joint MOT approach and show state-of-the-art performance for both detection and MOT tasks.
Pages: 13708-13715
Page count: 8
Related papers
87 in total
  • [21] Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor
    Choi, Wongun
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 3029 - 3037
  • [22] Virtual to Real Adaptation of Pedestrian Detectors
    Ciampi, Luca
    Messina, Nicola
    Falchi, Fabrizio
    Gennaro, Claudio
    Amato, Giuseppe
    [J]. SENSORS, 2020, 20 (18) : 1 - 14
  • [23] Cipolla R., 2018, CVPR
  • [24] Histograms of oriented gradients for human detection
    Dalal, N
    Triggs, B
    [J]. 2005 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2005, : 886 - 893
  • [25] Dendorfer P., 2019, TPAMI
  • [26] Dendorfer P., 2020, Mot20: A benchmark for multi object tracking in crowded scenes
  • [27] Dollár P, 2009, PROC CVPR IEEE, P304, DOI 10.1109/CVPRW.2009.5206631
  • [28] CenterNet: Keypoint Triplets for Object Detection
    Duan, Kaiwen
    Bai, Song
    Xie, Lingxi
    Qi, Honggang
    Huang, Qingming
    Tian, Qi
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 6568 - 6577
  • [29] Recurrent Autoregressive Networks for Online Multi-Object Tracking
    Fang, Kuan
    Xiang, Yu
    Li, Xiaocheng
    Savarese, Silvio
    [J]. 2018 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2018), 2018, : 466 - 475
  • [30] Detect to Track and Track to Detect
    Feichtenhofer, Christoph
    Pinz, Axel
    Zisserman, Andrew
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 3057 - 3065