EagerMOT: 3D Multi-Object Tracking via Sensor Fusion

Cited by: 136
Authors
Kim, Aleksandr [1]
Osep, Aljosa [1]
Leal-Taixe, Laura [1]
Affiliations
[1] Tech Univ Munich, Munich, Germany
Source
2021 IEEE International Conference on Robotics and Automation (ICRA 2021) | 2021
DOI
10.1109/ICRA48506.2021.9562072
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Multi-object tracking (MOT) enables mobile robots to perform well-informed motion planning and navigation by localizing surrounding objects in 3D space and time. Existing methods rely on depth sensors (e.g., LiDAR) to detect and track targets in 3D space, but only up to a limited sensing range due to the sparsity of the signal. On the other hand, cameras provide a dense and rich visual signal that helps to localize even distant objects, but only in the image domain. In this paper, we propose EagerMOT, a simple tracking formulation that eagerly integrates all available object observations from both sensor modalities to obtain a well-informed interpretation of the scene dynamics. Using images, we can identify distant incoming objects, while depth estimates allow for precise trajectory localization as soon as objects are within the depth-sensing range. With EagerMOT, we achieve state-of-the-art results across several MOT tasks on the KITTI and NuScenes datasets. Our code is available at https://github.com/aleksandrkim61/EagerMOT
Pages: 11315-11321
Number of pages: 7