ShaSTA: Modeling Shape and Spatio-Temporal Affinities for 3D Multi-Object Tracking

Cited by: 11
Authors
Sadjadpour, Tara [1]
Li, Jie [2]
Ambrus, Rares [3]
Bohg, Jeannette [1]
Affiliations
[1] Stanford Univ, Sch Engn, Comp Sci Dept, Stanford, CA 94305 USA
[2] NVIDIA, Santa Clara, CA 95051 USA
[3] Toyota Res Inst, Los Altos, CA 94022 USA
Keywords
Computer vision for transportation; deep learning for visual perception; visual tracking
DOI
10.1109/LRA.2023.3323124
Chinese Library Classification (CLC)
TP24 [Robotics]
Subject Classification Codes
080202; 1405
Abstract
Multi-object tracking (MOT) is a cornerstone capability of any robotic system. Tracking quality depends largely on the quality of the input detections. In many applications, such as autonomous driving, it is preferable to over-detect objects to avoid catastrophic outcomes caused by missed detections. As a result, current state-of-the-art 3D detectors produce high rates of false positives to ensure a low number of false negatives. This can negatively affect tracking by making data association and track lifecycle management more challenging. Additionally, occasional false-negative detections due to difficult scenarios like occlusions can harm tracking performance. To address these issues in a unified framework, we propose ShaSTA, which learns shape and spatio-temporal affinities between tracks and detections in consecutive frames. The affinity is a probabilistic matching that leads to robust data association, track lifecycle management, false-positive elimination, false-negative propagation, and sequential track confidence refinement. We offer the first self-contained framework that addresses all aspects of the 3D MOT problem. We quantitatively evaluate ShaSTA on the nuScenes tracking benchmark using five metrics, including AMOTA, the most common tracking-accuracy metric, to demonstrate how ShaSTA may impact the ultimate goal of an autonomous mobile agent. ShaSTA achieves first place among LiDAR-only trackers that use CenterPoint detections.
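
Since the abstract describes affinity-based association only at a high level, the following minimal Python sketch illustrates how a probabilistic track-detection affinity matrix might drive data association and lifecycle decisions. This is not the authors' implementation: ShaSTA's learned shape and spatio-temporal affinity network is stubbed out as an input matrix, and the function name, the 0.5 acceptance threshold, and the use of the Hungarian algorithm (the record does not specify the matcher) are all illustrative assumptions.

import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(affinity: np.ndarray, threshold: float = 0.5):
    """Match tracks (rows) to detections (columns) given a probabilistic
    affinity matrix; pairs scoring below `threshold` are rejected.
    NOTE: the matrix would come from a learned affinity model (stubbed here),
    and the threshold value is an assumption, not taken from the paper."""
    rows, cols = linear_sum_assignment(affinity, maximize=True)
    matches = [(r, c) for r, c in zip(rows, cols) if affinity[r, c] >= threshold]
    matched_rows = {r for r, _ in matches}
    matched_cols = {c for _, c in matches}
    # Unmatched tracks are candidates for false-negative propagation
    # (keeping an occluded track alive); unmatched detections are candidates
    # for false-positive elimination or new-track initialization.
    unmatched_tracks = [r for r in range(affinity.shape[0]) if r not in matched_rows]
    unmatched_dets = [c for c in range(affinity.shape[1]) if c not in matched_cols]
    return matches, unmatched_tracks, unmatched_dets

# Example: two tracks, three detections; affinities in [0, 1].
aff = np.array([[0.9, 0.1, 0.2],
                [0.2, 0.3, 0.05]])
print(associate(aff))  # -> ([(0, 0)], [1], [1, 2])

Rejecting low-affinity pairs after the optimal assignment is what turns a pure matching step into lifecycle management: the leftovers on each side feed the false-negative propagation and false-positive elimination behaviors the abstract describes.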
Pages: 4273-4280
Number of pages: 8