Tracking Objects as Points

Cited by: 877
Authors
Zhou, Xingyi [1]
Koltun, Vladlen [2]
Krahenbuhl, Philipp [1]
Affiliations
[1] UT Austin, Austin, TX 78712 USA
[2] Intel Labs, Hillsboro, OR USA
Source
COMPUTER VISION - ECCV 2020, PT IV | 2020 / Vol. 12349
Funding
National Science Foundation (USA);
Keywords
Multi-object tracking; Conditioned detection; 3D object tracking;
DOI
10.1007/978-3-030-58548-8_28
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Tracking has traditionally been the art of following interest points through space and time. This changed with the rise of powerful deep networks. Nowadays, tracking is dominated by pipelines that perform object detection followed by temporal association, also known as tracking-by-detection. We present a simultaneous detection and tracking algorithm that is simpler, faster, and more accurate than the state of the art. Our tracker, CenterTrack, applies a detection model to a pair of images and detections from the prior frame. Given this minimal input, CenterTrack localizes objects and predicts their associations with the previous frame. That's it. CenterTrack is simple, online (no peeking into the future), and real-time. It achieves 67.8% MOTA on the MOT17 challenge at 22 FPS and 89.4% MOTA on the KITTI tracking benchmark at 15 FPS, setting a new state of the art on both datasets. CenterTrack is easily extended to monocular 3D tracking by regressing additional 3D attributes. Using monocular video input, it achieves 28.3% AMOTA@0.2 on the newly released nuScenes 3D tracking benchmark, substantially outperforming the monocular baseline on this benchmark while running at 28 FPS.
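The abstract says that CenterTrack localizes objects and predicts their associations with the previous frame, but it does not spell out how the association is computed. As a rough illustration only, the Python sketch below assumes each detection carries a predicted 2D offset back to its position in the prior frame and links it to the nearest unclaimed previous track by greedy distance matching. The function name `associate`, the dictionary keys, and the `max_dist` threshold are all hypothetical and are not taken from this record.

```python
import math


def associate(prev_tracks, detections, max_dist=50.0):
    """Greedy point association (illustrative assumption, not the paper's code).

    prev_tracks: dict mapping track id -> (x, y) center in the previous frame.
    detections:  list of dicts with 'center' (x, y), 'offset' (dx, dy) and 'score';
                 'offset' is the predicted displacement back to the previous frame.
    Returns a list with one track id per detection, in input order.
    """
    next_id = max(prev_tracks, default=-1) + 1
    unmatched = dict(prev_tracks)  # previous tracks not yet claimed
    ids = [None] * len(detections)

    # Visit detections from most to least confident.
    order = sorted(range(len(detections)), key=lambda i: -detections[i]["score"])
    for i in order:
        det = detections[i]
        # Project the current center back into the previous frame.
        px = det["center"][0] + det["offset"][0]
        py = det["center"][1] + det["offset"][1]

        best_id, best_dist = None, max_dist
        for track_id, (tx, ty) in unmatched.items():
            dist = math.hypot(px - tx, py - ty)
            if dist < best_dist:
                best_id, best_dist = track_id, dist

        if best_id is None:
            # Nothing close enough: start a new track.
            best_id = next_id
            next_id += 1
        else:
            del unmatched[best_id]
        ids[i] = best_id
    return ids
```

Under these assumptions, a detection whose back-projected center lands on an existing track inherits that track's id, and a detection with no previous track within `max_dist` pixels starts a new one.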
Pages: 474-490
Number of pages: 17