Kinematic 3D Object Detection in Monocular Video

Cited by: 134
Authors
Brazil, Garrick [1]
Pons-Moll, Gerard [2]
Liu, Xiaoming [1]
Schiele, Bernt [2]
Affiliations
[1] Michigan State Univ, Comp Sci & Engn, E Lansing, MI 48824 USA
[2] Max Planck Inst Informat, Saarland Informatics Campus, Saarbrucken, Germany
Source
COMPUTER VISION - ECCV 2020, PT XXIII | 2020 / Vol. 12368
Keywords
3D object detection; Monocular; Video; Physics-based
DOI
10.1007/978-3-030-58592-1_9
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Perceiving the physical world in 3D is fundamental for self-driving applications. Although temporal motion is an invaluable resource to human vision for detection, tracking, and depth perception, such features have not been thoroughly utilized in modern 3D object detectors. In this work, we propose a novel method for monocular video-based 3D object detection which leverages kinematic motion to extract scene dynamics and improve localization accuracy. We first propose a novel decomposition of object orientation and a self-balancing 3D confidence. We show that both components are critical to enable our kinematic model to work effectively. Collectively, using only a single model, we efficiently leverage 3D kinematics from monocular videos to improve the overall localization precision in 3D object detection while also producing useful by-products of scene dynamics (ego-motion and per-object velocity). We achieve state-of-the-art performance on monocular 3D object detection and the Bird's Eye View tasks within the KITTI self-driving dataset.
Pages: 135-152
Number of pages: 18