Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation

Cited by: 87

Authors:

Sun, Jiaming [1,2]

Chen, Linghao [1]

Xie, Yiming [1]

Zhang, Siyu [3]

Jiang, Qinhong [2]

Zhou, Xiaowei [1]

Bao, Hujun [1]

Affiliations:

[1] Zhejiang Univ, State Key Lab CAD & CG, Hangzhou, Peoples R China

[2] SenseTime, Hong Kong, Peoples R China

[3] Southern Univ Sci & Technol, Shenzhen, Peoples R China

Source:

2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020

Keywords:

DOI:

10.1109/CVPR42600.2020.01056

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose a novel system named Disp R-CNN for 3D object detection from stereo images. Many recent works solve this problem by first recovering a point cloud with disparity estimation and then applying a 3D detector. The disparity map is computed for the entire image, which is costly and fails to leverage category-specific priors. In contrast, we design an instance disparity estimation network (iDispNet) that predicts disparity only for pixels on objects of interest and learns a category-specific shape prior for more accurate disparity estimation. To address the scarcity of disparity annotations for training, we propose to use a statistical shape model to generate dense disparity pseudo-ground-truth without the need for LiDAR point clouds, which makes our system more widely applicable. Experiments on the KITTI dataset show that, even when LiDAR ground-truth is not available at training time, Disp R-CNN achieves competitive performance and outperforms previous state-of-the-art methods by 20% in terms of average precision. The code will be available at https://github.com/zju3dv/disprcnn.
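The step the abstract calls "recovering a point cloud with disparity estimation", restricted to pixels on objects of interest, can be illustrated with standard rectified-stereo geometry (depth = focal length × baseline / disparity). The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `instance_disparity_to_points`, its parameters, and the list-of-lists image layout are all hypothetical, and the learned parts of the system (iDispNet, the shape prior, the 3D detector) are not modeled.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def instance_disparity_to_points(
    disparity: List[List[float]],  # per-pixel disparity in pixels (hypothetical layout)
    mask: List[List[bool]],        # True where the pixel belongs to the object instance
    focal_px: float,               # focal length in pixels (assumed equal for x and y)
    baseline_m: float,             # stereo baseline in meters
    cx: float,                     # principal point, x (pixels)
    cy: float,                     # principal point, y (pixels)
) -> List[Point3D]:
    """Back-project disparity to 3D camera-frame points for masked pixels only."""
    points: List[Point3D] = []
    for v, (disp_row, mask_row) in enumerate(zip(disparity, mask)):
        for u, (d, on_object) in enumerate(zip(disp_row, mask_row)):
            # Skip background pixels and invalid (non-positive) disparities.
            if not on_object or d <= 0.0:
                continue
            z = focal_px * baseline_m / d   # depth from rectified-stereo geometry
            x = (u - cx) * z / focal_px     # pinhole back-projection
            y = (v - cy) * z / focal_px
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    disp = [[0.0, 10.0], [20.0, 5.0]]
    inst = [[True, True], [False, True]]
    for p in instance_disparity_to_points(disp, inst, 100.0, 0.5, 0.0, 0.0):
        print(p)
```

Only the masked pixels are visited for back-projection, which mirrors the abstract's point that computing disparity for the entire image is costly when only objects of interest matter.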


Pages: 10545-10554

Page count: 10

34 references in total

[1] nuScenes: A multimodal dataset for autonomous driving [J].

Caesar, Holger; Bankiti, Varun; Lang, Alex H.; Vora, Sourabh; Liong, Venice Erin; Xu, Qiang; Krishnan, Anush; Pan, Yu; Baldan, Giancarlo; Beijbom, Oscar.

2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 11618-11628

[2] Chang A X, 2015, COMPUTER SCI, V1512, P3

[3] Pyramid Stereo Matching Network [J].

Chang, Jia-Ren; Chen, Yong-Sheng.

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 5410-5418

[4] Multi-View 3D Object Detection Network for Autonomous Driving [J].

Chen, Xiaozhi; Ma, Huimin; Wan, Ji; Li, Bo; Xia, Tian.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 6526-6534

[5] Chen XZ, 2015, ADV NEUR IN, V28

[6] On Robustness of Neural Architecture Search Under Label Noise [J].

Chen, Yi-Wei; Song, Qingquan; Liu, Xi; Sastry, P. S.; Hu, Xia.

FRONTIERS IN BIG DATA, 2020, 3

[7] Cline, 1987, COMPUT GRAPH, V21, P163, DOI 10.1145/37402.37422

[8] Joint Object Pose Estimation and Shape Reconstruction in Urban Street Scenes Using 3D Shape Priors [J].

Engelmann, Francis; Stueckler, Joerg; Leibe, Bastian.

PATTERN RECOGNITION, GCPR 2016, 2016, 9796: 219-230

[9] Geiger A, 2012, PROC CVPR IEEE, P3354, DOI 10.1109/CVPR.2012.6248074

[10] He KM, 2020, IEEE T PATTERN ANAL, V42, P386, DOI [10.1109/TPAMI.2018.2844175, 10.1109/ICCV.2017.322]
