Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images

Cited by: 454
Authors
Song, Shuran [1]
Xiao, Jianxiong [1]
Affiliations
[1] Princeton Univ, Princeton, NJ 08544 USA
Source
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2016
Funding
National Science Foundation (USA);
DOI
10.1109/CVPR.2016.94
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We focus on the task of amodal 3D object detection in RGB-D images, which aims to produce a 3D bounding box of an object in metric form at its full extent. We introduce Deep Sliding Shapes, a 3D ConvNet formulation that takes a 3D volumetric scene from an RGB-D image as input and outputs 3D object bounding boxes. In our approach, we propose the first 3D Region Proposal Network (RPN) to learn objectness from geometric shapes and the first joint Object Recognition Network (ORN) to extract geometric features in 3D and color features in 2D. In particular, we handle objects of various sizes by training an amodal RPN at two different scales and an ORN to regress 3D bounding boxes. Experiments show that our algorithm outperforms the state-of-the-art by 13.8 in mAP and is 200x faster than the original Sliding Shapes.
Pages: 808-816
Page count: 9