To the Point: Efficient 3D Object Detection in the Range Image with Graph Convolution Kernels

Cited by: 49
Authors
Chai, Yuning [1]
Sun, Pei [1]
Ngiam, Jiquan [2]
Wang, Weiyue [1]
Caine, Benjamin [2]
Vasudevan, Vijay [2]
Zhang, Xiao [1]
Anguelov, Dragomir [1]
Affiliations
[1] Waymo LLC, Mountain View, CA 94043 USA
[2] Google Brain, Mountain View, CA USA
Source
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021) | 2021
DOI
10.1109/CVPR46437.2021.01574
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
3D object detection is vital for many robotics applications. For tasks where a 2D perspective range image exists, we propose to learn a 3D representation directly from this range image view. To this end, we designed a 2D convolutional network architecture that carries the 3D spherical coordinates of each pixel throughout the network. Its layers can consume any arbitrary convolution kernel in place of the default inner product kernel and exploit the underlying local geometry around each pixel. We outline four such kernels: a dense kernel according to the bag-of-words paradigm, and three graph kernels inspired by recent graph neural network advances: the Transformer, the PointNet, and the Edge Convolution. We also explore cross-modality fusion with the camera image, facilitated by operating in the perspective range image view. Our method performs competitively on the Waymo Open Dataset and improves the state-of-the-art AP for pedestrian detection from 69.7% to 75.5%. It is also efficient in that our smallest model, which still outperforms the popular PointPillars in quality, requires 180 times fewer FLOPS and model parameters.
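The abstract's idea of carrying each pixel's 3D spherical coordinates through a 2D network, so that a kernel can use the local geometry around the pixel, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the inclination convention (measured from the horizontal plane), and the fixed square window are all assumptions made for illustration.

```python
import numpy as np


def spherical_to_cartesian(rng, azimuth, inclination):
    """Convert per-pixel spherical coordinates to Cartesian (x, y, z).

    All inputs are (H, W) arrays; angles are in radians.
    Assumption: inclination is measured from the horizontal plane.
    """
    x = rng * np.cos(inclination) * np.cos(azimuth)
    y = rng * np.cos(inclination) * np.sin(azimuth)
    z = rng * np.sin(inclination)
    return np.stack([x, y, z], axis=-1)  # shape (H, W, 3)


def neighbor_offsets(points, row, col, radius=1):
    """Return 3D offsets from pixel (row, col) to the pixels in its window.

    Hypothetical helper: the window is clipped at the image border, so
    edge pixels get fewer neighbors than interior pixels.
    """
    h, w, _ = points.shape
    r0, r1 = max(row - radius, 0), min(row + radius + 1, h)
    c0, c1 = max(col - radius, 0), min(col + radius + 1, w)
    window = points[r0:r1, c0:c1].reshape(-1, 3)
    return window - points[row, col]
```

Relative offsets of this kind are the sort of local geometric input that a graph-style kernel (for example a PointNet-like or edge-convolution-like kernel) could consume in place of a plain inner product over pixel features.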
Pages: 15995-16004
Page count: 10