HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object Detection

Cited by: 113
Authors
Noh, Jongyoun [1]
Lee, Sanghoon [1]
Ham, Bumsub [1]
Affiliations
[1] Yonsei Univ, Sch Elect & Elect Engn, Seoul, South Korea
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
Funding
National Research Foundation of Singapore
Keywords
DOI
10.1109/CVPR46437.2021.01437
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We address the problem of 3D object detection, that is, estimating 3D object bounding boxes from point clouds. 3D object detection methods exploit either voxel-based or point-based features to represent 3D objects in a scene. Voxel-based features are efficient to extract, but they fail to preserve fine-grained 3D structures of objects. Point-based features, on the other hand, represent the 3D structures more accurately, but extracting these features is computationally expensive. We introduce in this paper a novel single-stage 3D detection method that has the merits of both voxel-based and point-based features. To this end, we propose a new convolutional neural network (CNN) architecture, dubbed HVPR, that integrates both features into a single 3D representation effectively and efficiently. Specifically, we augment the point-based features with a memory module to reduce the computational cost. We then aggregate the features in the memory that are semantically similar to each voxel-based one to obtain a hybrid 3D representation in the form of a pseudo image, which allows us to localize 3D objects in a single stage efficiently. We also propose an Attentive Multi-scale Feature Module (AMFM) that extracts scale-aware features considering the sparse and irregular patterns of point clouds. Experimental results on the KITTI dataset demonstrate the effectiveness and efficiency of our approach, achieving a better trade-off between speed and accuracy.
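The memory aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or how the aggregated features are combined with the voxel-based ones, so the dot-product similarity, the softmax weighting, and the concatenation below are assumptions, and the names `aggregate_from_memory` and `hybrid_representation` are hypothetical rather than taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def aggregate_from_memory(voxel_features, memory_items):
    """For each voxel feature, read a similarity-weighted sum of memory items.

    Assumption: similarity is a dot product and weights come from a softmax.
    """
    dim = len(memory_items[0])
    reads = []
    for v in voxel_features:
        weights = softmax([dot(v, m) for m in memory_items])
        read = [sum(w * m[d] for w, m in zip(weights, memory_items))
                for d in range(dim)]
        reads.append(read)
    return reads


def hybrid_representation(voxel_features, memory_items):
    """Concatenate each voxel feature with its memory read (an assumption)."""
    reads = aggregate_from_memory(voxel_features, memory_items)
    return [v + r for v, r in zip(voxel_features, reads)]


if __name__ == "__main__":
    voxels = [[1.0, 0.0], [0.0, 1.0]]               # toy voxel-based features
    memory = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]   # toy memory items
    for row in hybrid_representation(voxels, memory):
        print([round(x, 3) for x in row])
```

In this sketch the memory stands in for point-based features at inference time, so no per-point feature extraction is needed when reading from it; each voxel receives a blend of the memory items most similar to it.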
Pages: 14600-14609
Number of pages: 10