SIEV-Net: A Structure-Information Enhanced Voxel Network for 3D Object Detection From LiDAR Point Clouds

Cited by: 26

Authors:

Yu, Chuanbo [1]

Lei, Jianjun [1]

Peng, Bo [1]

Shen, Haifeng [2]

Huang, Qingming [3]

Affiliations:

[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China

[2] Didi Chuxing, AIoT Platform, Beijing 100193, Peoples R China

[3] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 101408, Peoples R China

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2022 / Vol. 60

Funding:

National Natural Science Foundation of China;

Keywords:

Three-dimensional displays; Point cloud compression; Feature extraction; Object detection; Proposals; Laser radar; Semantics; 3D object detection; LiDAR point clouds; scene understanding; structure information;

DOI:

10.1109/TGRS.2022.3174483

Chinese Library Classification (CLC) number:

P3 [Geophysics]; P59 [Geochemistry];

Discipline classification codes:

0708 ; 070902 ;

Abstract:

As one of the fundamental tasks in scene understanding, 3D object detection from LiDAR point clouds has drawn extensive attention in the past few years. Although the existing voxel-based methods have achieved remarkable performance, how to effectively exploit geometric structure information of the point clouds to boost the detection performance remains to be explored. In this article, we propose a novel structure-information enhanced voxel network (SIEV-Net) for 3D object detection from LiDAR point clouds. The proposed SIEV-Net learns feature representations of 3D objects by jointly considering uneven spatial distribution and height information of the point clouds. Specifically, considering the uneven spatial distribution characteristics of point clouds, a hierarchical-voxel feature encoding module is proposed to effectively extract features of voxels in both sparse and dense regions. Besides, by utilizing the bird's eye view (BEV) map of point clouds, a height information complement module is designed to minimize the height information lost in the process of point feature aggregation in a voxel network. Experimental results on the widely used KITTI benchmark dataset have demonstrated the efficacy of the proposed SIEV-Net.
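The two ideas named in the abstract, voxel grouping over unevenly distributed points and a BEV map that keeps height, can be illustrated with a toy sketch. This is not the SIEV-Net implementation: the function names `voxelize` and `bev_max_height`, the voxel sizes, and the sample points are all assumptions made for illustration, and no learned feature encoding is shown.

```python
from collections import defaultdict
from math import floor

def voxelize(points, voxel_size):
    """Group (x, y, z) points by the integer index of the voxel containing them."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        voxels[key].append((x, y, z))
    return dict(voxels)

def bev_max_height(points, cell_size):
    """Project points onto the ground plane and keep the maximum z per BEV cell."""
    heights = {}
    for x, y, z in points:
        key = (floor(x / cell_size), floor(y / cell_size))
        heights[key] = max(z, heights.get(key, z))
    return heights

# Hypothetical toy scene: a dense cluster near the origin and one distant point.
points = [(0.1, 0.1, 0.2), (0.2, 0.3, 1.4), (0.3, 0.2, 0.9), (7.5, 7.5, 0.5)]

fine = voxelize(points, 0.5)    # small voxels separate the dense cluster
coarse = voxelize(points, 4.0)  # large voxels gather more context per voxel
bev = bev_max_height(points, 0.5)
```

In this toy scene, the coarse grid merges the three clustered points into one voxel, so any per-voxel aggregation (such as a mean) would blur their individual heights, while the BEV map still records the tallest point in each ground cell.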

Pages: 11

References: 41 in total

