RI-Fusion: 3D Object Detection Using Enhanced Point Features With Range-Image Fusion for Autonomous Driving

Times Cited: 0
Authors
Zhang, Xinyu [1 ,2 ]
Wang, Li [3 ,4 ,5 ]
Zhang, Guoxin [2 ,4 ]
Lan, Tianwei [2 ,4 ]
Zhang, Haoming [2 ,4 ]
Zhao, Lijun [5 ]
Li, Jun [2 ,4 ]
Zhu, Lei [6 ]
Liu, Huaping [7 ,8 ]
Affiliations
[1] Beihang Univ, Sch Transportat Sci & Engn, Beijing 100191, Peoples R China
[2] Tsinghua Univ, State Key Lab Automot Safety & Energy, Beijing 100084, Peoples R China
[3] State Key Lab Automot Safety & Energy, Beijing 100084, Peoples R China
[4] Tsinghua Univ, Sch Vehicle & Mobil, Beijing 100084, Peoples R China
[5] Harbin Inst Technol HIT, State Key Lab Robot & Syst, Harbin 150001, Peoples R China
[6] Mogo Auto Intelligence & Telemet Informat Technol, Beijing 100011, Peoples R China
[7] State Key Lab Intelligent Technol & Syst, Beijing 100084, Peoples R China
[8] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program);
Keywords
3D object detection; autonomous driving; feature fusion; multimodal; self-attention;
DOI
Not available
CLC (Chinese Library Classification)
TM [Electrical Engineering]; TN [Electronics & Communication Technology];
Subject Classification Codes
0808; 0809;
Abstract
3D object detection is becoming indispensable for environmental perception in autonomous driving. Light detection and ranging (LiDAR) point clouds often fail to distinguish objects with similar structures and are quite sparse for distant or small objects, introducing false and missed detections. To address these issues, LiDAR is often fused with cameras, which provide rich textural information. However, current fusion methods suffer from inefficient data representation and inaccurate alignment of heterogeneous features, leading to poor precision and low efficiency. To this end, we propose a plug-and-play module, termed range-image fusion (RI-Fusion), that achieves an effective fusion of LiDAR and camera data and can be readily integrated into existing mainstream LiDAR-based algorithms. Specifically, we align image and point cloud data by converting the point cloud into a compact range-view representation through a spherical coordinate transformation. The range image is then fused with the corresponding camera image using an attention mechanism. The original range image is concatenated with the fusion features to retain point cloud information, and the result is projected back onto the spatial point cloud. Finally, the feature-enhanced point cloud can be fed into a LiDAR-based 3D object detector. Validation experiments on the KITTI 3D object detection benchmark showed that the proposed fusion method significantly enhanced multiple mainstream LiDAR-based 3D object detectors, PointPillars, SECOND, and Part-A2, improving the 3D mAP (mean average precision) by 3.61%, 2.98%, and 1.27%, respectively, particularly for small objects such as pedestrians and cyclists.
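The spherical coordinate transformation mentioned in the abstract (point cloud to compact range-view representation) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the image resolution (64 x 1024) and the vertical field-of-view bounds (+3 deg to -25 deg, roughly matching the Velodyne sensor used for KITTI) are assumed values.

```python
import numpy as np

def points_to_range_image(points, h=64, w=1024,
                          fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) LiDAR point cloud into an (h, w) range image
    via spherical coordinates. FOV and resolution are illustrative
    assumptions, not values taken from the paper."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points[:, :3], axis=1)       # range per point

    yaw = np.arctan2(y, x)                          # azimuth in [-pi, pi]
    pitch = np.arcsin(z / np.clip(r, 1e-8, None))   # elevation angle

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)
    fov = fov_up - fov_down

    # Normalize angles to pixel coordinates.
    u = 0.5 * (1.0 - yaw / np.pi) * w               # column from azimuth
    v = (1.0 - (pitch - fov_down) / fov) * h        # row from elevation

    u = np.clip(np.floor(u), 0, w - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, h - 1).astype(np.int32)

    # Keep the nearest return per pixel: write far points first,
    # so nearer points overwrite them.
    order = np.argsort(-r)
    img = np.zeros((h, w), dtype=np.float32)
    img[v[order], u[order]] = r[order]
    return img
```

In practice the range image would carry extra channels (e.g. x, y, z, intensity) alongside range, so that features fused in range view can be projected back onto the original points.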
Pages: 13