PIFNet: 3D Object Detection Using Joint Image and Point Cloud Features for Autonomous Driving

Times Cited: 10
Authors
Zheng, Wenqi [1 ]
Xie, Han [1 ]
Chen, Yunfan [2 ]
Roh, Jeongjin [1 ]
Shin, Hyunchul [1 ]
Affiliations
[1] Hanyang Univ, Dept Elect Engn, Ansan 15588, South Korea
[2] Hubei Univ Technol, Sch Elect & Elect Engn, Wuhan 430068, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2022, Vol. 12, Issue 7
Keywords
3D object detection; LiDAR point cloud; camera images; object detection
DOI
10.3390/app12073686
Chinese Library Classification
O6 [Chemistry]
Subject Classification Code
0703
Abstract
Owing to its wide range of applications, 3D object detection has attracted increasing attention in computer vision. Most existing 3D object detection methods rely on LiDAR point cloud data alone. However, because Light Detection and Ranging (LiDAR) point clouds are irregular and sparse, these methods suffer from limited localization consistency and classification confidence. Inspired by the complementary characteristics of LiDAR and camera sensors, we propose a new end-to-end learnable framework, the Point-Image Fusion Network (PIFNet), to integrate the LiDAR point cloud with camera images. To resolve the inconsistency between localization and classification, we designed an Encoder-Decoder Fusion (EDF) module that extracts image features effectively while preserving fine-grained object localization information. Furthermore, a new fusion module is proposed to integrate the color and texture features from images with the depth information from the point cloud. This module alleviates the irregularity and sparsity of the point cloud features by capitalizing on the fine-grained information in camera images. In PIFNet, each intermediate image feature map is fed into the fusion module and integrated with its corresponding point-wise features. Point-wise features are used instead of voxel-wise features to reduce information loss. Extensive experiments on the KITTI dataset demonstrate the superiority of PIFNet over other state-of-the-art methods: compared with several state-of-the-art methods, our approach improves mean Average Precision (mAP) by 1.97% and Average Precision (AP) on the hard cases by 2.86% on the KITTI 3D object detection benchmark.
Pages: 11
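To make the fusion idea described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation, of fusing point-wise LiDAR features with an intermediate camera feature map: points are assumed to be projected onto the image plane, image features are bilinearly sampled at those locations, and the sampled features are combined with the point features by a small MLP. The module name, layer sizes, and normalized-coordinate convention are illustrative assumptions.

```python
# Minimal sketch of point-image feature fusion (assumed design, PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F


class PointImageFusion(nn.Module):
    def __init__(self, point_dim: int, image_dim: int, out_dim: int):
        super().__init__()
        # Fuse concatenated point and sampled image features with a small MLP.
        self.mlp = nn.Sequential(
            nn.Linear(point_dim + image_dim, out_dim),
            nn.ReLU(inplace=True),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, point_feats, image_feats, uv_norm):
        # point_feats: (B, N, Cp)   per-point features
        # image_feats: (B, Ci, H, W) one intermediate image feature map
        # uv_norm:     (B, N, 2)    projected point locations, normalized to [-1, 1]
        grid = uv_norm.unsqueeze(2)                      # (B, N, 1, 2)
        sampled = F.grid_sample(image_feats, grid,
                                mode="bilinear", align_corners=False)
        sampled = sampled.squeeze(-1).permute(0, 2, 1)   # (B, N, Ci)
        fused = torch.cat([point_feats, sampled], dim=-1)
        return self.mlp(fused)                           # (B, N, out_dim)


if __name__ == "__main__":
    # Toy example with random tensors in place of real LiDAR/image features.
    B, N = 2, 1024
    fusion = PointImageFusion(point_dim=64, image_dim=128, out_dim=128)
    pts = torch.randn(B, N, 64)
    img = torch.randn(B, 128, 48, 156)
    uv = torch.rand(B, N, 2) * 2 - 1                     # assume valid projections
    print(fusion(pts, img, uv).shape)                    # torch.Size([2, 1024, 128])
```

In the paper's framework, this kind of fusion is applied to each intermediate image feature map with its corresponding point-wise features; the calibration-based projection from LiDAR coordinates to image coordinates is assumed here and not shown.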