Dense frustum-aware fusion for 3D object detection in perception systems

Times cited: 5
Authors
Deng, Yuanzhi [1, 2]
Shen, Jianhao [2]
Wen, Huajie [1, 2]
Chi, Cheng [2]
Zhou, Yang [1, 2]
Xu, Gang [1, 2]
Affiliations
[1] Shenzhen Univ, Coll Appl Technol, Shenzhen 518060, Peoples R China
[2] Shenzhen Technol Univ, Shenzhen Key Lab Urban Rail Transit, Shenzhen 518118, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
3D object detection; Autonomous driving; Frustum; Point clouds; Densification;
DOI
10.1016/j.eswa.2023.122061
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Autonomous driving perception relies on standard sensors such as color cameras and LiDAR, but each sensor has limitations when sensing complex and diverse environments. As a result, sensor fusion techniques have attracted attention for their stable perception performance. Previous studies have focused on fusing camera images and LiDAR point clouds for 3D object detection. However, while the camera provides rich texture information at high resolution, the features of objects in sparse point clouds remain underutilized, suggesting a gap in the literature. This work proposes a flexible fusion network for 3D object detection. It includes a frustum-aware decorator (FAD) that densifies point clouds and applies a textured surface to them. A voxel-wise encoder is then applied, and point cloud features are extracted and aligned in a bird's-eye view before being fused with camera images. The fused features are passed sequentially through a region proposal network and a detection head for 3D object detection. The proposed network achieves leading mean average precision (mAP) of 71.49 and 72.09 in a multi-model comparison on the KITTI and nuScenes 3D object detection benchmarks, respectively. In addition, the FAD can be flexibly combined with other state-of-the-art methods. A series of comparison experiments demonstrates that integrating the FAD improves mAP by at least 2.0 for both LiDAR-only and fusion-based 3D object detectors. The source code and tool are available at: https://github.com/denyz/FADN.git.
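The idea of attaching camera texture to LiDAR points inside the camera frustum can be illustrated with a minimal sketch. This is not the authors' FAD implementation and it omits densification entirely; it only shows a generic pinhole projection that keeps the points falling inside the image and appends the pixel color to each of them. The function names (project_point, decorate_points), the intrinsics fx, fy, cx, cy, and the assumption that points are already expressed in the camera frame with z pointing forward are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]   # (x, y, z) in the camera frame, z forward (assumption)
Pixel = Tuple[int, int, int]         # (r, g, b)
Decorated = Tuple[float, float, float, int, int, int]


def project_point(p: Point, fx: float, fy: float,
                  cx: float, cy: float) -> Optional[Tuple[int, int]]:
    """Pinhole projection of a camera-frame point to integer pixel coordinates."""
    x, y, z = p
    if z <= 0.0:
        # Points at or behind the image plane are outside the viewing frustum.
        return None
    u = int(round(fx * x / z + cx))
    v = int(round(fy * y / z + cy))
    return u, v


def decorate_points(points: List[Point], image: List[List[Pixel]],
                    fx: float, fy: float, cx: float, cy: float) -> List[Decorated]:
    """Keep points that project inside the image and append the pixel's RGB."""
    height = len(image)
    width = len(image[0]) if height else 0
    decorated: List[Decorated] = []
    for p in points:
        uv = project_point(p, fx, fy, cx, cy)
        if uv is None:
            continue
        u, v = uv
        if 0 <= u < width and 0 <= v < height:
            r, g, b = image[v][u]  # image is indexed as [row][column]
            decorated.append((p[0], p[1], p[2], r, g, b))
    return decorated
```

In a real pipeline the LiDAR points would first be transformed into the camera frame with the dataset's extrinsic calibration, and the decorated points would then be passed to a voxel encoder; both steps are outside the scope of this sketch.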
Pages: 11