PLC-Fusion: Perspective-Based Hierarchical and Deep LiDAR Camera Fusion for 3D Object Detection in Autonomous Vehicles

Cited by: 1

Authors:

Mushtaq, Husnain [1]

Deng, Xiaoheng [1]

Azhar, Fizza [2]

Ali, Mubashir [3]

Sherazi, Hafiz Husnain Raza [4]

Affiliations:

[1] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China

[2] Univ Chenab, Dept Comp Sci, Gujrat 50700, Pakistan

[3] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, England

[4] Newcastle Univ, Sch Comp, Newcastle Upon Tyne NE4 5TG, England

Source:

INFORMATION | 2024 / Vol. 15 / Issue 11

Funding:

National Natural Science Foundation of China;

Keywords:

LiDAR-camera fusion; object perspective sampling; ViT feature fusion; 3D object detection; autonomous vehicles;

DOI:

10.3390/info15110739

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Accurate 3D object detection is essential for autonomous driving, yet traditional LiDAR models often struggle with sparse point clouds. To address this, we propose perspective-aware hierarchical vision transformer-based LiDAR-camera fusion (PLC-Fusion), an efficient multi-modal 3D object detection framework that integrates LiDAR and camera data for improved performance. First, our method enhances LiDAR data by projecting them onto a 2D plane, enabling the extraction of object perspective features from a probability map via the Object Perspective Sampling (OPS) module. It incorporates a lightweight perspective detector, consisting of interconnected 2D and monocular 3D sub-networks, to extract image features and generate object perspective proposals by predicting and refining top-scored 3D candidates. Second, it leverages two independent transformers: CamViT for 2D image features and LidViT for 3D point cloud features. These ViT-based representations are fused via the Cross-Fusion module for hierarchical and deep representation learning, improving performance and computational efficiency. These mechanisms enhance the utilization of semantic features in a region of interest (ROI) to obtain more representative point features, leading to a more effective fusion of information from both LiDAR and camera sources. PLC-Fusion outperforms existing methods, achieving a mean average precision (mAP) of 83.52% and 90.37% for 3D and BEV detection, respectively. Moreover, PLC-Fusion maintains a competitive inference time of 0.18 s. Our model addresses computational bottlenecks by eliminating the need for dense BEV searches and global attention mechanisms while improving detection range and precision.
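
The abstract says LiDAR data are projected onto a 2D plane but gives no details of how. As a rough illustration only, the sketch below shows a generic pinhole projection of 3D points into image coordinates. It is not the paper's OPS module. The function name `project_points`, the 3x4 matrix `P_EXAMPLE`, and the image size are hypothetical, and the points are assumed to be already expressed in the camera frame.

```python
from typing import List, Optional, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float, float]  # (u, v, depth)


def project_points(
    points: List[Point3D],
    P: List[List[float]],
    width: int,
    height: int,
) -> List[Optional[Pixel]]:
    """Project 3D points (assumed to be in the camera frame) with a 3x4 matrix P.

    Returns one entry per input point: (u, v, depth) if the point is in
    front of the camera and lands inside the image, otherwise None.
    """
    out: List[Optional[Pixel]] = []
    for x, y, z in points:
        # Homogeneous projection: [u*w, v*w, w]^T = P @ [x, y, z, 1]^T
        uw = P[0][0] * x + P[0][1] * y + P[0][2] * z + P[0][3]
        vw = P[1][0] * x + P[1][1] * y + P[1][2] * z + P[1][3]
        w = P[2][0] * x + P[2][1] * y + P[2][2] * z + P[2][3]
        if w <= 0.0:  # point is behind the camera
            out.append(None)
            continue
        u, v = uw / w, vw / w
        if 0.0 <= u < width and 0.0 <= v < height:
            out.append((u, v, w))
        else:
            out.append(None)  # falls outside the image bounds
    return out


# Hypothetical intrinsics: focal length 100 px, principal point (50, 40).
P_EXAMPLE = [
    [100.0, 0.0, 50.0, 0.0],
    [0.0, 100.0, 40.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

pixels = project_points(
    [(0.0, 0.0, 10.0), (1.0, 0.5, 10.0), (0.0, 0.0, -5.0), (100.0, 0.0, 10.0)],
    P_EXAMPLE,
    width=100,
    height=80,
)
```

In a real pipeline the LiDAR points would first be transformed into the camera frame using the sensor calibration. That step is omitted here because the record does not describe it.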


Pages: 23

