PLC-Fusion: Perspective-Based Hierarchical and Deep LiDAR Camera Fusion for 3D Object Detection in Autonomous Vehicles

Cited by: 1
Authors
Mushtaq, Husnain [1 ]
Deng, Xiaoheng [1 ]
Azhar, Fizza [2 ]
Ali, Mubashir [3 ]
Sherazi, Hafiz Husnain Raza [4 ]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China
[2] Univ Chenab, Dept Comp Sci, Gujrat 50700, Pakistan
[3] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, England
[4] Newcastle Univ, Sch Comp, Newcastle Upon Tyne NE4 5TG, England
Funding
National Natural Science Foundation of China;
关键词
LiDAR-camera fusion; object perspective sampling; ViT feature fusion; 3D object detection; autonomous vehicles;
DOI
10.3390/info15110739
CLC Classification
TP [Automation and computer technology];
Subject Classification Code
0812 ;
Abstract
Accurate 3D object detection is essential for autonomous driving, yet traditional LiDAR-only models often struggle with sparse point clouds. To address this, we propose PLC-Fusion, a perspective-aware, hierarchical vision-transformer-based LiDAR-camera fusion framework: an efficient, multi-modal 3D object detection pipeline that integrates LiDAR and camera data for improved performance. First, our method enhances LiDAR data by projecting it onto a 2D plane, enabling the extraction of object perspective features from a probability map via the Object Perspective Sampling (OPS) module. OPS incorporates a lightweight perspective detector, consisting of interconnected 2D and monocular 3D sub-networks, to extract image features and generate object perspective proposals by predicting and refining the top-scoring 3D candidates. Second, the framework employs two independent transformers: CamViT for 2D image features and LidViT for 3D point-cloud features. These ViT-based representations are fused by the Cross-Fusion module for hierarchical and deep representation learning, improving both accuracy and computational efficiency. These mechanisms make better use of semantic features within a region of interest (ROI) to obtain more representative point features, leading to more effective fusion of information from the LiDAR and camera sources. PLC-Fusion outperforms existing methods, achieving a mean average precision (mAP) of 83.52% for 3D detection and 90.37% for BEV detection, while maintaining a competitive inference time of 0.18 s. Our model avoids computational bottlenecks by eliminating dense BEV searches and global attention mechanisms while improving detection range and precision.
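The abstract's dual-branch design (CamViT producing image tokens, LidViT producing point-cloud tokens, joined by a Cross-Fusion module) can be illustrated, in highly simplified form, as cross-attention from LiDAR queries to camera keys/values. The function and variable names below are illustrative assumptions for exposition only, not the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_fusion(lidar_tokens, cam_tokens):
    """Sketch of cross-modal fusion via single-head cross-attention.

    lidar_tokens: (N, d) features from the LiDAR branch ("LidViT")
    cam_tokens:   (M, d) features from the camera branch ("CamViT")
    Returns (N, d) fused features: each LiDAR token attends over all
    camera tokens and adds the attended result as a residual.
    """
    d = lidar_tokens.shape[-1]
    attn = softmax(lidar_tokens @ cam_tokens.T / np.sqrt(d))  # (N, M)
    return lidar_tokens + attn @ cam_tokens                   # residual fusion

rng = np.random.default_rng(0)
fused = cross_fusion(rng.normal(size=(8, 32)),   # 8 LiDAR tokens
                     rng.normal(size=(16, 32)))  # 16 camera tokens
print(fused.shape)
```

In the actual PLC-Fusion model this fusion is hierarchical (applied across transformer stages) and restricted to ROI features rather than all tokens, which is what lets the method skip dense BEV searches and global attention.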
Pages: 23