PLC-Fusion: Perspective-Based Hierarchical and Deep LiDAR Camera Fusion for 3D Object Detection in Autonomous Vehicles

Cited by: 1
Authors
Mushtaq, Husnain [1]
Deng, Xiaoheng [1]
Azhar, Fizza [2]
Ali, Mubashir [3]
Sherazi, Hafiz Husnain Raza [4]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China
[2] Univ Chenab, Dept Comp Sci, Gujrat 50700, Pakistan
[3] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, England
[4] Newcastle Univ, Sch Comp, Newcastle Upon Tyne NE4 5TG, England
Funding
National Natural Science Foundation of China
Keywords
LiDAR-camera fusion; object perspective sampling; ViT feature fusion; 3D object detection; autonomous vehicles;
DOI
10.3390/info15110739
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Accurate 3D object detection is essential for autonomous driving, yet traditional LiDAR models often struggle with sparse point clouds. To address this, we propose PLC-Fusion, a perspective-aware, hierarchical, vision transformer (ViT)-based LiDAR-camera fusion framework for efficient multi-modal 3D object detection. First, the method enhances LiDAR data by projecting the points onto a 2D plane, which lets the Object Perspective Sampling (OPS) module extract object perspective features from a probability map. OPS incorporates a lightweight perspective detector, consisting of interconnected 2D and monocular 3D sub-networks, that extracts image features and generates object perspective proposals by predicting and refining the top-scored 3D candidates. Second, the method uses two independent transformers: CamViT for 2D image features and LidViT for 3D point cloud features. These ViT-based representations are fused by the Cross-Fusion module for hierarchical and deep representation learning, improving both performance and computational efficiency. Together, these mechanisms make better use of the semantic features within a region of interest (ROI) to obtain more representative point features, leading to a more effective fusion of LiDAR and camera information. PLC-Fusion outperforms existing methods, achieving a mean average precision (mAP) of 83.52% for 3D detection and 90.37% for BEV detection, while maintaining a competitive inference time of 0.18 s. The model addresses computational bottlenecks by eliminating the need for dense BEV searches and global attention mechanisms while improving detection range and precision.
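The first step in the abstract, projecting LiDAR points onto a 2D plane and keeping those that fall inside an object region, can be illustrated with a minimal sketch. This is not the paper's OPS module: it assumes a plain pinhole camera model, and the names `project_points`, `sample_points_in_box`, the 3x3 `intrinsic` matrix, the 4x4 `extrinsic` matrix, and the box format `(u_min, v_min, u_max, v_max)` are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float, float]  # (u, v, depth in the camera frame)
Matrix = Sequence[Sequence[float]]


def project_points(
    points: Sequence[Point3D], extrinsic: Matrix, intrinsic: Matrix
) -> List[Optional[Pixel]]:
    """Project LiDAR points to pixels with an assumed pinhole model.

    extrinsic: hypothetical 4x4 LiDAR-to-camera rigid transform.
    intrinsic: hypothetical 3x3 camera matrix.
    Points at or behind the camera plane map to None.
    """
    pixels: List[Optional[Pixel]] = []
    for x, y, z in points:
        # Rigid transform into the camera frame: p_cam = R p + t.
        cam = [r[0] * x + r[1] * y + r[2] * z + r[3] for r in extrinsic[:3]]
        depth = cam[2]
        if depth <= 0.0:
            pixels.append(None)
            continue
        # Apply the intrinsics, then divide by depth (perspective division).
        u = sum(k * c for k, c in zip(intrinsic[0], cam)) / depth
        v = sum(k * c for k, c in zip(intrinsic[1], cam)) / depth
        pixels.append((u, v, depth))
    return pixels


def sample_points_in_box(
    points: Sequence[Point3D],
    pixels: Sequence[Optional[Pixel]],
    box: Tuple[float, float, float, float],
) -> List[Point3D]:
    """Keep the 3D points whose projection lies inside a 2D box."""
    u_min, v_min, u_max, v_max = box
    kept: List[Point3D] = []
    for point, pixel in zip(points, pixels):
        if pixel is None:
            continue
        u, v, _ = pixel
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append(point)
    return kept
```

In a real pipeline the box would come from a 2D detector and the calibration matrices from the dataset; here both are left to the caller, and the sketch only shows the geometry of the projection-then-select step.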
Pages: 23