PLC-Fusion: Perspective-Based Hierarchical and Deep LiDAR Camera Fusion for 3D Object Detection in Autonomous Vehicles

Cited by: 1
Authors
Mushtaq, Husnain [1 ]
Deng, Xiaoheng [1 ]
Azhar, Fizza [2 ]
Ali, Mubashir [3 ]
Sherazi, Hafiz Husnain Raza [4 ]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China
[2] Univ Chenab, Dept Comp Sci, Gujrat 50700, Pakistan
[3] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, England
[4] Newcastle Univ, Sch Comp, Newcastle Upon Tyne NE4 5TG, England
Funding
National Natural Science Foundation of China;
Keywords
LiDAR-camera fusion; object perspective sampling; ViT feature fusion; 3D object detection; autonomous vehicles;
DOI
10.3390/info15110739
Chinese Library Classification (CLC)
TP [automation technology, computer technology];
Discipline Classification Code
0812 ;
Abstract
Accurate 3D object detection is essential for autonomous driving, yet traditional LiDAR-only models often struggle with sparse point clouds. To address this, we propose PLC-Fusion, a perspective-aware, hierarchical vision-transformer-based LiDAR-camera fusion framework for 3D object detection. This efficient, multi-modal framework integrates LiDAR and camera data for improved performance. First, our method enhances LiDAR data by projecting it onto a 2D plane, enabling the extraction of object perspective features from a probability map via the Object Perspective Sampling (OPS) module. OPS incorporates a lightweight perspective detector, consisting of interconnected 2D and monocular 3D sub-networks, to extract image features and generate object perspective proposals by predicting and refining the top-scored 3D candidates. Second, PLC-Fusion employs two independent transformers: CamViT for 2D image features and LidViT for 3D point-cloud features. These ViT-based representations are fused via the Cross-Fusion module for hierarchical and deep representation learning, improving both accuracy and computational efficiency. These mechanisms better exploit the semantic features within a region of interest (ROI) to obtain more representative point features, leading to a more effective fusion of information from the LiDAR and camera sources. PLC-Fusion outperforms existing methods, achieving a mean average precision (mAP) of 83.52% for 3D detection and 90.37% for BEV detection, while maintaining a competitive inference time of 0.18 s. Our model avoids computational bottlenecks by eliminating dense BEV searches and global attention mechanisms while improving detection range and precision.
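The core idea of the Cross-Fusion step (point-cloud tokens attending to image tokens) can be illustrated with a minimal single-head cross-attention sketch. This is not the authors' implementation: the paper's module is hierarchical and operates on CamViT/LidViT features, while the function names, dimensions, and the final concatenation below are illustrative assumptions only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_fusion(lidar_tokens, cam_tokens):
    """Toy cross-attention fusion: LiDAR tokens act as queries,
    image tokens as keys/values; each LiDAR token gathers camera
    context, which is concatenated onto the original feature."""
    d_k = lidar_tokens.shape[-1]
    attn = softmax(lidar_tokens @ cam_tokens.T / np.sqrt(d_k))
    attended = attn @ cam_tokens  # camera context per LiDAR token
    return np.concatenate([lidar_tokens, attended], axis=-1)

rng = np.random.default_rng(0)
lid = rng.standard_normal((128, 64))   # 128 LiDAR tokens, 64-dim
cam = rng.standard_normal((256, 64))   # 256 image tokens, 64-dim
fused = cross_fusion(lid, cam)
print(fused.shape)  # (128, 128): original 64 dims + 64 dims of camera context
```

In the full model, learned query/key/value projections, multiple heads, and hierarchical (multi-scale) application would replace this single raw dot-product stage.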
Pages: 23
Related Papers
39 records in total
  • [31] ContextualFusion: Context-Based Multi-Sensor Fusion for 3D Object Detection in Adverse Operating Conditions
    Sural, Shounak
    Sahu, Nishad
    Rajkumar, Ragunathan
    2024 35TH IEEE INTELLIGENT VEHICLES SYMPOSIUM, IEEE IV 2024, 2024, : 1534 - 1541
  • [32] Low-observable targets detection for autonomous vehicles based on dual-modal sensor fusion with deep learning approach
    Geng, Keke
    Zou, Wei
    Yin, Guodong
    Li, Yang
    Zhou, Zihao
    Yang, Fan
    Wu, Yuan
    Shen, Cheng
    PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART D-JOURNAL OF AUTOMOBILE ENGINEERING, 2019, 233 (09) : 2270 - 2283
  • [33] Exploring Diversity-Based Active Learning for 3D Object Detection in Autonomous Driving
    Lin, Jinpeng
    Liang, Zhihao
    Deng, Shengheng
    Cai, Lile
    Jiang, Tao
    Li, Tianrui
    Jia, Kui
    Xu, Xun
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (11) : 15454 - 15466
  • [34] Deep Learning-Based Image 3-D Object Detection for Autonomous Driving: Review
    Alaba, Simegnew Yihunie
    Ball, John E.
    IEEE SENSORS JOURNAL, 2023, 23 (04) : 3378 - 3394
  • [35] Enhancing Grid-Based 3D Object Detection in Autonomous Driving With Improved Dimensionality Reduction
    Huang, Dihe
    Chen, Ying
    Ding, Yikang
    Liu, Yong
    Nie, Qiang
    Wang, Chengjie
    Li, Zhiheng
    IEEE ACCESS, 2023, 11 : 35243 - 35254
  • [36] O2SAT: Object-Oriented-Segmentation-Guided Spatial-Attention Network for 3D Object Detection in Autonomous Vehicles
    Mushtaq, Husnain
    Deng, Xiaoheng
    Ullah, Irshad
    Ali, Mubashir
    Malik, Babur Hayat
    INFORMATION, 2024, 15 (07)
  • [37] Towards Minimizing the LiDAR Sim-to-Real Domain Shift: Object-Level Local Domain Adaptation for 3D Point Clouds of Autonomous Vehicles
    Huch, Sebastian
    Lienkamp, Markus
    SENSORS, 2023, 23 (24)
  • [38] GFA-SMT: Geometric Feature Aggregation and Self-Attention in a Multi-Head Transformer for 3D Object Detection in Autonomous Vehicles
    Mushtaq, Husnain
    Deng, Xiaoheng
    Jiang, Ping
    Wan, Shaohua
    Ali, Mubashir
    Ullah, Irshad
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2025, 26 (03) : 3557 - 3573
  • [39] AOP-Net: All-in-One Perception Network for LiDAR-based Joint 3D Object Detection and Panoptic Segmentation
    Xu, Yixuan
    Fazlali, Hamidreza
    Ren, Yuan
    Liu, Bingbing
    2023 IEEE INTELLIGENT VEHICLES SYMPOSIUM, IV, 2023,