PLC-Fusion: Perspective-Based Hierarchical and Deep LiDAR Camera Fusion for 3D Object Detection in Autonomous Vehicles

Cited by: 1
Authors
Mushtaq, Husnain [1]
Deng, Xiaoheng [1]
Azhar, Fizza [2]
Ali, Mubashir [3]
Sherazi, Hafiz Husnain Raza [4]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China
[2] Univ Chenab, Dept Comp Sci, Gujrat 50700, Pakistan
[3] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, England
[4] Newcastle Univ, Sch Comp, Newcastle Upon Tyne NE4 5TG, England
Funding
National Natural Science Foundation of China
Keywords
LiDAR-camera fusion; object perspective sampling; ViT feature fusion; 3D object detection; autonomous vehicles
DOI
10.3390/info15110739
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Accurate 3D object detection is essential for autonomous driving, yet traditional LiDAR models often struggle with sparse point clouds. To address this, we propose PLC-Fusion, a perspective-aware, hierarchical, vision transformer-based LiDAR-camera fusion framework for 3D object detection. This efficient multi-modal framework integrates LiDAR and camera data for improved performance. First, our method enhances LiDAR data by projecting the points onto a 2D plane, which enables the Object Perspective Sampling (OPS) module to extract object perspective features from a probability map. It incorporates a lightweight perspective detector, consisting of interconnected 2D and monocular 3D sub-networks, to extract image features and generate object perspective proposals by predicting and refining the top-scored 3D candidates. Second, it leverages two independent transformers: CamViT for 2D image features and LidViT for 3D point cloud features. These ViT-based representations are fused by the Cross-Fusion module for hierarchical and deep representation learning, improving both performance and computational efficiency. Together, these mechanisms make better use of the semantic features in a region of interest (ROI) to obtain more representative point features, leading to a more effective fusion of information from the LiDAR and camera sources. PLC-Fusion outperforms existing methods, achieving a mean average precision (mAP) of 83.52% for 3D detection and 90.37% for BEV detection, while maintaining a competitive inference time of 0.18 s. The model addresses computational bottlenecks by eliminating the need for dense BEV searches and global attention mechanisms while improving detection range and precision.
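The abstract does not give implementation details, so the following is only a minimal NumPy sketch of two generic building blocks it mentions: projecting LiDAR points onto a 2D image plane, and fusing two token sets with cross-attention. It is not the authors' OPS or Cross-Fusion module. The function names, the pinhole camera model, the 4x4 extrinsic matrix, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import numpy as np


def project_lidar_to_image(points_xyz, K, T_cam_from_lidar, image_size):
    """Project (N, 3) LiDAR points to pixels with a pinhole model (assumed).

    K is a 3x3 intrinsic matrix, T_cam_from_lidar a 4x4 extrinsic matrix,
    image_size is (width, height). Returns pixel coordinates, depths, and a
    boolean mask of points that are in front of the camera and inside the image.
    """
    n = points_xyz.shape[0]
    homo = np.hstack([points_xyz, np.ones((n, 1))])      # (N, 4) homogeneous
    cam = (T_cam_from_lidar @ homo.T).T[:, :3]           # (N, 3) camera frame
    depth = cam[:, 2]
    in_front = depth > 1e-6
    uvw = (K @ cam.T).T                                  # (N, 3)
    # Avoid dividing by zero or negative depth; those points are masked out.
    safe_depth = np.where(in_front, depth, 1.0)
    uv = uvw[:, :2] / safe_depth[:, None]
    width, height = image_size
    inside = (
        (uv[:, 0] >= 0) & (uv[:, 0] < width)
        & (uv[:, 1] >= 0) & (uv[:, 1] < height)
    )
    return uv, depth, in_front & inside


def cross_attention(query_tokens, context_tokens):
    """Scaled dot-product cross-attention without learned projections.

    query_tokens: (Nq, D), e.g. LiDAR tokens; context_tokens: (Nc, D),
    e.g. camera tokens. Returns fused (Nq, D) tokens and (Nq, Nc) weights.
    """
    d = query_tokens.shape[-1]
    scores = query_tokens @ context_tokens.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ context_tokens, weights


if __name__ == "__main__":
    K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
    T = np.eye(4)  # hypothetical: LiDAR and camera frames coincide
    pts = np.array([[0.0, 0.0, 10.0], [0.0, 0.0, -5.0], [10.0, 0.0, 1.0]])
    uv, depth, mask = project_lidar_to_image(pts, K, T, (100, 100))
    print(uv[mask], depth[mask])

    rng = np.random.default_rng(0)
    fused, w = cross_attention(rng.normal(size=(4, 8)), rng.normal(size=(6, 8)))
    print(fused.shape, w.sum(axis=-1))
```

In a real detector the query, key, and value inputs would pass through learned linear layers and the attention would typically be restricted to a region of interest; the sketch leaves both out to keep the mechanics visible.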
Pages: 23