PPF-Net: Efficient Multimodal 3D Object Detection with Pillar-Point Fusion

Authors
Zhang, Lingxiao [1]
Li, Changyong [1]
Affiliation
[1] Xinjiang Univ, Coll Mech Engn, Urumqi 830017, Peoples R China
Source
ELECTRONICS | 2025, Vol. 14, No. 04
Keywords
3D object detection; cross-modal data augmentation; sensor fusion; joint regression loss function
DOI
10.3390/electronics14040685
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Detecting objects in 3D space using LiDAR is crucial for robotics and autonomous vehicles, but the sparsity of LiDAR-generated point clouds limits performance. Camera images, rich in semantic information, can effectively compensate for this limitation. We propose a simple yet effective multimodal fusion framework to enhance 3D object detection without complex network designs. We introduce a cross-modal GT-Paste data augmentation method to address challenges such as 2D object acquisition and occlusions from added objects. To better integrate image features with sparse point clouds, we propose Pillar-Point Fusion (PPF), which projects non-empty pillars onto image feature maps and uses an attention mechanism to map semantic features from pillars to their constituent points, fusing them with the points' geometric features. Additionally, we design the BD-IoU loss function, which measures 3D bounding box similarity, and a joint regression loss combining BD-IoU and Smooth L1, effectively guiding model training. Our framework achieves consistent improvements across KITTI benchmarks. On the validation set, PPF (PV-RCNN) achieves at least 1.84% AP improvement in Cyclist detection performance across all difficulty levels compared to other multimodal SOTA methods. On the test set, PPF-Net excels in pedestrian detection for moderate and hard difficulty levels and achieves the best results in low-beam LiDAR scenarios.
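The joint regression loss mentioned in the abstract combines an IoU-style box similarity term with Smooth L1. The abstract does not define BD-IoU, so the sketch below is only an illustration of the general pattern: it substitutes a plain axis-aligned 3D IoU for BD-IoU, and the function names, the (cx, cy, cz, l, w, h) box layout, the `beta` threshold, and the `iou_weight` factor are all assumptions rather than the authors' formulation.

```python
def smooth_l1(pred, target, beta=1.0):
    """Mean Smooth L1 distance over the box parameters."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear beyond the beta threshold.
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def axis_aligned_iou_3d(a, b):
    """IoU of two axis-aligned boxes given as (cx, cy, cz, l, w, h).

    Assumption: this is a stand-in for BD-IoU, which the abstract
    names but does not define. Box rotation is ignored here.
    """
    inter = 1.0
    for i in range(3):
        lo = max(a[i] - a[i + 3] / 2, b[i] - b[i + 3] / 2)
        hi = min(a[i] + a[i + 3] / 2, b[i] + b[i + 3] / 2)
        inter *= max(0.0, hi - lo)  # zero overlap on any axis gives zero volume
    vol_a = a[3] * a[4] * a[5]
    vol_b = b[3] * b[4] * b[5]
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0


def joint_regression_loss(pred, target, iou_weight=1.0):
    """Smooth L1 plus a weighted (1 - IoU) term (hypothetical weighting)."""
    iou_term = 1.0 - axis_aligned_iou_3d(pred, target)
    return smooth_l1(pred, target) + iou_weight * iou_term


if __name__ == "__main__":
    gt = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
    shifted = (1.0, 0.0, 0.0, 2.0, 2.0, 2.0)
    print(joint_regression_loss(gt, gt))
    print(joint_regression_loss(shifted, gt))
```

In this sketch the IoU term penalizes poor overall overlap even when individual parameter errors are small, while Smooth L1 still provides a gradient signal when the boxes do not overlap at all, which is one common motivation for combining the two.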
Pages: 21