PPF-Det: Point-Pixel Fusion for Multi-Modal 3D Object Detection

Cited by: 5
Authors
Xie, Guotao [1, 2]
Chen, Zhiyuan [1]
Gao, Ming [1, 2]
Hu, Manjiang [1, 2]
Qin, Xiaohui [1, 2]
Affiliations
[1] Hunan Univ, Coll Mech & Vehicle Engn, State Key Lab Adv Design & Mfg Technol Vehicle, Changsha 410082, Peoples R China
[2] Hunan Univ, Wuxi Intelligent Control Res Inst, Wuxi 214115, Jiangsu, Peoples R China
Keywords
Autonomous driving; 3D object detection; camera-LiDAR fusion; intelligent transportation systems;
DOI
10.1109/TITS.2023.3347078
Chinese Library Classification number
TU [Building Science];
Discipline classification code
0813;
Abstract
Multi-modal fusion can take advantage of both LiDAR and camera data to boost the robustness and performance of 3D object detection. However, great challenges remain in comprehensively exploiting image information and in performing accurate interaction and fusion of diverse features. In this paper, we propose a novel multi-modal framework, namely Point-Pixel Fusion for Multi-Modal 3D Object Detection (PPF-Det). PPF-Det consists of three submodules that address the above problems: Multi Pixel Perception (MPP), Shared Combined Point Feature Encoder (SCPFE), and Point-Voxel-Wise Triple Attention Fusion (PVW-TAF). First, MPP makes full use of image semantic information to mitigate the resolution mismatch between the point cloud and the image. In addition, we propose SCPFE to preliminarily extract point cloud features and point-pixel features simultaneously, reducing time consumption in 3D space. Lastly, we propose PVW-TAF, a fine alignment fusion strategy that generates multi-level voxel-fused features based on an attention mechanism. Extensive experiments on the KITTI benchmarks, conducted on September 24, 2023, demonstrate that our method shows excellent performance.
Pages: 5598-5611
Number of pages: 14
Related papers
50 in total
  • [31] PCDR-DFF: multi-modal 3D object detection based on point cloud diversity representation and dual feature fusion
    Xia, Chenxing
    Li, Xubing
    Gao, Xiuju
    Ge, Bin
    Li, Kuan-Ching
    Fang, Xianjin
    Zhang, Yan
    Yang, Ke
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (16): : 9329 - 9346
  • [32] PCDR-DFF: multi-modal 3D object detection based on point cloud diversity representation and dual feature fusion
    Chenxing Xia
    Xubing Li
    Xiuju Gao
    Bin Ge
    Kuan-Ching Li
    Xianjin Fang
    Yan Zhang
    Ke Yang
    Neural Computing and Applications, 2024, 36 : 9329 - 9346
  • [33] Deformable Feature Aggregation for Dynamic Multi-modal 3D Object Detection
    Chen, Zehui
    Li, Zhenyu
    Zhang, Shiquan
    Fang, Liangji
    Jiang, Qinhong
    Zhao, Feng
    COMPUTER VISION, ECCV 2022, PT VIII, 2022, 13668 : 628 - 644
  • [34] Improving Deep Multi-modal 3D Object Detection for Autonomous Driving
    Khamsehashari, Razieh
    Schill, Kerstin
    2021 7TH INTERNATIONAL CONFERENCE ON AUTOMATION, ROBOTICS AND APPLICATIONS (ICARA 2021), 2021, : 263 - 267
  • [35] Multi-Modal 3D Object Detection in Autonomous Driving: A Survey and Taxonomy
    Wang, Li
    Zhang, Xinyu
    Song, Ziying
    Bi, Jiangfeng
    Zhang, Guoxin
    Wei, Haiyue
    Tang, Liyao
    Yang, Lei
    Li, Jun
    Jia, Caiyan
    Zhao, Lijun
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2023, 8 (07): : 3781 - 3798
  • [36] SimDistill: Simulated Multi-Modal Distillation for BEV 3D Object Detection
    Zhao, Haimei
    Zhang, Qiming
    Zhao, Shanshan
    Chen, Zhe
    Zhang, Jing
    Tao, Dacheng
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 7, 2024, : 7460 - 7468
  • [37] EPNet++: Cascade Bi-Directional Fusion for Multi-Modal 3D Object Detection
    Liu, Zhe
    Huang, Tengteng
    Li, Bingling
    Chen, Xiwu
    Wang, Xi
    Bai, Xiang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (07) : 8324 - 8341
  • [38] Bridging the View Disparity Between Radar and Camera Features for Multi-Modal Fusion 3D Object Detection
    Zhou, Taohua
    Chen, Junjie
    Shi, Yining
    Jiang, Kun
    Yang, Mengmeng
    Yang, Diange
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2023, 8 (02): : 1523 - 1535
  • [39] MultiCorrupt: A Multi-Modal Robustness Dataset and Benchmark of LiDAR-Camera Fusion for 3D Object Detection
    Beemelmanns, Till
    Zhang, Quan
    Geller, Christian
    Eckstein, Lutz
    2024 35TH IEEE INTELLIGENT VEHICLES SYMPOSIUM, IEEE IV 2024, 2024, : 3255 - 3261
  • [40] PPF-Net: Efficient Multimodal 3D Object Detection with Pillar-Point Fusion
    Zhang, Lingxiao
    Li, Changyong
    ELECTRONICS, 2025, 14 (04):