PPF-Det: Point-Pixel Fusion for Multi-Modal 3D Object Detection

被引:5
|
作者
Xie, Guotao [1 ,2 ]
Chen, Zhiyuan [1 ]
Gao, Ming [1 ,2 ]
Hu, Manjiang [1 ,2 ]
Qin, Xiaohui [1 ,2 ]
机构
[1] Hunan Univ, Coll Mech & Vehicle Engn, State Key Lab Adv Design & Mfg Technol Vehicle, Changsha 410082, Peoples R China
[2] Hunan Univ, Wuxi Intelligent Control Res Inst, Wuxi 214115, Jiangsu, Peoples R China
关键词
Autonomous driving; 3D object detection; camera-LiDAR fusion; intelligent transportation systems;
D O I
10.1109/TITS.2023.3347078
中图分类号
TU [建筑科学];
学科分类号
0813 ;
摘要
Multi-modal fusion can take advantage of the LiDAR and camera to boost the robustness and performance of 3D object detection. However, there are still of great challenges to comprehensively exploit image information and perform accurate diverse feature interaction fusion. In this paper, we proposed a novel multi-modal framework, namely Point-Pixel Fusion for Multi-Modal 3D Object Detection (PPF-Det). The PPF-Det consists of three submodules, Multi Pixel Perception (MPP), Shared Combined Point Feature Encoder (SCPFE), and Point-Voxel-Wise Triple Attention Fusion (PVW-TAF) to address the above problems. Firstly, MPP can make full use of image semantic information to mitigate the problem of resolution mismatch between point cloud and image. In addition, we proposed SCPFE to preliminary extract point cloud features and point-pixel features simultaneously reducing time-consuming on 3D space. Lastly, we proposed a fine alignment fusion strategy PVW-TAF to generate multi-level voxel-fused features based on attention mechanism. Extensive experiments on KITTI benchmarks, conducted on September 24, 2023, demonstrate that our method shows excellent performance.
引用
收藏
页码:5598 / 5611
页数:14
相关论文
共 50 条
  • [11] MLF3D: Multi-Level Fusion for Multi-Modal 3D Object Detection
    Jiang, Han
    Wang, Jianbin
    Xiao, Jianru
    Zhao, Yanan
    Chen, Wanqing
    Ren, Yilong
    Yu, Haiyang
    2024 35TH IEEE INTELLIGENT VEHICLES SYMPOSIUM, IEEE IV 2024, 2024, : 1588 - 1593
  • [12] AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection
    Chen, Zehui
    Li, Zhenyu
    Zhang, Shiquan
    Fang, Liangji
    Jiang, Qinhong
    Zhao, Feng
    Zhou, Bolei
    Zhao, Hang
    PROCEEDINGS OF THE THIRTY-FIRST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2022, 2022, : 827 - 833
  • [13] Multi-Modal Fusion Based on Depth Adaptive Mechanism for 3D Object Detection
    Liu, Zhanwen
    Cheng, Juanru
    Fan, Jin
    Lin, Shan
    Wang, Yang
    Zhao, Xiangmo
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 707 - 717
  • [14] Height-Adaptive Deformable Multi-Modal Fusion for 3D Object Detection
    Li, Jiahao
    Chen, Lingshan
    Li, Zhen
    IEEE ACCESS, 2025, 13 : 52385 - 52396
  • [15] Frustum FusionNet: Amodal 3D Object Detection with Multi-Modal Feature Fusion
    Zuo, Liangyu
    Li, Yaochen
    Han, Mengtao
    Li, Qiao
    Liu, Yuehu
    2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021, : 2746 - 2751
  • [16] Enhancing 3D object detection through multi-modal fusion for cooperative perception
    Xia, Bin
    Zhou, Jun
    Kong, Fanyu
    You, Yuhe
    Yang, Jiarui
    Lin, Lin
    ALEXANDRIA ENGINEERING JOURNAL, 2024, 104 : 46 - 55
  • [17] Multi-Modal 3D Object Detection by Box Matching
    Liu, Zhe
    Ye, Xiaoqing
    Zou, Zhikang
    He, Xinwei
    Tan, Xiao
    Ding, Errui
    Wang, Jingdong
    Bai, Xiang
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024,
  • [18] Unlocking the power of multi-modal fusion in 3D object tracking
    Hu, Yue
    IET COMPUTER VISION, 2025, 19 (01)
  • [19] VPC-VoxelNet: multi-modal fusion 3D object detection networks based on virtual point clouds
    Zhang, Qiang
    Shi, Qin
    Cheng, Teng
    Zhang, Junning
    Chen, Jiong
    INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2025, 14 (01)
  • [20] MLF-DET: Multi-Level Fusion for Cross-Modal 3D Object Detection
    Lin, Zewei
    Shen, Yanqing
    Zhou, Sanping
    Chen, Shitao
    Zheng, Nanning
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VII, 2023, 14260 : 136 - 149