PPF-Det: Point-Pixel Fusion for Multi-Modal 3D Object Detection

Cited by: 5
Authors
Xie, Guotao [1, 2]
Chen, Zhiyuan [1]
Gao, Ming [1, 2]
Hu, Manjiang [1, 2]
Qin, Xiaohui [1, 2]
Affiliations
[1] Hunan Univ, Coll Mech & Vehicle Engn, State Key Lab Adv Design & Mfg Technol Vehicle, Changsha 410082, Peoples R China
[2] Hunan Univ, Wuxi Intelligent Control Res Inst, Wuxi 214115, Jiangsu, Peoples R China
Keywords
Autonomous driving; 3D object detection; camera-LiDAR fusion; intelligent transportation systems
DOI
10.1109/TITS.2023.3347078
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
Multi-modal fusion can exploit both LiDAR and camera data to improve the robustness and performance of 3D object detection. However, comprehensively exploiting image information and performing accurate interaction and fusion of diverse features remain major challenges. In this paper, we propose a novel multi-modal framework, Point-Pixel Fusion for Multi-Modal 3D Object Detection (PPF-Det). PPF-Det consists of three submodules that address these problems: Multi Pixel Perception (MPP), Shared Combined Point Feature Encoder (SCPFE), and Point-Voxel-Wise Triple Attention Fusion (PVW-TAF). First, MPP makes full use of image semantic information to mitigate the resolution mismatch between the point cloud and the image. Second, SCPFE performs preliminary extraction of point cloud features and point-pixel features simultaneously, reducing the time cost in 3D space. Finally, PVW-TAF, a fine alignment fusion strategy, generates multi-level voxel-fused features based on an attention mechanism. Extensive experiments on KITTI benchmarks, conducted on September 24, 2023, demonstrate that our method shows excellent performance.
Pages: 5598 - 5611
Page count: 14
Related papers
50 in total
  • [41] Occlusion-guided multi-modal fusion for vehicle-infrastructure cooperative 3D object detection
    Chu, Huazhen
    Liu, Haizhuang
    Zhuo, Junbao
    Chen, Jiansheng
    Ma, Huimin
    PATTERN RECOGNITION, 2025, 157
  • [42] Multi-Modal and Multi-Scale Fusion 3D Object Detection of 4D Radar and LiDAR for Autonomous Driving
    Wang, Li
    Zhang, Xinyu
    Li, Jun
    Xv, Baowei
    Fu, Rong
    Chen, Haifeng
    Yang, Lei
    Jin, Dafeng
    Zhao, Lijun
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (05) : 5628 - 5641
  • [43] Multi-modal 3D object detection by 2D-guided precision anchor proposal and multi-layer fusion
    Wu, Yi
    Jiang, Xiaoyan
    Fang, Zhijun
    Gao, Yongbin
    Fujita, Hamido
    APPLIED SOFT COMPUTING, 2021, 108
  • [44] ActiveAnno3D-An Active Learning Framework for Multi-Modal 3D Object Detection
    Ghita, Ahmed
    Antoniussen, Bjork
    Zimmer, Walter
    Greer, Ross
    Cress, Christian
    Mogelmose, Andreas
    Trivedi, Mohan M.
    Knoll, Alois C.
    2024 35TH IEEE INTELLIGENT VEHICLES SYMPOSIUM, IEEE IV 2024, 2024: 1699 - 1706
  • [45] Artifacts Mapping: Multi-Modal Semantic Mapping for Object Detection and 3D Localization
    Rollo, Federico
    Raiola, Gennaro
    Zunino, Andrea
    Tsagarakis, Nikolaos
    Ajoudani, Arash
    2023 EUROPEAN CONFERENCE ON MOBILE ROBOTS, ECMR, 2023: 90 - 97
  • [46] RoboFusion: Towards Robust Multi-Modal 3D Object Detection via SAM
    Song, Ziying
    Zhang, Guoxing
    Liu, Lin
    Yang, Lei
    Xu, Shaoqing
    Jia, Caiyan
    Jia, Feiyang
    Wang, Li
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024: 1272 - 1280
  • [47] Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection
    Huang, Linyan
    Li, Zhiqi
    Sima, Chonghao
    Wang, Wenhai
    Wang, Jingdong
    Qiao, Yu
    Li, Hongyang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [48] TransFusion: Multi-Modal Robust Fusion for 3D Object Detection in Foggy Weather Based on Spatial Vision Transformer
    Zhang, Cheng
    Wang, Hai
    Cai, Yingfeng
    Chen, Long
    Li, Yicheng
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (09) : 10652 - 10666
  • [49] APPFNet: Adaptive point-pixel fusion network for 3D semantic segmentation with neighbor feature aggregation
    Wu, Zhaolong
    Zhang, Yong
    Lan, Rukai
    Qiu, Shaohua
    Ran, Shaolin
    Liu, Yifan
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 251
  • [50] MMDistill: Multi-Modal BEV Distillation Framework for Multi-View 3D Object Detection
    Jiao, Tianzhe
    Chen, Yuming
    Zhang, Zhe
    Guo, Chaopeng
    Song, Jie
    CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 81 (03) : 4307 - 4325