PPF-Det: Point-Pixel Fusion for Multi-Modal 3D Object Detection

Cited by: 5

Authors:

Xie, Guotao [1,2]

Chen, Zhiyuan [1]

Gao, Ming [1,2]

Hu, Manjiang [1,2]

Qin, Xiaohui [1,2]

Affiliations:

[1] Hunan Univ, Coll Mech & Vehicle Engn, State Key Lab Adv Design & Mfg Technol Vehicle, Changsha 410082, Peoples R China

[2] Hunan Univ, Wuxi Intelligent Control Res Inst, Wuxi 214115, Jiangsu, Peoples R China

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2024 / Vol. 25 / No. 06

Keywords:

Autonomous driving; 3D object detection; camera-LiDAR fusion; intelligent transportation systems;

DOI:

10.1109/TITS.2023.3347078

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813;

Abstract:

Multi-modal fusion can take advantage of both LiDAR and camera data to boost the robustness and performance of 3D object detection. However, great challenges remain in comprehensively exploiting image information and in performing accurate interaction and fusion of diverse features. In this paper, we propose a novel multi-modal framework, namely Point-Pixel Fusion for Multi-Modal 3D Object Detection (PPF-Det). PPF-Det consists of three submodules that address these problems: Multi Pixel Perception (MPP), Shared Combined Point Feature Encoder (SCPFE), and Point-Voxel-Wise Triple Attention Fusion (PVW-TAF). First, MPP makes full use of image semantic information to mitigate the resolution mismatch between the point cloud and the image. In addition, we propose SCPFE to preliminarily extract point cloud features and point-pixel features simultaneously, reducing the time spent on processing in 3D space. Lastly, we propose a fine alignment fusion strategy, PVW-TAF, to generate multi-level voxel-fused features based on an attention mechanism. Extensive experiments on KITTI benchmarks, conducted on September 24, 2023, demonstrate that our method shows excellent performance.

Citation

Pages: 5598-5611

Page count: 14

50 items in total

[41] Occlusion-guided multi-modal fusion for vehicle-infrastructure cooperative 3D object detection
Chu, Huazhen
Liu, Haizhuang
Zhuo, Junbao
Chen, Jiansheng
Ma, Huimin
PATTERN RECOGNITION, 2025, 157
[42] Multi-Modal and Multi-Scale Fusion 3D Object Detection of 4D Radar and LiDAR for Autonomous Driving
Wang, Li
Zhang, Xinyu
Li, Jun
Xv, Baowei
Fu, Rong
Chen, Haifeng
Yang, Lei
Jin, Dafeng
Zhao, Lijun
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (05) : 5628 - 5641
[43] Multi-modal 3D object detection by 2D-guided precision anchor proposal and multi-layer fusion
Wu, Yi
Jiang, Xiaoyan
Fang, Zhijun
Gao, Yongbin
Fujita, Hamido
APPLIED SOFT COMPUTING, 2021, 108
[44] ActiveAnno3D-An Active Learning Framework for Multi-Modal 3D Object Detection
Ghita, Ahmed
Antoniussen, Bjork
Zimmer, Walter
Greer, Ross
Cress, Christian
Mogelmose, Andreas
Trivedi, Mohan M.
Knoll, Alois C.
2024 35TH IEEE INTELLIGENT VEHICLES SYMPOSIUM, IEEE IV 2024, 2024 : 1699 - 1706
[45] Artifacts Mapping: Multi-Modal Semantic Mapping for Object Detection and 3D Localization
Rollo, Federico
Raiola, Gennaro
Zunino, Andrea
Tsagarakis, Nikolaos
Ajoudani, Arash
2023 EUROPEAN CONFERENCE ON MOBILE ROBOTS, ECMR, 2023 : 90 - 97
[46] RoboFusion: Towards Robust Multi-Modal 3D Object Detection via SAM
Song, Ziying
Zhang, Guoxing
Liu, Lin
Yang, Lei
Xu, Shaoqing
Jia, Caiyan
Jia, Feiyang
Wang, Li
PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024 : 1272 - 1280
[47] Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection
Huang, Linyan
Li, Zhiqi
Sima, Chonghao
Wang, Wenhai
Wang, Jingdong
Qiao, Yu
Li, Hongyang
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
[48] TransFusion: Multi-Modal Robust Fusion for 3D Object Detection in Foggy Weather Based on Spatial Vision Transformer
Zhang, Cheng
Wang, Hai
Cai, Yingfeng
Chen, Long
Li, Yicheng
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (09) : 10652 - 10666
[49] APPFNet: Adaptive point-pixel fusion network for 3D semantic segmentation with neighbor feature aggregation
Wu, Zhaolong
Zhang, Yong
Lan, Rukai
Qiu, Shaohua
Ran, Shaolin
Liu, Yifan
EXPERT SYSTEMS WITH APPLICATIONS, 2024, 251
[50] MMDistill: Multi-Modal BEV Distillation Framework for Multi-View 3D Object Detection
Jiao, Tianzhe
Chen, Yuming
Zhang, Zhe
Guo, Chaopeng
Song, Jie
CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 81 (03) : 4307 - 4325
