PPF-Det: Point-Pixel Fusion for Multi-Modal 3D Object Detection

Cited by: 5

Authors:

Xie, Guotao [1,2]

Chen, Zhiyuan [1]

Gao, Ming [1,2]

Hu, Manjiang [1,2]

Qin, Xiaohui [1,2]

Affiliations:

[1] Hunan Univ, Coll Mech & Vehicle Engn, State Key Lab Adv Design & Mfg Technol Vehicle, Changsha 410082, Peoples R China

[2] Hunan Univ, Wuxi Intelligent Control Res Inst, Wuxi 214115, Jiangsu, Peoples R China

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2024 / Vol. 25 / No. 06

Keywords:

Autonomous driving; 3D object detection; camera-LiDAR fusion; intelligent transportation systems;

DOI:

10.1109/TITS.2023.3347078

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813 ;

Abstract:

Multi-modal fusion can take advantage of both LiDAR and camera data to boost the robustness and performance of 3D object detection. However, great challenges remain in comprehensively exploiting image information and in performing accurate interaction and fusion of diverse features. In this paper, we propose a novel multi-modal framework, namely Point-Pixel Fusion for Multi-Modal 3D Object Detection (PPF-Det). PPF-Det consists of three submodules that address the above problems: Multi Pixel Perception (MPP), Shared Combined Point Feature Encoder (SCPFE), and Point-Voxel-Wise Triple Attention Fusion (PVW-TAF). First, MPP makes full use of image semantic information to mitigate the resolution mismatch between the point cloud and the image. In addition, we propose SCPFE to preliminarily extract point cloud features and point-pixel features simultaneously, reducing time consumption in 3D space. Lastly, we propose a fine alignment fusion strategy, PVW-TAF, to generate multi-level voxel-fused features based on an attention mechanism. Extensive experiments on KITTI benchmarks, conducted on September 24, 2023, demonstrate that our method shows excellent performance.

Citation

Pages: 5598-5611

Number of pages: 14

50 items in total

[11] MLF3D: Multi-Level Fusion for Multi-Modal 3D Object Detection
Jiang, Han
Wang, Jianbin
Xiao, Jianru
Zhao, Yanan
Chen, Wanqing
Ren, Yilong
Yu, Haiyang
2024 35TH IEEE INTELLIGENT VEHICLES SYMPOSIUM, IEEE IV 2024, 2024, : 1588 - 1593
[12] AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection
Chen, Zehui
Li, Zhenyu
Zhang, Shiquan
Fang, Liangji
Jiang, Qinhong
Zhao, Feng
Zhou, Bolei
Zhao, Hang
PROCEEDINGS OF THE THIRTY-FIRST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2022, 2022, : 827 - 833
[13] Multi-Modal Fusion Based on Depth Adaptive Mechanism for 3D Object Detection
Liu, Zhanwen
Cheng, Juanru
Fan, Jin
Lin, Shan
Wang, Yang
Zhao, Xiangmo
IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 707 - 717
[14] Height-Adaptive Deformable Multi-Modal Fusion for 3D Object Detection
Li, Jiahao
Chen, Lingshan
Li, Zhen
IEEE ACCESS, 2025, 13 : 52385 - 52396
[15] Frustum FusionNet: Amodal 3D Object Detection with Multi-Modal Feature Fusion
Zuo, Liangyu
Li, Yaochen
Han, Mengtao
Li, Qiao
Liu, Yuehu
2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021, : 2746 - 2751
[16] Enhancing 3D object detection through multi-modal fusion for cooperative perception
Xia, Bin
Zhou, Jun
Kong, Fanyu
You, Yuhe
Yang, Jiarui
Lin, Lin
ALEXANDRIA ENGINEERING JOURNAL, 2024, 104 : 46 - 55
[17] Multi-Modal 3D Object Detection by Box Matching
Liu, Zhe
Ye, Xiaoqing
Zou, Zhikang
He, Xinwei
Tan, Xiao
Ding, Errui
Wang, Jingdong
Bai, Xiang
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024,
[18] Unlocking the power of multi-modal fusion in 3D object tracking
Hu, Yue
IET COMPUTER VISION, 2025, 19 (01)
[19] VPC-VoxelNet: multi-modal fusion 3D object detection networks based on virtual point clouds
Zhang, Qiang
Shi, Qin
Cheng, Teng
Zhang, Junning
Chen, Jiong
INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2025, 14 (01)
[20] MLF-DET: Multi-Level Fusion for Cross-Modal 3D Object Detection
Lin, Zewei
Shen, Yanqing
Zhou, Sanping
Chen, Shitao
Zheng, Nanning
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VII, 2023, 14260 : 136 - 149
