Rethinking the Late Fusion of LiDAR-Camera Based 3D Object Detection

Times Cited: 0
Authors
Yu, Lehang [1 ,2 ]
Zhang, Jing [1 ,2 ]
Liu, Zhong [1 ,2 ]
Yue, Haosong [1 ,2 ]
Chen, Weihai [1 ,2 ]
Affiliations
[1] Beihang Univ, Sch Automat Sci & Elect Engn, Beijing, Peoples R China
[2] Beihang Univ, Hangzhou Innovat Inst, Hangzhou, Peoples R China
Source
2024 IEEE 19TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS, ICIEA 2024 | 2024
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
3D object detection; late fusion; semantic segmentation; VoxelNet
DOI
10.1109/ICIEA61579.2024.10665299
Chinese Library Classification
TP [Automation technology; computer technology]
Discipline Code
0812
Abstract
3D object detection plays an important role in autonomous driving. In contrast to early methods that rely on single-modality data, multi-modality detectors, mostly LiDAR-camera based, have been widely studied and proposed in recent years. Fusion methods can be divided into three categories: early fusion, which fuses raw data from different modalities; middle fusion, which fuses the extracted multi-modal features through feature alignment; and late fusion, which fuses at the instance level after the detectors of each modality have produced their individual predictions. Current methods mainly focus on early and middle fusion, neglecting the large potential of late fusion. This paper shows that the performance of a 3D object detector can be improved by applying our proposed semantic consistency filter (SCF), a plug-and-play late fusion strategy, to existing models. SCF removes incorrect prediction boxes by computing a semantic inconsistency rate (SIR) between each box and its corresponding 2D semantic mask generated by a semantic segmentation network. Extensive experiments with several different baselines demonstrate the effectiveness and versatility of SCF, indicating that late fusion may be key to improving the performance of 3D object detectors.
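The record contains only the abstract, so the following is a minimal, hypothetical sketch of how such a semantic consistency filter might operate: each predicted 3D box is projected into the image, the 2D semantic mask is sampled inside the projected footprint, and the box is discarded when the fraction of pixels disagreeing with the box's predicted class (an assumed definition of SIR) exceeds a threshold. All function names, the projection convention, and the threshold value are illustrative assumptions, not the authors' published implementation.

```python
# Illustrative sketch only: SIR definition, projection convention, and threshold
# are assumptions for this example, not the paper's published method.
import numpy as np

def project_box_footprint(corners_3d, intrinsics, extrinsics, mask_shape):
    """Project the 8 corners of a 3D box into the image and return the
    axis-aligned 2D footprint (u_min, v_min, u_max, v_max), clipped to the image."""
    pts_cam = np.hstack([corners_3d, np.ones((8, 1))]) @ extrinsics.T   # (8, 3), extrinsics assumed 3x4 world-to-camera
    uv = pts_cam @ intrinsics.T
    uv = uv[:, :2] / np.clip(uv[:, 2:3], 1e-6, None)                    # perspective division
    h, w = mask_shape
    u_min, v_min = np.clip(uv.min(axis=0), 0, [w - 1, h - 1]).astype(int)
    u_max, v_max = np.clip(uv.max(axis=0), 0, [w - 1, h - 1]).astype(int)
    return u_min, v_min, u_max, v_max

def semantic_inconsistency_rate(box_class, footprint, semantic_mask):
    """Assumed SIR: fraction of pixels inside the projected footprint whose
    semantic label disagrees with the class predicted for the 3D box."""
    u_min, v_min, u_max, v_max = footprint
    region = semantic_mask[v_min:v_max + 1, u_min:u_max + 1]
    if region.size == 0:
        return 1.0                                                      # box projects outside the image
    return float(np.mean(region != box_class))

def semantic_consistency_filter(boxes, classes, semantic_mask,
                                intrinsics, extrinsics, sir_threshold=0.7):
    """Keep only 3D predictions whose SIR stays below a threshold (value assumed)."""
    kept = []
    for corners_3d, cls in zip(boxes, classes):
        fp = project_box_footprint(corners_3d, intrinsics, extrinsics, semantic_mask.shape)
        if semantic_inconsistency_rate(cls, fp, semantic_mask) < sir_threshold:
            kept.append((corners_3d, cls))
    return kept
```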
Pages: 6