Deformable Feature Fusion Network for Multi-Modal 3D Object Detection

Times Cited: 0
Authors
Guo, Kun [1 ]
Gan, Tong [2 ]
Ding, Zhao [3 ]
Ling, Qiang [1 ]
Affiliations
[1] Univ Sci & Technol China, Dept Automat, Hefei, Peoples R China
[2] Anhui ShineAuto Autonomous Driving Technol Co Ltd, Res & Dev Dept, Hefei, Peoples R China
[3] Anhui JiangHuai Automobile Grp Co Ltd, Inst Intelligent & Networked Automobile, Hefei, Peoples R China
Source
2024 3RD INTERNATIONAL CONFERENCE ON ROBOTICS, ARTIFICIAL INTELLIGENCE AND INTELLIGENT CONTROL, RAIIC 2024 | 2024
Keywords
3D object detection; multi-modal fusion; feature alignment; VOXELNET;
DOI
10.1109/RAIIC61787.2024.10670940
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Code
0812 ;
Abstract
LiDAR and cameras are two widely used sensors in 3D object detection. LiDAR point clouds provide geometric knowledge of objects, while RGB images provide semantic information, such as color and texture. How to effectively fuse their features is the key to improving detection performance. This paper proposes a Deformable Feature Fusion Network, which performs LiDAR-camera fusion in a flexible way. We represent multi-modal features in the bird's-eye view (BEV) and build a Deformable-Attention Fusion (DAF) module to conduct feature fusion. Besides fusion methods, feature alignment is also important in multi-modal detection. Data augmentation of point clouds may change the projection relationship between RGB images and LiDAR point clouds and cause feature misalignment. We introduce a Feature Alignment Transform (FAT) module to alleviate this problem without introducing any trainable parameters. We conduct experiments on the KITTI dataset to evaluate the effectiveness of the proposed modules, and the experimental results show that our method outperforms most existing methods.
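The abstract does not give the formulation of the FAT module, but a common parameter-free way to repair the LiDAR-camera projection after point-cloud augmentation is to invert the recorded augmentation (global rotation, scaling, flip) before projecting BEV/LiDAR locations into the image. The sketch below illustrates that idea only; it is an assumption, not the authors' implementation, and the names invert_augmentation, project_to_image, and lidar2img are hypothetical.

    import numpy as np

    def invert_augmentation(pts_aug, rot_z, scale, flip_y):
        """Map augmented LiDAR/BEV coordinates (N, 3) back to the original
        sensor frame by undoing flip, scaling, and global z-rotation."""
        pts = pts_aug.copy()
        if flip_y:                      # undo flip along the y axis
            pts[:, 1] *= -1.0
        pts /= scale                    # undo global scaling
        c, s = np.cos(-rot_z), np.sin(-rot_z)
        rot = np.array([[c, -s, 0.0],
                        [s,  c, 0.0],
                        [0.0, 0.0, 1.0]])
        return pts @ rot.T              # undo global rotation about z

    def project_to_image(pts_lidar, lidar2img):
        """Project original-frame LiDAR points (N, 3) into the image plane
        with a 3x4 LiDAR-to-image projection matrix; returns (N, 2) pixels."""
        homo = np.hstack([pts_lidar, np.ones((len(pts_lidar), 1))])
        uvw = homo @ lidar2img.T
        return uvw[:, :2] / np.clip(uvw[:, 2:3], 1e-6, None)

    # Usage sketch: sample image features at
    # project_to_image(invert_augmentation(bev_centers, rot_z, scale, flip_y), lidar2img)

Because the correction only replays known augmentation parameters, it adds no trainable weights, which matches the abstract's claim for the FAT module.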
Pages: 363-367
Number of Pages: 5
Related Papers
50 records in total
  • [31] Heterogeneous Feature Fusion Approach for Multi-Modal Indoor Localization
    Zhou, Junyi
    Huang, Kaixuan
    Tang, Siyu
    Zhang, Shunqing
    2024 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC 2024, 2024,
  • [32] Feature Disentanglement and Adaptive Fusion for Improving Multi-modal Tracking
    Li, Zheng
    Cai, Weibo
    Dong, Junhao
    Lai, Jianhuang
    Xie, Xiaohua
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT XII, 2024, 14436 : 68 - 80
  • [33] Fake News Detection Based on BERT Multi-domain and Multi-modal Fusion Network
    Yu, Kai
    Jiao, Shiming
    Ma, Zhilong
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2025, 252
  • [34] MF-Net: Meta Fusion Network for 3D object detection
    Meng, Zhaoxin
    Luo, Guiyang
    Yuan, Quan
    Li, Jinglin
    Yang, Fangchun
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [35] Optical Flow-Aware-Based Multi-Modal Fusion Network for Violence Detection
    Xiao, Yang
    Gao, Guxue
    Wang, Liejun
    Lai, Huicheng
    ENTROPY, 2022, 24 (07)
  • [36] Personalized Clothing Prediction Algorithm Based on Multi-modal Feature Fusion
    Liu, Rong
    Joseph, Annie Anak
    Xin, Miaomiao
    Zang, Hongyan
    Wang, Wanzhen
    Zhang, Shengqun
    INTERNATIONAL JOURNAL OF ENGINEERING AND TECHNOLOGY INNOVATION, 2024, 14 (02) : 216 - 230
  • [37] Multi-modal neuroimaging feature fusion for diagnosis of Alzheimer's disease
    Zhang, Tao
    Shi, Mingyang
    JOURNAL OF NEUROSCIENCE METHODS, 2020, 341
  • [38] SEMI-DECOUPLED 6D POSE ESTIMATION VIA MULTI-MODAL FEATURE FUSION
    Zhang, Zhenhu
    Cao, Xin
    Jin, Li
    Qin, Xueying
    Tong, Ruofeng
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 2610 - 2614
  • [39] LIVER TUMOR DETECTION VIA A MULTI-SCALE INTERMEDIATE MULTI-MODAL FUSION NETWORK ON MRI IMAGES
    Pan, Chao
    Zhou, Peiyun
    Tan, Jingru
    Sun, Baoye
    Guan, Ruoyu
    Wang, Zhutao
    Luo, Ye
    Lu, Jianwei
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 299 - 303
  • [40] Multi-level Interaction Network for Multi-Modal Rumor Detection
    Zou, Ting
    Qian, Zhong
    Li, Peifeng
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,