Deformable Feature Fusion Network for Multi-Modal 3D Object Detection

Times Cited: 0
Authors
Guo, Kun [1 ]
Gan, Tong [2 ]
Ding, Zhao [3 ]
Ling, Qiang [1 ]
Affiliations
[1] Univ Sci & Technol China, Dept Automat, Hefei, Peoples R China
[2] Anhui ShineAuto Autonomous Driving Technol Co Ltd, Res & Dev Dept, Hefei, Peoples R China
[3] Anhui JiangHuai Automobile Grp Co Ltd, Inst Intelligent & Networked Automobile, Hefei, Peoples R China
Source
2024 3RD INTERNATIONAL CONFERENCE ON ROBOTICS, ARTIFICIAL INTELLIGENCE AND INTELLIGENT CONTROL, RAIIC 2024 | 2024
Keywords
3D object detection; multi-modal fusion; feature alignment; VOXELNET;
DOI
10.1109/RAIIC61787.2024.10670940
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Code
0812 ;
Abstract
LiDAR and cameras are two widely used sensors in 3D object detection. LiDAR point clouds provide geometric knowledge of objects, while RGB images provide semantic information, such as color and texture. How to effectively fuse their features is the key to improving detection performance. This paper proposes a Deformable Feature Fusion Network, which performs LiDAR-camera fusion in a flexible way. We represent multi-modal features in the bird's-eye view (BEV) and build a Deformable-Attention Fusion (DAF) module to conduct feature fusion. Besides fusion methods, feature alignment is also important in multi-modal detection. Data augmentation of point clouds may change the projection relationship between RGB images and LiDAR point clouds and cause feature misalignment. We introduce a Feature Alignment Transform (FAT) module to alleviate this problem without introducing any trainable parameters. We conduct experiments on the KITTI dataset to evaluate the effectiveness of the proposed modules, and the experimental results show that our method outperforms most existing methods.
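The abstract does not give the formulation of the FAT module, but a common parameter-free way to repair the LiDAR-camera projection after point-cloud augmentation is to invert the recorded augmentation (global rotation, scaling, flip) before projecting BEV/LiDAR locations into the image. The sketch below illustrates that idea only; it is an assumption, not the authors' implementation, and the names invert_augmentation, project_to_image, and lidar2img are hypothetical.

    import numpy as np

    def invert_augmentation(pts_aug, rot_z, scale, flip_y):
        """Map augmented LiDAR/BEV coordinates (N, 3) back to the original
        sensor frame by undoing flip, scaling, and global z-rotation."""
        pts = pts_aug.copy()
        if flip_y:                      # undo flip along the y axis
            pts[:, 1] *= -1.0
        pts /= scale                    # undo global scaling
        c, s = np.cos(-rot_z), np.sin(-rot_z)
        rot = np.array([[c, -s, 0.0],
                        [s,  c, 0.0],
                        [0.0, 0.0, 1.0]])
        return pts @ rot.T              # undo global rotation about z

    def project_to_image(pts_lidar, lidar2img):
        """Project original-frame LiDAR points (N, 3) into the image plane
        with a 3x4 LiDAR-to-image projection matrix; returns (N, 2) pixels."""
        homo = np.hstack([pts_lidar, np.ones((len(pts_lidar), 1))])
        uvw = homo @ lidar2img.T
        return uvw[:, :2] / np.clip(uvw[:, 2:3], 1e-6, None)

    # Usage sketch: sample image features at
    # project_to_image(invert_augmentation(bev_centers, rot_z, scale, flip_y), lidar2img)

Because the correction only replays known augmentation parameters, it adds no trainable weights, which matches the abstract's claim for the FAT module.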
Pages: 363-367
Number of Pages: 5
Related Papers
50 records in total
  • [31] Heterogeneous Feature Fusion Approach for Multi-Modal Indoor Localization
    Zhou, Junyi
    Huang, Kaixuan
    Tang, Siyu
    Zhang, Shunqing
    2024 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC 2024, 2024,
  • [32] Feature Disentanglement and Adaptive Fusion for Improving Multi-modal Tracking
    Li, Zheng
    Cai, Weibo
    Dong, Junhao
    Lai, Jianhuang
    Xie, Xiaohua
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT XII, 2024, 14436 : 68 - 80
  • [33] Fake News Detection Based on BERT Multi-domain and Multi-modal Fusion Network
    Yu, Kai
    Jiao, Shiming
    Ma, Zhilong
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2025, 252
  • [34] MF-Net: Meta Fusion Network for 3D object detection
    Meng, Zhaoxin
    Luo, Guiyang
    Yuan, Quan
    Li, Jinglin
    Yang, Fangchun
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [35] Optical Flow-Aware-Based Multi-Modal Fusion Network for Violence Detection
    Xiao, Yang
    Gao, Guxue
    Wang, Liejun
    Lai, Huicheng
    ENTROPY, 2022, 24 (07)
  • [36] Personalized Clothing Prediction Algorithm Based on Multi-modal Feature Fusion
    Liu, Rong
    Joseph, Annie Anak
    Xin, Miaomiao
    Zang, Hongyan
    Wang, Wanzhen
    Zhang, Shengqun
    INTERNATIONAL JOURNAL OF ENGINEERING AND TECHNOLOGY INNOVATION, 2024, 14 (02) : 216 - 230
  • [37] Multi-modal neuroimaging feature fusion for diagnosis of Alzheimer's disease
    Zhang, Tao
    Shi, Mingyang
    JOURNAL OF NEUROSCIENCE METHODS, 2020, 341
  • [38] SEMI-DECOUPLED 6D POSE ESTIMATION VIA MULTI-MODAL FEATURE FUSION
    Zhang, Zhenhu
    Cao, Xin
    Jin, Li
    Qin, Xueying
    Tong, Ruofeng
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 2610 - 2614
  • [39] LIVER TUMOR DETECTION VIA A MULTI-SCALE INTERMEDIATE MULTI-MODAL FUSION NETWORK ON MRI IMAGES
    Pan, Chao
    Zhou, Peiyun
    Tan, Jingru
    Sun, Baoye
    Guan, Ruoyu
    Wang, Zhutao
    Luo, Ye
    Lu, Jianwei
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 299 - 303
  • [40] Multi-level Interaction Network for Multi-Modal Rumor Detection
    Zou, Ting
    Qian, Zhong
    Li, Peifeng
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,