Deep multi-scale and multi-modal fusion for 3D object detection

Cited by: 17
Authors
Guo, Rui [1, 3]
Li, Deng [2 ]
Han, Yahong [2 ]
Affiliations
[1] Southeast Univ, Sch Energy & Environm, Nanjing, Peoples R China
[2] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
[3] Southeast Univ, Natl Engn Res Ctr Turbo Generator Vibrat, Nanjing, Peoples R China
Keywords
3D object detection; Feature fusion; Autonomous driving; Point cloud
DOI
10.1016/j.patrec.2021.08.028
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The perception of 3D objects in a scene is the basis of autonomous driving. Most autonomous vehicles are equipped with cameras and LiDAR to obtain 3D spatial information. RGB images taken by the camera and point clouds produced by LiDAR each have their own advantages for 3D object detection. To make better use of the advantages of both image data and point cloud data, a 3D object detection method based on Deep Multi-scale and Multi-modal Fusion (DMMF) is proposed. First, the point cloud is projected to the Bird's Eye View (BEV), and features are extracted from the BEV map and the RGB image, respectively, with feature extractors. The multi-modal features are then fused with the deep multi-scale fusion method and finally fed into the position regression and classification network for object classification and accurate localization. Experimental results on the KITTI benchmark dataset show that the method reaches state-of-the-art performance in both the car and pedestrian classes; the detection AP improves significantly, especially on hard-level data. (c) 2021 Elsevier B.V. All rights reserved.
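The first two stages described in the abstract (BEV projection, then fusion across scales) can be illustrated with a small NumPy sketch. This is not the DMMF implementation: the abstract does not specify the BEV channels, the grid ranges, or the fusion operator, so the max-height and point-count channels, the default ranges and cell size, the element-wise mean, and the function names `points_to_bev`, `upsample_nearest`, and `fuse_multiscale` are all assumptions made for illustration. The sketch also assumes the image features have already been aligned to the BEV grid and share its shape at each scale.

```python
import numpy as np


def points_to_bev(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0), cell=0.1):
    """Rasterize an (N, 3+) point array into a 2-channel BEV map.

    Channel 0 holds the maximum height per cell, channel 1 the point count.
    Ranges, cell size, and channel choice are illustrative assumptions.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    keep = (x >= x_range[0]) & (x < x_range[1]) & (y >= y_range[0]) & (y < y_range[1])
    x, y, z = x[keep], y[keep], z[keep]

    h = int(round((x_range[1] - x_range[0]) / cell))
    w = int(round((y_range[1] - y_range[0]) / cell))
    # Clip guards against floating-point rounding at the upper grid edge.
    rows = np.clip(((x - x_range[0]) / cell).astype(np.int64), 0, h - 1)
    cols = np.clip(((y - y_range[0]) / cell).astype(np.int64), 0, w - 1)

    height = np.full((h, w), -np.inf)
    np.maximum.at(height, (rows, cols), z)  # unbuffered per-cell maximum
    height[height == -np.inf] = 0.0         # empty cells get height 0

    count = np.zeros((h, w))
    np.add.at(count, (rows, cols), 1.0)     # unbuffered per-cell count
    return np.stack([height, count], axis=0)


def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return feat.repeat(factor, axis=-2).repeat(factor, axis=-1)


def fuse_multiscale(bev_feats, img_feats):
    """Fuse per-scale (C, H, W) feature pairs, listed finest scale first.

    Each pair is fused by element-wise mean (an assumed operator), then the
    scales are merged coarse-to-fine by upsampling and addition.
    """
    fused = [(b + i) / 2.0 for b, i in zip(bev_feats, img_feats)]
    out = fused[-1]
    for finer in reversed(fused[:-1]):
        factor = finer.shape[-1] // out.shape[-1]
        out = finer + upsample_nearest(out, factor)
    return out
```

In a real detector the fused map would feed the regression and classification heads, and the feature maps would come from learned extractors rather than hand-built arrays.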
Pages: 236-242
Number of pages: 7
Related papers
50 in total
  • [1] Multi-Modal and Multi-Scale Fusion 3D Object Detection of 4D Radar and LiDAR for Autonomous Driving
    Wang, Li
    Zhang, Xinyu
    Li, Jun
    Xv, Baowei
    Fu, Rong
    Chen, Haifeng
    Yang, Lei
    Jin, Dafeng
    Zhao, Lijun
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (05) : 5628 - 5641
  • [2] DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection
    Li, Yingwei
    Yu, Adams Wei
    Meng, Tianjian
    Caine, Ben
    Ngiam, Jiquan
    Peng, Daiyi
    Shen, Junyang
    Lu, Yifeng
    Zhou, Denny
    Le, Quoc V.
    Yuille, Alan
    Tan, Mingxing
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 17161 - 17170
  • [3] ObjectFusion: Multi-modal 3D Object Detection with Object-Centric Fusion
    Cai, Qi
    Pan, Yingwei
    Yao, Ting
    Ngo, Chong-Wah
    Mei, Tao
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 18021 - 18030
  • [4] Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection
    Li, Xin
    Shi, Botian
    Hou, Yuenan
    Wu, Xingjiao
    Ma, Tianlong
    Li, Yikang
    He, Liang
    COMPUTER VISION, ECCV 2022, PT XXXVIII, 2022, 13698 : 691 - 707
  • [5] Multi-modal feature fusion for 3D object detection in the production workshop
    Hou, Rui
    Chen, Guangzhu
    Han, Yinhe
    Tang, Zaizuo
    Ru, Qingjun
    APPLIED SOFT COMPUTING, 2022, 115
  • [6] Multi-Modal Streaming 3D Object Detection
    Abdelfattah, Mazen
    Yuan, Kaiwen
    Wang, Z. Jane
    Ward, Rabab
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (10) : 6163 - 6170
  • [7] Research on 3D Object Detection Method Based on Multi-Modal Fusion
    Tian, Feng
    Zong, Neili
    Liu, Fang
    Lu, Yuanyuan
    Liu, Chao
    Jiang, Wenwen
    Zhao, Ling
    Han, Yuxiang
    Computer Engineering and Applications, 2024, 60 (13) : 113 - 123
  • [8] Deformable Feature Fusion Network for Multi-Modal 3D Object Detection
    Guo, Kun
    Gan, Tong
    Ding, Zhao
    Ling, Qiang
    2024 3RD INTERNATIONAL CONFERENCE ON ROBOTICS, ARTIFICIAL INTELLIGENCE AND INTELLIGENT CONTROL, RAIIC 2024, 2024: 363 - 367
  • [9] MLF3D: Multi-Level Fusion for Multi-Modal 3D Object Detection
    Jiang, Han
    Wang, Jianbin
    Xiao, Jianru
    Zhao, Yanan
    Chen, Wanqing
    Ren, Yilong
    Yu, Haiyang
    2024 35TH IEEE INTELLIGENT VEHICLES SYMPOSIUM, IEEE IV 2024, 2024: 1588 - 1593
  • [10] BSM-NET: multi-bandwidth, multi-scale and multi-modal fusion network for 3D object detection of 4D radar and LiDAR
    Jiang, Tiezhen
    Kang, Runjie
    Li, Qingzhu
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2025, 36 (03)