ACF-Net: Asymmetric Cascade Fusion for 3D Detection With LiDAR Point Clouds and Images

Cited by: 8
Authors
Tian, Yonglin [1, 2]
Zhang, Xianjing [2]
Wang, Xiao [3]
Xu, Jintao [2]
Wang, Jiangong [1]
Ai, Rui [4]
Gu, Weihao [4]
Ding, Weiping [5]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[2] Haomo Technol Co Ltd, AI Ctr, Beijing 100192, Peoples R China
[3] Anhui Univ, Sch Artificial Intelligence, Hefei 230031, Peoples R China
[4] Haomo Technol Co Ltd, Beijing 100192, Peoples R China
[5] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Peoples R China
Source
IEEE TRANSACTIONS ON INTELLIGENT VEHICLES | 2024 / Vol. 9 / No. 02
Keywords
Three-dimensional displays; Feature extraction; Point cloud compression; Laser radar; Object detection; Timing; Fuses; 3D detection; autonomous driving; asymmetric fusion; cascade fusion; multimodal fusion; OBJECT; PERFORMANCE;
DOI
10.1109/TIV.2023.3341223
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The recognition and utilization of complementary information arising from modality-intrinsic properties play crucial roles in multimodal 3D detection. However, most current approaches to fusion-based 3D detection follow symmetrical fusion paradigms and adopt early, middle, or late fusion styles, which ignore the unequal status of data from different modalities. In this paper, according to the timing of fusion, we adopt an asymmetric cascade fusion network to exploit both the structural information from point clouds and the complementary semantic information from images. A multi-stage cascade design of 3D object detection is proposed to iteratively refine predictions, and several late image features (comprising detection clues, segmentation clues, and deep features from encoders) are incorporated into different stages of the LiDAR branch to maintain the integrity of image features and enable deep multimodal interactions. In addition, to mitigate the effects of down-sampling voxelized features and possible mismatching of multimodal data, we propose proxy-based cross-modality sampling to utilize the high-density point cloud coordinates and develop an image degeneration process to simulate the noise in cross-modality matching for robust training. Extensive experiments on KITTI and the Waymo Open Dataset validate the effectiveness of the proposed method.
Pages: 3360-3371
Page count: 12