ACF-Net: Asymmetric Cascade Fusion for 3D Detection With LiDAR Point Clouds and Images

Cited by: 8

Authors:

Tian, Yonglin ^{[1,2]}

Zhang, Xianjing ^{[2]}

Wang, Xiao ^{[3]}

Xu, Jintao ^{[2]}

Wang, Jiangong ^{[1]}

Ai, Rui ^{[4]}

Gu, Weihao ^{[4]}

Ding, Weiping ^{[5]}

Affiliations:

[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China

[2] Haomo Technol Co Ltd, AI Ctr, Beijing 100192, Peoples R China

[3] Anhui Univ, Sch Artificial Intelligence, Hefei 230031, Peoples R China

[4] Haomo Technol Co Ltd, Beijing 100192, Peoples R China

[5] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Peoples R China

Source:

IEEE TRANSACTIONS ON INTELLIGENT VEHICLES | 2024 / Vol. 9 / No. 02

Keywords:

Three-dimensional displays; Feature extraction; Point cloud compression; Laser radar; Object detection; Timing; Fuses; 3D detection; autonomous driving; asymmetric fusion; cascade fusion; multimodal fusion; OBJECT; PERFORMANCE;

DOI:

10.1109/TIV.2023.3341223

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The recognition and utilization of complementary information arising from modality-intrinsic properties play crucial roles in multimodal 3D detection. However, most current approaches to fusion-based 3D detection follow symmetrical fusion paradigms and adopt early, middle, or late fusion styles, which ignore the unequal status of data from different modalities. In this paper, according to the timing of fusion, we adopt an asymmetric cascade fusion network to exploit both the structural information from point clouds and the complementary semantic information from images. A multi-stage cascade design for 3D object detection is proposed to iteratively refine predictions, and several late image features (comprising detection clues, segmentation clues, and deep features from encoders) are incorporated into different stages of the LiDAR branch to maintain the integrity of image features and enable deep multimodal interactions. In addition, to mitigate the effects of down-sampling voxelized features and possible mismatching of multimodal data, we propose proxy-based cross-modality sampling to utilize high-density point cloud coordinates, and we develop an image degeneration process that simulates noise in cross-modality matching for robust training. Extensive experiments on KITTI and the Waymo Open Dataset validate the effectiveness of the proposed method.


Pages: 3360 - 3371

Page count: 12
