Emerging Trends in Autonomous Vehicle Perception: Multimodal Fusion for 3D Object Detection

Cited by: 27
Authors
Alaba, Simegnew Yihunie [1]
Gurbuz, Ali C. [1]
Ball, John E. [1]
Affiliations
[1] Mississippi State Univ, James Worth Bagley Coll Engn, Dept Elect & Comp Engn, Starkville, MS 39762 USA
Keywords
3D object detection; autonomous vehicles; deep learning; LiDAR; multimodal fusion; perception; vision transformers; point clouds
DOI
10.3390/wevj15010020
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
The pursuit of autonomous driving relies on developing perception systems capable of making accurate, robust, and rapid decisions to interpret the driving environment effectively. At the core of these systems, object detection is crucial for understanding the environment. While 2D object detection and classification have advanced significantly with the advent of deep learning (DL) in computer vision (CV) applications, they fall short in providing essential depth information, a key element in comprehending driving environments. Consequently, 3D object detection becomes a cornerstone for autonomous driving and robotics, offering precise estimations of object locations and enhancing environmental comprehension. The CV community's growing interest in 3D object detection is fueled by the evolution of DL models, including Convolutional Neural Networks (CNNs) and Transformer networks. Despite these advancements, challenges such as varying object scales, limited 3D sensor data, and occlusions persist in 3D object detection. To address these challenges, researchers are exploring multimodal techniques that combine information from multiple sensors, such as cameras, radar, and LiDAR, to enhance the performance of perception systems. This survey provides an exhaustive review of multimodal fusion-based 3D object detection methods, focusing on CNN and Transformer-based models. It underscores the necessity of equipping fully autonomous vehicles with diverse sensors to ensure robust and reliable operation. The survey explores the advantages and drawbacks of cameras, LiDAR, and radar sensors. Additionally, it summarizes autonomy datasets and examines the latest advancements in multimodal fusion-based methods. The survey concludes by highlighting the ongoing challenges, open issues, and potential directions for future research.
Pages: 35