A survey on occupancy perception for autonomous driving: The information fusion perspective

Cited by: 0
Authors
Xu, Huaiyuan [1 ]
Chen, Junliang [1 ]
Meng, Shiyu [1 ]
Wang, Yi [1 ]
Chau, Lap-Pui [1 ]
Affiliations
[1] University of Hong Kong, Department of Electrical and Electronic Engineering, Hong Kong, People's Republic of China
Keywords
Autonomous driving; Information fusion; Occupancy perception; Multi-modal data; 3D object detection; Reconstruction; Benchmark; Fields
DOI
10.1016/j.inffus.2024.102671
CLC Classification Number
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
3D occupancy perception technology aims to observe and understand dense 3D environments for autonomous vehicles. Owing to its comprehensive perception capability, this technology is emerging as a trend in autonomous driving perception systems and is attracting significant attention from both industry and academia. Like traditional bird's-eye view (BEV) perception, 3D occupancy perception draws on multi-source inputs and therefore requires information fusion; unlike 2D BEV, however, it also captures the vertical structure of the scene. In this survey, we review the most recent works on 3D occupancy perception and provide in-depth analyses of methodologies across input modalities. Specifically, we summarize general network pipelines, highlight information fusion techniques, and discuss effective network training. We evaluate and analyze the occupancy perception performance of state-of-the-art methods on the most popular datasets. Furthermore, challenges and future research directions are discussed. We hope this paper will inspire the community and encourage more research on 3D occupancy perception. A comprehensive list of the studies covered in this survey is publicly available in an actively maintained repository that continuously collects the latest work: https://github.com/HuaiyuanXu/3D-Occupancy-Perception.
Pages: 17