MonoScene: Monocular 3D Semantic Scene Completion

Cited by: 150
Authors
Cao, Anh-Quan [1]
de Charette, Raoul [1]
Affiliations
[1] INRIA, Paris, France
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
DOI
10.1109/CVPR52688.2022.00396
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
MonoScene proposes a 3D Semantic Scene Completion (SSC) framework, where the dense geometry and semantics of a scene are inferred from a single monocular RGB image. Unlike the SSC literature, which relies on 2.5D or 3D input, we solve the complex problem of 2D to 3D scene reconstruction while jointly inferring its semantics. Our framework relies on successive 2D and 3D UNets, bridged by a novel 2D-3D feature projection inspired by optics, and introduces a 3D context relation prior to enforce spatio-semantic consistency. Along with architectural contributions, we introduce novel global scene and local frustum losses. Experiments show we outperform the literature on all metrics and datasets while hallucinating plausible scenery even beyond the camera field of view. Our code and trained models are available at https://github.com/cv-rits/MonoScene.
Pages: 3981-3991
Page count: 11