CoReNet: Coherent 3D Scene Reconstruction from a Single RGB Image

Cited by: 38
Authors
Popov, Stefan [1]
Bauszat, Pablo [1]
Ferrari, Vittorio [1]
Affiliations
[1] Google Res, Zurich, Switzerland
Source
COMPUTER VISION - ECCV 2020, PT II | 2020 / Vol. 12347
Keywords
DOI
10.1007/978-3-030-58536-5_22
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Advances in deep learning techniques have allowed recent work to reconstruct the shape of a single object given only one RGB image as input. Building on common encoder-decoder architectures for this task, we propose three extensions: (1) ray-traced skip connections that propagate local 2D information to the output 3D volume in a physically correct manner; (2) a hybrid 3D volume representation that enables building translation-equivariant models, while at the same time encoding fine object details without an excessive memory footprint; (3) a reconstruction loss tailored to capture overall object geometry. Furthermore, we adapt our model to address the harder task of reconstructing multiple objects from a single image. We reconstruct all objects jointly in one pass, producing a coherent reconstruction, where all objects live in a single consistent 3D coordinate frame relative to the camera and they do not intersect in 3D space. We also handle occlusions and resolve them by hallucinating the missing object parts in the 3D volume. We validate the impact of our contributions experimentally on both synthetic data from ShapeNet and real images from Pix3D. Our method improves over the state-of-the-art single-object methods on both datasets. Finally, we evaluate performance quantitatively on multiple object reconstruction with synthetic scenes assembled from ShapeNet objects.
Pages: 366-383
Page count: 18