Learning 3D Scene Semantics and Structure from a Single Depth Image

Cited by: 1
Authors
Yang, Bo [1]
Lai, Zihang [1]
Lu, Xiaoxuan [1]
Lin, Shuyu [1]
Wen, Hongkai [2]
Markham, Andrew [1]
Trigoni, Niki [1]
Affiliations
[1] Univ Oxford, Oxford, England
[2] Univ Warwick, Coventry, W Midlands, England
Source
PROCEEDINGS 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2018
DOI
10.1109/CVPRW.2018.00069
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we aim to understand the semantics and 3D structure of a scene from a single depth image. Recent deep neural network-based methods aim to simultaneously learn object class labels and infer the 3D shape of a scene represented by a large voxel grid. However, individual objects within the scene are usually represented by only a few voxels, leading to a loss of geometric detail. In addition, significant computational and memory resources are required to process the large-scale voxel grid of a whole scene. To address this, we propose an efficient and holistic pipeline, 3R-Depth, to simultaneously learn the semantics and structure of a scene from a single depth image. Our key idea is to deeply fuse an efficient 3D shape estimator with existing recognition (e.g., ResNets) and segmentation (e.g., Mask R-CNN) techniques. Object-level semantics and latent feature maps are extracted and then fed to a shape estimator to extract the 3D shape. Extensive experiments are conducted on large-scale synthesized indoor scene datasets, quantitatively and qualitatively demonstrating the merits and superior performance of 3R-Depth.
Pages: 422-425
Number of pages: 4