Learning 3D Scene Semantics and Structure from a Single Depth Image

Cited by: 1

Authors:

Yang, Bo [1]

Lai, Zihang [1]

Lu, Xiaoxuan [1]

Lin, Shuyu [1]

Wen, Hongkai [2]

Markham, Andrew [1]

Trigoni, Niki [1]

Affiliations:

[1] Univ Oxford, Oxford, England

[2] Univ Warwick, Coventry, W Midlands, England

Source:

PROCEEDINGS 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW) | 2018

Keywords:

DOI:

10.1109/CVPRW.2018.00069

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we aim to understand the semantics and 3D structure of a scene from a single depth image. Recent methods based on deep neural networks aim to simultaneously learn object class labels and infer the 3D shape of a scene represented by a large voxel grid. However, individual objects within the scene are usually represented by only a few voxels, leading to a loss of geometric detail. In addition, significant computational and memory resources are required to process the large-scale voxel grid of a whole scene. To address this, we propose an efficient and holistic pipeline, 3R-Depth, to simultaneously learn the semantics and structure of a scene from a single depth image. Our key idea is to deeply fuse an efficient 3D shape estimator with existing recognition (e.g., ResNets) and segmentation (e.g., Mask R-CNN) techniques. Object-level semantics and latent feature maps are extracted and then fed to a shape estimator to extract the 3D shape. Extensive experiments are conducted on large-scale synthesized indoor scene datasets, quantitatively and qualitatively demonstrating the merits and superior performance of 3R-Depth.
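The abstract's argument about scene-level voxel grids (few voxels per object, high memory cost) can be illustrated with simple arithmetic. The following is a minimal sketch only: the scene extent, object size, grid resolutions, and 4 bytes per voxel are hypothetical values chosen for illustration and do not come from the paper, and the function names are not part of 3R-Depth.

```python
# Illustrative arithmetic only; all sizes and resolutions are assumed values.

def voxels_spanned(object_size_cm: float, scene_size_cm: float,
                   grid_resolution: int) -> float:
    """Voxels along one axis covered by an object inside a cubic scene grid."""
    voxel_edge_cm = scene_size_cm / grid_resolution
    return object_size_cm / voxel_edge_cm


def dense_grid_bytes(grid_resolution: int, bytes_per_voxel: int = 4) -> int:
    """Memory for one dense cubic grid (one value per voxel)."""
    return grid_resolution ** 3 * bytes_per_voxel


SCENE_SIZE_CM = 640.0   # hypothetical room extent along one axis
OBJECT_SIZE_CM = 40.0   # hypothetical small object (e.g. a stool)

if __name__ == "__main__":
    for resolution in (64, 128, 256):
        span = voxels_spanned(OBJECT_SIZE_CM, SCENE_SIZE_CM, resolution)
        mem_mib = dense_grid_bytes(resolution) / 2 ** 20
        print(f"scene grid {resolution}^3: object spans {span:.0f} voxels "
              f"per axis, grid needs {mem_mib:.0f} MiB")

    # Giving the object its own grid spends the full resolution on it.
    own = voxels_spanned(OBJECT_SIZE_CM, OBJECT_SIZE_CM, 64)
    print(f"object-level grid 64^3: object spans {own:.0f} voxels per axis")
```

Under these assumed numbers, raising the scene grid resolution improves per-object detail only linearly while memory grows cubically, which is the trade-off that motivates estimating shape per object rather than for the whole scene at once.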

Pages: 422-425

Page count: 4
