SparseGNV: Generating Novel Views of Indoor Scenes with Sparse RGB-D Images

Cited by: 0
Authors
Cheng, Weihao [1]
Cao, Yan-Pei [1]
Shan, Ying [1]
Affiliations
[1] ARC Lab, Tencent PCG, Shenzhen, People's Republic of China
Source
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 2 | 2024
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We study to generate novel views of indoor scenes given sparse input views. The challenge is to achieve both photorealism and view consistency. We present SparseGNV: a learning framework that incorporates 3D structures and image generative models to generate novel views with three modules. The first module builds a neural point cloud as underlying geometry, providing scene context and guidance for the target novel view. The second module utilizes a transformer-based network to map the scene context and the guidance into a shared latent space and autoregressively decodes the target view in the form of discrete image tokens. The third module reconstructs the tokens back to the image of the target view. SparseGNV is trained across a large-scale indoor scene dataset to learn generalizable priors. Once trained, it can efficiently generate novel views of an unseen indoor scene in a feed-forward manner. We evaluate SparseGNV on real-world indoor scenes and demonstrate that it outperforms state-of-the-art methods based on either neural radiance fields or conditional image generation.
Pages: 1308-1316
Page count: 9
Related papers
50 in total
  • [1] RGB-D IBR: Rendering Indoor Scenes Using Sparse RGB-D Images with Local Alignments
    Jeong, Yeongyu
    Kim, Haejoon
    Seo, Hyewon
    Cordier, Frederic
    Lee, Seungyong
    PROCEEDINGS I3D 2016: 20TH ACM SIGGRAPH SYMPOSIUM ON INTERACTIVE 3D GRAPHICS AND GAMES, 2016, : 205 - 206
  • [2] RGB-D DSO: Direct Sparse Odometry With RGB-D Cameras for Indoor Scenes
    Yuan, Zikang
    Cheng, Ken
    Tang, Jinhui
    Yang, Xin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 4092 - 4101
  • [3] Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images
    Gupta, Saurabh
    Arbelaez, Pablo
    Malik, Jitendra
    2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013, : 564 - 571
  • [4] Semantic Labeling of Indoor Scenes from RGB-D Images with Discriminative Learning
    Liu, Bo
    Fan, Haoqi
    SIXTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2013), 2013, 9067
  • [5] Object Segmentation of Indoor Scenes Using Perceptual Organization on RGB-D Images
    Wang, Chaonan
    Xue, Yanbing
    Zhang, Hua
    Xu, Guangping
    Gao, Zan
    2016 8TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS & SIGNAL PROCESSING (WCSP), 2016,
  • [6] Dense 3D Semantic Mapping of Indoor Scenes from RGB-D Images
    Hermans, Alexander
    Floros, Georgios
    Leibe, Bastian
    2014 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2014, : 2631 - 2638
  • [7] Superpixels of RGB-D Images for Indoor Scenes Based on Weighted Geodesic Driven Metric
    Pan, Xiao
    Zhou, Yuanfeng
    Li, Feng
    Zhang, Caiming
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2017, 23 (10) : 2342 - 2356
  • [8] 2½D Scene Reconstruction of Indoor Scenes from Single RGB-D Images
    Neverova, Natalia
    Muselet, Damien
    Tremeau, Alain
    COMPUTATIONAL COLOR IMAGING, CCIW 2013, 2013, 7786 : 281 - 295
  • [9] Online Reconstruction of Indoor Scenes from RGB-D Streams
    Wang, Hao
    Wang, Jun
    Wang, Liang
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 3271 - 3279
  • [10] Unsupervised object region proposals for RGB-D indoor scenes
    Deng, Zhuo
    Todorovic, Sinisa
    Latecki, Longin Jan
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2017, 154 : 127 - 136