Any-Shot GIN: Generalizing Implicit Networks for Reconstructing Novel Classes

Cited by: 6
Authors
Xian, Yongqin [1, 2]
Chibane, Julian [2, 3]
Bhatnagar, Bharat Lal [2, 3]
Schiele, Bernt [2]
Akata, Zeynep [2, 3, 4]
Pons-Moll, Gerard [2, 3]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
[2] Max Planck Inst Informat, Saarbrucken, Germany
[3] Univ Tubingen, Tubingen, Germany
[4] Max Planck Inst Intelligent Syst, Stuttgart, Germany
Source
2022 INTERNATIONAL CONFERENCE ON 3D VISION, 3DV | 2022
DOI
10.1109/3DV57658.2022.00064
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We address the task of estimating the 3D shapes of novel shape classes from a single RGB image. Prior works are either limited to reconstructing known training classes or unable to reconstruct high-quality shapes. To solve these issues, we propose Generalizing Implicit Networks (GIN), which decomposes 3D reconstruction into (1) front-back depth estimation followed by differentiable depth voxelization, and (2) implicit shape completion with 3D features. The key insight is that the depth estimation network learns local class-agnostic shape priors, allowing us to generalize to novel classes, while our implicit shape completion network is able to predict accurate shapes with rich details by learning implicit surfaces in 3D voxel space. We conduct extensive experiments on a large-scale benchmark using 55 classes of ShapeNet and real images of Pix3D. We show qualitatively and quantitatively that the proposed GIN significantly outperforms the state of the art on both seen and novel shape classes for single-image 3D reconstruction. We also illustrate that GIN can be further improved by using only few-shot depth supervision from novel classes.
Pages: 526-535
Number of pages: 10