Scaling CNNs for High Resolution Volumetric Reconstruction from a Single Image

Cited by: 24
Authors
Johnston, Adrian [1]
Garg, Ravi [1]
Carneiro, Gustavo [1]
Reid, Ian [1]
van den Hengel, Anton [1]
Affiliations
[1] Univ Adelaide, Australian Ctr Visual Technol, Adelaide, SA, Australia
Source
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017) | 2017
Funding
Australian Research Council
Keywords
SHAPE
DOI
10.1109/ICCVW.2017.114
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
One of the long-standing tasks in computer vision is to use a single 2-D view of an object to produce its 3-D shape. Recovering the lost dimension in this process has been the goal of classic shape-from-X methods, but the assumptions made in those works are often too limiting to be useful for general 3-D objects. This problem has recently been addressed with deep learning methods containing a 2-D (convolution) encoder followed by a 3-D (deconvolution) decoder. These methods have been reasonably successful, but memory and run-time constraints impose a strong limitation on the resolution of the reconstructed 3-D shapes. In particular, state-of-the-art methods are able to reconstruct 3-D shapes represented by volumes of at most 32³ voxels using state-of-the-art desktop computers. In this work, we present a scalable 2-D single-view to 3-D volume reconstruction deep learning method, where the 3-D (deconvolution) decoder is replaced by a simple inverse discrete cosine transform (IDCT) decoder. Our simpler architecture has an order of magnitude faster inference when reconstructing 3-D volumes compared to the convolution-deconvolution model, an exponentially smaller memory complexity while training and testing, and a sub-linear run-time training complexity with respect to the output volume size. We show on benchmark datasets that our method can produce high-resolution reconstructions with state-of-the-art accuracy.
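The abstract does not give implementation details for the IDCT decoder. The following is only a minimal sketch of the general idea, assuming an orthonormal DCT-II basis and assuming the network's output is a small k × k × k block of low-frequency coefficients that is zero-padded to the full volume size before inversion. The function names (`dct_basis`, `dct_encode`, `idct_decode`) and the sphere test shape are hypothetical and are not taken from the paper; the 2-D encoder itself is omitted, and `dct_encode` merely stands in for the coefficients such an encoder might be trained to predict.

```python
import numpy as np


def dct_basis(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix of shape (n, n); row k is the k-th frequency."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] = np.sqrt(1.0 / n)  # DC row has its own normalisation
    return basis


def dct_encode(volume: np.ndarray, k: int) -> np.ndarray:
    """Keep only the k*k*k lowest-frequency DCT coefficients of a cubic volume."""
    n = volume.shape[0]
    b = dct_basis(n)[:k, :]  # (k, n): the k lowest frequencies
    return np.einsum("xyz,ax,by,cz->abc", volume, b, b, b, optimize=True)


def idct_decode(coeffs: np.ndarray, out_size: int) -> np.ndarray:
    """Separable 3-D inverse DCT; missing high frequencies are treated as zero."""
    k = coeffs.shape[0]
    b = dct_basis(out_size)[:k, :]  # (k, out_size)
    return np.einsum("abc,ax,by,cz->xyz", coeffs, b, b, b, optimize=True)


if __name__ == "__main__":
    # Hypothetical target: a solid sphere in a 32^3 occupancy grid.
    n, k = 32, 8
    grid = np.indices((n, n, n)) - (n - 1) / 2.0
    sphere = (np.sqrt((grid ** 2).sum(axis=0)) < n / 3.0).astype(float)

    coeffs = dct_encode(sphere, k)        # 8^3 numbers instead of 32^3
    recon = idct_decode(coeffs, n) > 0.5  # threshold back to occupancy

    inter = np.logical_and(recon, sphere > 0.5).sum()
    union = np.logical_or(recon, sphere > 0.5).sum()
    print(f"coefficients kept: {coeffs.size} of {sphere.size}, IoU: {inter / union:.3f}")
```

In this sketch the decoder has no learned parameters, so its cost depends only on the fixed transform; the number of values a network would need to predict is k³ rather than the full number of voxels.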
Pages: 930-939
Page count: 10