Scaling CNNs for High Resolution Volumetric Reconstruction from a Single Image

Cited by: 24

Authors:

Johnston, Adrian [1]

Garg, Ravi [1]

Carneiro, Gustavo [1]

Reid, Ian [1]

van den Hengel, Anton [1]

Affiliations:

[1] Univ Adelaide, Australian Ctr Visual Technol, Adelaide, SA, Australia

Source:

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017) | 2017

Funding:

Australian Research Council;

Keywords:

SHAPE;

DOI:

10.1109/ICCVW.2017.114

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

One of the long-standing tasks in computer vision is to use a single 2-D view of an object to produce its 3-D shape. Recovering the lost dimension in this process has been the goal of classic shape-from-X methods, but the assumptions made in those works are often too limiting to be useful for general 3-D objects. This problem has recently been addressed with deep learning methods containing a 2-D (convolution) encoder followed by a 3-D (deconvolution) decoder. These methods have been reasonably successful, but memory and run-time constraints strongly limit the resolution of the reconstructed 3-D shapes. In particular, state-of-the-art methods are able to reconstruct 3-D shapes represented by volumes of at most 32^3 voxels using state-of-the-art desktop computers. In this work, we present a scalable 2-D single view to 3-D volume reconstruction deep learning method, where the 3-D (deconvolution) decoder is replaced by a simple inverse discrete cosine transform (IDCT) decoder. Our simpler architecture has an order of magnitude faster inference when reconstructing 3-D volumes compared to the convolution-deconvolution model, an exponentially smaller memory complexity while training and testing, and a sub-linear run-time training complexity with respect to the output volume size. We show on benchmark datasets that our method can produce high-resolution reconstructions with state-of-the-art accuracy.
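The idea of an IDCT decoder can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the low-frequency coefficients are simply the leading k x k x k corner of an orthonormal 3-D DCT, and the helper names `compress_volume` and `idct_decode`, the toy sphere, and the choice k = 8 are all hypothetical. In the setting the abstract describes, a network would predict the coefficient block from an image; here the coefficients are computed directly from a known volume to show only the decoding step.

```python
import numpy as np
from scipy.fft import dctn, idctn


def compress_volume(volume, k):
    """Keep only the k x k x k lowest-frequency 3-D DCT coefficients (hypothetical helper)."""
    coeffs = dctn(volume, norm="ortho")
    return coeffs[:k, :k, :k]


def idct_decode(low_freq, out_size):
    """Zero-pad the low-frequency block to out_size^3 and apply the inverse DCT."""
    k = low_freq.shape[0]
    padded = np.zeros((out_size,) * 3, dtype=low_freq.dtype)
    padded[:k, :k, :k] = low_freq
    return idctn(padded, norm="ortho")


# Toy occupancy grid (assumption for illustration): a solid sphere in a 32^3 volume.
n = 32
grid = np.indices((n, n, n)) - (n - 1) / 2.0
sphere = (np.sqrt((grid ** 2).sum(axis=0)) < 10).astype(np.float64)

# Stand-in for the network output: 8^3 coefficients instead of 32^3 voxels.
low_freq = compress_volume(sphere, k=8)

# Decode and threshold back to a binary occupancy grid.
recon = idct_decode(low_freq, n) > 0.5
target = sphere > 0.5
iou = np.logical_and(recon, target).sum() / np.logical_or(recon, target).sum()
print(f"coefficients: {low_freq.size}, voxels: {sphere.size}, IoU: {iou:.3f}")
```

The decoder here has no learned parameters and its input is 64 times smaller than the voxel grid, which is one way to read the memory argument in the abstract; how the published method selects and predicts coefficients may differ.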


Pages: 930-939

Page count: 10
