A Simple, Fast and Highly-Accurate Algorithm to Recover 3D Shape from 2D Landmarks on a Single Image

Cited by: 22

Authors:

Zhao, Ruiqi [1]

Wang, Yan [1]

Martinez, Aleix M. [1]

Affiliations:

[1] Ohio State Univ, Dept Elect & Comp Engn, Columbus, OH 43210 USA

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2018 / Vol. 40 / No. 12

Funding:

U.S. National Institutes of Health (NIH);

Keywords:

3D modeling and reconstruction; fine-grained reconstruction; 3D shape from a single 2D image; deep learning;

DOI:

10.1109/TPAMI.2017.2772922

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Three-dimensional shape reconstruction of 2D landmark points on a single image is a hallmark of human vision, but is a task that has proven difficult for computer vision algorithms. We define a feed-forward deep neural network algorithm that can reconstruct 3D shapes from 2D landmark points almost perfectly (i.e., with extremely small reconstruction errors), even when these 2D landmarks are from a single image. Our experimental results show an improvement of up to two-fold over state-of-the-art computer vision algorithms; 3D shape reconstruction error (measured as the Procrustes distance between the reconstructed shape and the ground-truth) of human faces is < .004, cars is .0022, human bodies is .022, and highly-deformable flags is .0004. Our algorithm was also a top performer at the 2016 3D Face Alignment in the Wild Challenge competition (done in conjunction with the European Conference on Computer Vision, ECCV) that required the reconstruction of 3D face shape from a single image. The derived algorithm can be trained in a couple of hours and testing runs at more than 1,000 frames/s on an i7 desktop. We also present an innovative data augmentation approach that allows us to train the system efficiently with a small number of samples. The system is also robust to noise (e.g., imprecise landmark points) and missing data (e.g., occluded or undetected landmark points).
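
The abstract reports reconstruction error as a Procrustes distance but does not spell out the normalization used. As a rough illustration only, the sketch below assumes the common textbook definition: both landmark sets are centered, scaled to unit Frobenius norm, optimally rotated onto each other, and the residual norm is returned. The function name `procrustes_distance` and the (n_points, 3) array layout are hypothetical choices for this example, not taken from the paper.

```python
import numpy as np


def procrustes_distance(pred, truth):
    """Residual between two (n_points, 3) landmark sets after removing
    translation, scale and rotation (assumed textbook definition)."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)

    # Remove translation: move both centroids to the origin.
    a = pred - pred.mean(axis=0)
    b = truth - truth.mean(axis=0)

    # Remove scale: give both shapes unit Frobenius norm.
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)

    # Best rotation of a onto b (orthogonal Procrustes via SVD).
    u, _, vt = np.linalg.svd(a.T @ b)
    d = np.ones(a.shape[1])
    if np.linalg.det(u @ vt) < 0:
        d[-1] = -1.0  # flip one axis so the result is a rotation, not a reflection
    rot = (u * d) @ vt

    # Frobenius norm of what is left after alignment.
    return np.linalg.norm(a @ rot - b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.normal(size=(10, 3))
    pred = truth + 0.01 * rng.normal(size=truth.shape)
    print(procrustes_distance(pred, truth))
```

Under this assumed definition the distance is invariant to the position, size and orientation of either shape, so a value near zero means the two landmark configurations have nearly the same shape.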

Citation

Pages: 3059-3066

Number of pages: 8

