Make3D: Learning 3D Scene Structure from a Single Still Image

Cited by: 1122
Authors
Saxena, Ashutosh [1]
Sun, Min [2]
Ng, Andrew Y. [1]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[2] Princeton Univ, Vis Lab, Princeton, NJ 08540 USA
Funding
National Science Foundation (USA);
Keywords
Machine learning; monocular vision; learning depth; vision and scene understanding; scene analysis; depth cues; TEXTURE; MOTION; SHAPE;
DOI
10.1109/TPAMI.2008.132
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We consider the problem of estimating detailed 3D structure from a single still image of an unstructured environment. Our goal is to create 3D models that are both quantitatively accurate and visually pleasing. For each small homogeneous patch in the image, we use a Markov Random Field (MRF) to infer a set of "plane parameters" that capture both the 3D location and the 3D orientation of the patch. The MRF, trained via supervised learning, models both image depth cues and the relationships between different parts of the image. Other than assuming that the environment is made up of a number of small planes, our model makes no explicit assumptions about the structure of the scene. This enables the algorithm to capture much more detailed 3D structure than prior art does, and it also gives a much richer experience in the 3D flythroughs created using image-based rendering, even for scenes with significant nonvertical structure. Using this approach, we have created qualitatively correct 3D models for 64.9 percent of 588 images downloaded from the Internet. We have also extended our model to produce large-scale 3D models from a few images.
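The abstract does not spell out how a set of "plane parameters" can encode both location and orientation, so the following is only a minimal sketch under one assumed parameterization: a 3-vector alpha chosen so that every 3D point q on the patch's plane satisfies alpha . q = 1. Under that assumption, the direction of alpha gives the plane normal, 1/|alpha| gives the distance from the camera center to the plane, and the depth along a unit viewing ray r is 1 / (alpha . r). The function names (`plane_params`, `depth_along_ray`, `point_on_ray`) are hypothetical and are not taken from the record above.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def plane_params(normal, distance):
    """Assumed parameterization: alpha = n / d, so that alpha . q = 1
    for every point q on the plane n . q = d (camera at the origin)."""
    n = normalize(normal)
    return tuple(x / distance for x in n)

def depth_along_ray(alpha, ray):
    """Depth at which the viewing ray (from the origin) meets the plane."""
    r = normalize(ray)
    denom = dot(alpha, r)
    if denom <= 0:
        raise ValueError("ray does not intersect the plane in front of the camera")
    return 1.0 / denom

def point_on_ray(alpha, ray):
    """3D point where the viewing ray meets the plane."""
    r = normalize(ray)
    d = depth_along_ray(alpha, r)
    return tuple(d * x for x in r)

# A fronto-parallel plane 5 units in front of the camera.
alpha = plane_params(normal=(0.0, 0.0, 1.0), distance=5.0)
center_depth = depth_along_ray(alpha, (0.0, 0.0, 1.0))
oblique_point = point_on_ray(alpha, (1.0, 0.0, 1.0))
```

Here a single vector per patch is enough to recover a depth for every pixel ray that falls inside the patch, which is what lets a per-patch model produce a dense 3D reconstruction.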
Pages: 824-840
Number of pages: 17