Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views

Cited by: 442
Authors
Su, Hao [1]
Qi, Charles R. [1]
Li, Yangyan [1]
Guibas, Leonidas J. [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
Source
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) | 2015
Funding
U.S. National Science Foundation;
Keywords
POSE;
DOI
10.1109/ICCV.2015.308
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Object viewpoint estimation from 2D images is an essential task in computer vision. However, two issues hinder its progress: scarcity of training data with viewpoint annotations, and a lack of powerful features. Inspired by the growing availability of 3D models, we propose a framework to address both issues by combining render-based image synthesis and CNNs (Convolutional Neural Networks). We believe that 3D models have the potential to generate a large number of images of high variation, which can be well exploited by deep CNNs with a high learning capacity. Towards this goal, we propose a scalable and overfit-resistant image synthesis pipeline, together with a novel CNN specifically tailored for the viewpoint estimation task. Experimentally, we show that the viewpoint estimation from our pipeline can significantly outperform state-of-the-art methods on the PASCAL 3D+ benchmark.
Pages: 2686-2694
Number of pages: 9
Related papers
37 in total
[1]  
[Anonymous], 2015, SHAPENET INFORM RICH
[2]  
[Anonymous], COMPUTER VISION PATT
[3]  
[Anonymous], 2015, ARXIV150204652
[4]  
[Anonymous], 2014, CORR
[5]  
[Anonymous], 2010, BMVC
[6]   Seeing 3D chairs: exemplar part-based 2D-3D alignment using a large dataset of CAD models [J].
Aubry, Mathieu ;
Maturana, Daniel ;
Efros, Alexei A. ;
Russell, Bryan C. ;
Sivic, Josef .
2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, :3762-3769
[7]  
Bowyer K. W., 1990, International Journal of Imaging Systems and Technology, V2, P315, DOI 10.1002/ima.1850020407
[8]   A similarity-based aspect-graph approach to 3D object recognition [J].
Cyr, CM ;
Kimia, BB .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2004, 57 (01) :5-22
[9]   Histograms of oriented gradients for human detection [J].
Dalal, N ;
Triggs, B .
2005 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2005, :886-893
[10]   SoftPOSIT: Simultaneous pose and correspondence determination [J].
David, P ;
Dementhon, D ;
Duraiswami, R ;
Samet, H .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2004, 59 (03) :259-284