Understanding Indoor Scenes using 3D Geometric Phrases

Cited by: 93
Authors
Choi, Wongun [1]
Chao, Yu-Wei [1]
Pantofaru, Caroline [2]
Savarese, Silvio [1]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Google, Mountain View, CA USA
Source
2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) | 2013
DOI
10.1109/CVPR.2013.12
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Visual scene understanding is a difficult problem interleaving object detection, geometric reasoning and scene classification. We present a hierarchical scene model for learning and reasoning about complex indoor scenes which is computationally tractable, can be learned from a reasonable amount of training data, and avoids oversimplification. At the core of this approach is the 3D Geometric Phrase Model which captures the semantic and geometric relationships between objects which frequently co-occur in the same 3D spatial configuration. Experiments show that this model effectively explains scene semantics, geometry and object groupings from a single image, while also improving individual object detections.
Pages: 33-40
Page count: 8