Indoor Scene Understanding with Geometric and Semantic Contexts

Cited by: 31
Authors
Choi, Wongun [1]
Chao, Yu-Wei [2]
Pantofaru, Caroline [3]
Savarese, Silvio [4]
Affiliations
[1] NEC Labs Amer, Cupertino, CA 95014 USA
[2] Univ Michigan, Ann Arbor, MI 48109 USA
[3] Google Inc, Mountain View, CA USA
[4] Stanford Univ, Stanford, CA 94305 USA
Keywords
Scene understanding; Scene parsing; Object recognition; 3D layout
DOI
10.1007/s11263-014-0779-4
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Truly understanding a scene involves integrating information at multiple levels as well as studying the interactions between scene elements. Individual object detectors, layout estimators and scene classifiers are powerful but ultimately confounded by complicated real-world scenes with high variability, different viewpoints and occlusions. We propose a method that can automatically learn the interactions among scene elements and apply them to the holistic understanding of indoor scenes from a single image. This interpretation is performed within a hierarchical interaction model which describes an image by a parse graph, thereby fusing together object detection, layout estimation and scene classification. At the root of the parse graph is the scene type and layout while the leaves are the individual detections of objects. In between is the core of the system, our 3D Geometric Phrases (3DGP). We conduct extensive experimental evaluations on single image 3D scene understanding using both 2D and 3D metrics. The results demonstrate that our model with 3DGPs can provide robust estimation of scene type, 3D space, and 3D objects by leveraging the contextual relationships among the visual elements.
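The hierarchy described in the abstract (scene type and layout at the root, 3D Geometric Phrases in between, object detections at the leaves) can be pictured with a small data-structure sketch. This is an illustration only, not the authors' implementation: the class names, field names, and example labels below are hypothetical, and the learning and inference steps of the model are not represented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Detection:
    """Leaf node: one object detection (fields are hypothetical)."""
    label: str
    score: float


@dataclass
class GeometricPhrase:
    """Intermediate node: a group of detections treated as one unit."""
    name: str
    parts: List[Detection] = field(default_factory=list)


@dataclass
class ParseGraph:
    """Root node: scene type and layout, with phrases and lone objects below."""
    scene_type: str
    layout: str
    phrases: List[GeometricPhrase] = field(default_factory=list)
    singles: List[Detection] = field(default_factory=list)

    def leaves(self) -> List[Detection]:
        """Collect every detection, whether grouped in a phrase or standing alone."""
        grouped = [d for phrase in self.phrases for d in phrase.parts]
        return grouped + list(self.singles)


# Hypothetical example: a dining room with one table-and-chair phrase and a lone sofa.
graph = ParseGraph(
    scene_type="dining room",
    layout="box layout (placeholder)",
    phrases=[
        GeometricPhrase(
            name="table with chair",
            parts=[Detection("table", 0.9), Detection("chair", 0.7)],
        )
    ],
    singles=[Detection("sofa", 0.4)],
)
leaf_labels = [d.label for d in graph.leaves()]
```

Here `leaf_labels` lists the grouped detections first and the ungrouped ones after, which shows how a single graph holds the scene label, the layout, and the detections together.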
Pages: 204-220
Number of pages: 17