PlanIT: Planning and Instantiating Indoor Scenes with Relation Graph and Spatial Prior Networks

Cited by: 94

Authors:

Wang, Kai [1]

Lin, Yu-An [1]

Weissmann, Ben [1]

Savva, Manolis [2]

Chang, Angel X. [2]

Ritchie, Daniel [1]

Affiliations:

[1] Brown Univ, Providence, RI 02912 USA

[2] Simon Fraser Univ, Burnaby, BC, Canada

Source:

ACM TRANSACTIONS ON GRAPHICS | 2019 / Volume 38 / Issue 04

Funding:

National Science Foundation (USA);

Keywords:

indoor scene synthesis; object layout; neural networks; convolutional networks; deep learning; relationship graphs; graph generation;

DOI:

10.1145/3306346.3322941

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

We present a new framework for interior scene synthesis that combines a high-level relation graph representation with spatial prior neural networks. We observe that prior work on scene synthesis is divided into two camps: object-oriented approaches (which reason about the set of objects in a scene and their configurations) and space-oriented approaches (which reason about what objects occupy what regions of space). Our insight is that the object-oriented paradigm excels at high-level planning of how a room should be laid out, while the space-oriented paradigm performs well at instantiating a layout by placing objects in precise spatial configurations. With this in mind, we present PlanIT, a layout-generation framework that divides the problem into two distinct planning and instantiation phases. PlanIT represents the "plan" for a scene via a relation graph, encoding objects as nodes and spatial/semantic relationships between objects as edges. In the planning phase, it uses a deep graph convolutional generative model to synthesize relation graphs. In the instantiation phase, it uses image-based convolutional network modules to guide a search procedure that places objects into the scene in a manner consistent with the graph. By decomposing the problem in this way, PlanIT generates scenes of comparable quality to those generated by prior approaches (as judged by both people and learned classifiers), while also providing the modeling flexibility of the intermediate relationship graph representation. These graphs allow the system to support applications such as scene synthesis from a partial graph provided by a user.
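As an illustration of the relation graph idea described in the abstract (objects as nodes, spatial/semantic relationships as edges), the following is a minimal Python sketch. It is not the authors' implementation: the class name `RelationGraph`, its methods, the node ids, and the relation labels such as `adjacent_to` are all hypothetical and chosen only for this example. It shows what a user-supplied partial graph might look like as plain data.

```python
from dataclasses import dataclass, field


@dataclass
class RelationGraph:
    """Toy relation graph (hypothetical): objects are nodes, relationships are directed edges."""
    nodes: dict = field(default_factory=dict)  # node id -> object category
    edges: list = field(default_factory=list)  # (source id, relation label, target id)

    def add_object(self, node_id, category):
        self.nodes[node_id] = category

    def relate(self, src, relation, dst):
        # Both endpoints must already exist as nodes.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before relating them")
        self.edges.append((src, relation, dst))

    def relations_of(self, node_id):
        """Return (relation label, target category) pairs for edges leaving node_id."""
        return [(rel, self.nodes[dst]) for src, rel, dst in self.edges if src == node_id]


# A hypothetical partial plan for a bedroom; all labels are assumptions.
plan = RelationGraph()
plan.add_object("bed0", "bed")
plan.add_object("ns0", "nightstand")
plan.add_object("wall0", "wall")
plan.relate("ns0", "adjacent_to", "bed0")
plan.relate("bed0", "against", "wall0")
```

In a pipeline of the kind the abstract describes, a graph like `plan` would be the input that a planning stage completes and an instantiation stage turns into concrete object placements; neither stage is modeled here.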

Pages: 15
