3-D Scene Graph: A Sparse and Semantic Representation of Physical Environments for Intelligent Agents

Cited by: 51

Authors:

Kim, Ue-Hwan [1]

Park, Jin-Man [1]

Song, Taek-Jin [1]

Kim, Jong-Hwan [1]

Affiliations:

[1] Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon 34141, South Korea

Source:

IEEE TRANSACTIONS ON CYBERNETICS | 2020 / Vol. 50 / Issue 12

Keywords:

Semantics; Intelligent agents; Task analysis; Visualization; Usability; Scalability; Computational modeling; 3-D scene graph; environment model; intelligent agent; scene graph; scene understanding; MANIPULATION

DOI:

10.1109/TCYB.2019.2931042

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Intelligent agents gather information and perceive semantics within the environments before taking on given tasks. The agents store the collected information in the form of environment models that compactly represent the surrounding environments. The agents, however, can only conduct limited tasks without an efficient and effective environment model. Thus, such an environment model takes a crucial role for the autonomy systems of intelligent agents. We claim the following characteristics for a versatile environment model: accuracy, applicability, usability, and scalability. Although a number of researchers have attempted to develop such models that represent environments precisely to a certain degree, they lack broad applicability, intuitive usability, and satisfactory scalability. To tackle these limitations, we propose 3-D scene graph as an environment model and the 3-D scene graph construction framework. The concise and widely used graph structure readily guarantees usability as well as scalability for 3-D scene graph. We demonstrate the accuracy and applicability of the 3-D scene graph by exhibiting the deployment of the 3-D scene graph in practical applications. Moreover, we verify the performance of the proposed 3-D scene graph and the framework by conducting a series of comprehensive experiments under various conditions.


Pages: 4921-4933

Page count: 13

