Scene understanding using natural language description based on 3D semantic graph map

Cited by: 0
Authors
Jiyoun Moon
Beomhee Lee
Affiliation
[1] Seoul National University, Automation and Systems Research Institute, Department of Electrical Engineering
Source
Intelligent Service Robotics | 2018 / Vol. 11
Keywords
Scene understanding; Natural language description; 3D semantic graph map
DOI
Not available
Abstract
A natural language description of the working environment is an important component of human–robot communication. Although 3D semantic graph mapping has been widely studied for the perceptual aspects of the environment, these approaches rarely address communication issues such as generating natural language descriptions of a semantic graph map. In computer vision, many studies on workspace understanding automatically generate sentences from images, but they usually do not use multiple scenes or 3D information. In this paper, we introduce a novel natural language description method that uses a 3D semantic graph map. An object-oriented semantic graph map is first constructed from 3D information. A graph convolutional neural network and a recurrent neural network are then used to generate a description of the map. The resulting natural language sentence focuses on the objects in the 3D semantic graph map and can cover a single scene or multiple scenes. We validate the proposed method on a publicly available dataset and compare it with conventional methods.
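The first two stages named in the abstract (building an object-oriented graph from 3D information, then passing it through a graph convolution) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy objects, the distance-based edge rule, the `radius` threshold, the one-hot features, and the untrained weight matrix are all assumptions made for illustration, and the recurrent decoder that would turn node embeddings into a sentence is omitted.

```python
import math

# Hypothetical toy map: object labels with 3D centroids in metres (assumed data).
objects = {
    "table": (0.0, 0.0, 0.0),
    "chair": (0.5, 0.0, 0.0),
    "lamp": (3.0, 0.0, 0.0),
}


def build_adjacency(centroids, radius):
    """Link two objects when their centroids are within `radius`; add self-loops.

    The proximity rule is an assumption; the abstract does not say how edges
    are defined.
    """
    names = list(centroids)
    n = len(names)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            close = math.dist(centroids[names[i]], centroids[names[j]]) <= radius
            if i == j or close:
                adj[i][j] = 1.0
    return names, adj


def gcn_layer(adj, feats, weight):
    """One propagation step: ReLU(D^-1/2 A D^-1/2 X W), self-loops already in A."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    # Symmetric degree normalisation of the adjacency matrix.
    norm = [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    n_in, n_out = len(weight), len(weight[0])
    # Aggregate neighbour features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(n)) for f in range(n_in)]
           for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(n_in)))
             for o in range(n_out)]
            for i in range(n)]


names, adj = build_adjacency(objects, radius=1.0)
feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # one-hot object labels
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # arbitrary, untrained weights
node_embeddings = gcn_layer(adj, feats, weight)
```

In this toy map the table and chair are linked and therefore end up with identical embeddings after one step, while the isolated lamp keeps only its own features. In a full pipeline of the kind the abstract describes, embeddings like these would be fed to a recurrent network that emits the sentence word by word.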
Pages: 347-354
Page count: 7