Granular3D: Delving into multi-granularity 3D scene graph prediction

Cited by: 0

Authors:

Huang, Kaixiang ^{[1,2]}

Yang, Jingru ^{[1,2]}

Wang, Jin ^{[1,2,6]}

He, Shengfeng ^{[3]}

Wang, Zhan ^{[4]}

He, Haiyan ^{[1,2,5]}

Zhang, Qifeng

Lu, Guodong ^{[1,2]}

Affiliations:

[1] Zhejiang Univ, State Key Lab Fluid Power & Mechatron Syst, Hangzhou 310027, Zhejiang, Peoples R China

[2] Zhejiang Univ, Robot Inst, Hangzhou 310027, Zhejiang, Peoples R China

[3] Singapore Management Univ, Singapore 178903, Singapore

[4] Zhejiang Energy Digital Technol Co Ltd, Dept Artificial Intelligence & Robot, Hangzhou 310027, Zhejiang, Peoples R China

[5] Zhejiang Baima Lake Lab Co Ltd, Hangzhou 310000, Zhejiang, Peoples R China

[6] Jinhua Key Lab Robot Intelligent Welding Technol, Jinhua 321000, Zhejiang, Peoples R China

Source:

PATTERN RECOGNITION | 2024 / Vol. 153

Funding:

National Natural Science Foundation of China;

Keywords:

3D point cloud; 3D semantic scene graph prediction; Multi-granularity; Gather point transformer; LANGUAGE;

DOI:

10.1016/j.patcog.2024.110562

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper addresses the significant challenges in 3D Semantic Scene Graph (3DSSG) prediction, essential for understanding complex 3D environments. Traditional approaches, primarily using PointNet and Graph Convolutional Networks, struggle with effectively extracting multi-grained features from intricate 3D scenes, largely due to a focus on global scene processing and single-scale feature extraction. To overcome these limitations, we introduce Granular3D, a novel approach that shifts the focus towards multi-granularity analysis by predicting relation triplets from specific sub-scenes. One key is the Adaptive Instance Enveloping Method (AIEM), which establishes an approximate envelope structure around irregular instances, providing shape-adaptive local point cloud sampling, thereby comprehensively covering the contextual environments of instances. Moreover, Granular3D incorporates a Hierarchical Dual-Stage Network (HDSN), which differentiates and processes features of instances and their pairs at varying scales, leading to a targeted prediction of instance categories and their relationships. To advance the perception of sub-scene in HDSN, we design a Gather Point Transformer structure (GaPT) that enables the combinatorial interaction of local information from multiple point cloud sets, achieving a more comprehensive local contextual feature extraction. Extensive evaluations on the challenging 3DSSG benchmark demonstrate that our methods provide substantial improvements, establishing a new state-of-the-art in 3DSSG prediction, boosting the top-50 triplet accuracy by +2.8%.

Pages: 12