Semantic Scene Completion With 2D and 3D Feature Fusion

Cited by: 0

Authors:

Park, Sang-Min [1]

Ha, Jong-Eun [2]

Affiliations:

[1] Seoul Natl Univ Sci & Technol, Grad Sch Automot Engn, Seoul 01811, South Korea

[2] Seoul Natl Univ Sci & Technol, Dept Mech & Automot Engn, Seoul 01811, South Korea

Source:

IEEE ACCESS | 2024 / Vol. 12

Funding:

National Research Foundation of Singapore;

Keywords:

Three-dimensional displays; Feature extraction; Semantics; Solid modeling; Transformers; Cameras; Estimation; Decoding; Proposals; Predictive models; Semantic scene completion; transformer; 3D scene understanding; occupancy;

DOI:

10.1109/ACCESS.2024.3470754

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

3D semantic scene completion (SSC) aims to produce a dense semantic understanding of an environment in 3D. It requires geometric and semantic knowledge of the surrounding environment and the filling of void areas. In this paper, we propose an improved algorithm that modifies VoxFormer. VoxFormer performs 3D semantic scene completion in two steps: first, it predicts the occupancy of the environment; then, it completes the semantic scene through a masked autoencoder. The two stages require separate training, which can cause a disconnect of information between input and output. Our improved VoxFormer makes end-to-end training possible by integrating occupancy prediction and scene completion. We use pseudo-LiDAR computed from depth estimation as the input to a 3D CNN, which generates queries for cross-attention with 2D features. This connects occupancy prediction and semantic scene completion into a single end-to-end process. Experimental results on SemanticKITTI show that the proposed algorithm improves performance.
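
The abstract mentions pseudo-LiDAR computed from depth estimation. As a rough illustration only, the sketch below shows the standard pinhole back-projection that is commonly used to turn a depth map into 3D points, followed by a simple voxelization into occupied cells. It is not taken from the paper: the function names, the intrinsics parameters (fx, fy, cx, cy), the max_depth cutoff, and the voxel size are all assumptions chosen for the example.

```python
import math
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def depth_to_pseudo_lidar(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_depth: float = 80.0,  # assumed cutoff, not from the paper
) -> List[Point]:
    """Back-project a depth map (rows of depths in metres) to camera-frame points.

    Uses the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with non-positive depth or depth beyond max_depth are skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0 or z > max_depth:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def voxelize(points: Sequence[Point], voxel_size: float) -> Set[Voxel]:
    """Return the set of integer voxel indices that contain at least one point."""
    return {
        (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        for x, y, z in points
    }


if __name__ == "__main__":
    # Tiny hypothetical 2x2 depth map; 0.0 marks a pixel with no valid depth.
    toy_depth = [[2.0, 0.0], [4.0, 2.0]]
    cloud = depth_to_pseudo_lidar(toy_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
    print(cloud)
    print(sorted(voxelize(cloud, voxel_size=1.0)))
```

In a pipeline like the one the abstract describes, an occupancy grid of this kind would be the sort of input a 3D CNN could consume; how the authors actually build and process it is not specified in this record.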


Pages: 141594-141603

Page count: 10

