Semantic Scene Completion With 2D and 3D Feature Fusion

Cited by: 0

Authors:

Park, Sang-Min [1]

Ha, Jong-Eun [2]

Affiliations:

[1] Seoul Natl Univ Sci & Technol, Grad Sch Automot Engn, Seoul 01811, South Korea

[2] Seoul Natl Univ Sci & Technol, Dept Mech & Automot Engn, Seoul 01811, South Korea

Source:

IEEE ACCESS | 2024 / Vol. 12

Funding:

National Research Foundation, Singapore;

Keywords:

Three-dimensional displays; Feature extraction; Semantics; Solid modeling; Transformers; Cameras; Estimation; Decoding; Proposals; Predictive models; Semantic scene completion; transformer; 3D scene understanding; occupancy;

DOI:

10.1109/ACCESS.2024.3470754

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

3D semantic scene completion (SSC) aims to produce a dense semantic understanding of an environment in 3D. It requires geometric and semantic knowledge of the surrounding environment as well as the filling in of void areas. In this paper, we propose an improved algorithm that modifies VoxFormer. VoxFormer performs 3D semantic scene completion in two stages: it first predicts the occupancy of the environment, and then completes the semantic scene through a masked autoencoder. The two stages are trained separately, which can cause a disconnect of information between input and output. We propose an improved VoxFormer algorithm that makes end-to-end training possible by integrating occupancy prediction and scene completion. Pseudo-LiDAR computed from depth estimation is used as the input to a 3D CNN, which generates the queries for cross-attention with 2D features. This connects occupancy prediction and semantic scene completion in a single end-to-end process. Experimental results on SemanticKITTI show that the proposed algorithm yields an improvement.
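The pseudo-LiDAR step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the standard pinhole back-projection of a depth map into 3D points, followed by a simple binning of those points into occupied voxels. The function names, the intrinsics (`fx`, `fy`, `cx`, `cy`), the grid origin, the voxel size, and the toy depth values are all hypothetical, and points are left in the camera frame (x right, y down, z forward) rather than being transformed to a LiDAR frame.

```python
import math
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def depth_to_pseudo_lidar(
    depth: Sequence[Sequence[float]],
    fx: float, fy: float, cx: float, cy: float,
) -> List[Point]:
    """Back-project a depth map (rows of metric depths) to camera-frame points.

    Assumes a pinhole camera; pixels with non-positive depth are skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:
                continue  # invalid or missing depth
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def voxelize(
    points: Sequence[Point],
    voxel_size: float,
    origin: Point,
    grid_shape: Tuple[int, int, int],
) -> Set[Voxel]:
    """Return the set of voxel indices that contain at least one point."""
    occupied: Set[Voxel] = set()
    for p in points:
        idx = tuple(
            int(math.floor((c - o) / voxel_size)) for c, o in zip(p, origin)
        )
        # Discard points that fall outside the grid bounds.
        if all(0 <= i < n for i, n in zip(idx, grid_shape)):
            occupied.add(idx)
    return occupied


# Toy 2x2 depth map with one invalid pixel (hypothetical values).
toy_depth = [[2.0, 2.0],
             [0.0, 4.0]]
toy_points = depth_to_pseudo_lidar(toy_depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
toy_occupied = voxelize(
    toy_points, voxel_size=1.0, origin=(-2.0, -2.0, 0.0), grid_shape=(4, 4, 4)
)
```

In a pipeline of the kind the abstract describes, an occupancy grid like `toy_occupied` would be the sort of input a 3D CNN consumes to produce voxel queries; how the paper actually builds and processes that input is not specified here.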

Citation

Pages: 141594-141603

Number of pages: 10
