Semantic Scene Completion With 2D and 3D Feature Fusion

Cited by: 0
Authors
Park, Sang-Min [1]
Ha, Jong-Eun [2]
Affiliations
[1] Seoul Natl Univ Sci & Technol, Grad Sch Automot Engn, Seoul 01811, South Korea
[2] Seoul Natl Univ Sci & Technol, Dept Mech & Automot Engn, Seoul 01811, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Research Foundation, Singapore;
Keywords
Three-dimensional displays; Feature extraction; Semantics; Solid modeling; Transformers; Cameras; Estimation; Decoding; Proposals; Predictive models; Semantic scene completion; transformer; 3D scene understanding; occupancy;
DOI
10.1109/ACCESS.2024.3470754
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
3D semantic scene completion (SSC) aims to obtain a dense semantic understanding of an environment in 3D. It requires geometric and semantic knowledge of the surrounding environment and the filling of void areas. In this paper, we propose an improved algorithm that modifies VoxFormer. VoxFormer performs 3D semantic scene completion in two stages: it first predicts the occupancy of the environment, then completes the semantic scene through a masked autoencoder. The two stages require separate training, which can cause a disconnect of information from input to output. We propose an improved VoxFormer algorithm that makes end-to-end training possible by integrating occupancy prediction and scene completion. We use pseudo-LiDAR computed by depth estimation as input to a 3D CNN, which generates queries for cross-attention with 2D features. This makes the process end-to-end by connecting occupancy prediction and semantic scene completion. Experimental results on SemanticKITTI show an improvement with the proposed algorithm.
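The pseudo-LiDAR step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the standard pinhole back-projection of an estimated depth map into 3D points, followed by a simple occupancy voxelization of the kind a 3D CNN could consume. The function names, the intrinsics (`fx`, `fy`, `cx`, `cy`), the `max_depth` cutoff, and the grid parameters are all hypothetical values chosen for illustration.

```python
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def depth_to_pseudo_lidar(depth: Sequence[Sequence[float]],
                          fx: float, fy: float, cx: float, cy: float,
                          max_depth: float = 80.0) -> List[Point]:
    """Back-project a dense depth map (metres) into camera-frame 3D points.

    Assumes a pinhole camera; intrinsics are hypothetical inputs here.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0 or z > max_depth:
                continue  # skip invalid or out-of-range depth estimates
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def voxelize(points: Sequence[Point], origin: Point,
             voxel_size: float, grid_shape: Voxel) -> Set[Voxel]:
    """Return the indices of voxels that contain at least one point."""
    occupied: Set[Voxel] = set()
    for p in points:
        idx = tuple(int((p[i] - origin[i]) // voxel_size) for i in range(3))
        if all(0 <= idx[i] < grid_shape[i] for i in range(3)):
            occupied.add(idx)  # points outside the grid are dropped
    return occupied


# Toy 2x2 depth map; the zero entry stands for a missing depth estimate.
depth_map = [[2.0, 2.0],
             [0.0, 4.0]]
points = depth_to_pseudo_lidar(depth_map, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
occupied = voxelize(points, origin=(-2.0, -2.0, 0.0),
                    voxel_size=1.0, grid_shape=(4, 4, 8))
```

In this toy example three of the four pixels yield points, and one of those points falls outside the assumed grid, so it does not contribute to the occupancy set. How the paper's 3D CNN turns such an input into queries for cross-attention is not specified in this record and is not sketched here.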
Pages: 141594-141603
Page count: 10