Semantic Scene Completion With 2D and 3D Feature Fusion

Authors
Park, Sang-Min [1]
Ha, Jong-Eun [2]
Affiliations
[1] Seoul Natl Univ Sci & Technol, Grad Sch Automot Engn, Seoul 01811, South Korea
[2] Seoul Natl Univ Sci & Technol, Dept Mech & Automot Engn, Seoul 01811, South Korea
Source
IEEE Access | 2024 / Vol. 12
Funding
National Research Foundation, Singapore;
Keywords
Three-dimensional displays; Feature extraction; Semantics; Solid modeling; Transformers; Cameras; Estimation; Decoding; Proposals; Predictive models; Semantic scene completion; transformer; 3D scene understanding; occupancy;
DOI
10.1109/ACCESS.2024.3470754
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
3D semantic scene completion (SSC) aims to obtain a dense semantic understanding of an environment in 3D. It requires geometric and semantic knowledge of the surrounding environment and the filling of void areas. In this paper, we propose an improved algorithm that modifies VoxFormer. VoxFormer performs 3D semantic scene completion in two stages: first, it predicts the occupancy of the environment; then, it completes the semantic scene through a masked autoencoder. The two stages are trained separately, which can cause a disconnect of information from input to output. Our improved VoxFormer algorithm makes end-to-end training possible by integrating occupancy prediction and scene completion. We use pseudo-LiDAR computed by depth estimation as the input to a 3D CNN, which generates queries for cross-attention with 2D features. This connects occupancy prediction and semantic scene completion, so the whole process runs end-to-end. Experimental results on SemanticKITTI show an improvement with the proposed algorithm.
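The pseudo-LiDAR step mentioned in the abstract can be illustrated with a minimal sketch, assuming a standard pinhole camera model. It back-projects an estimated depth map into 3D points and marks the voxels those points fall into. The function names, intrinsics (`fx`, `fy`, `cx`, `cy`), depth cutoff, voxel size, and grid origin are all illustrative assumptions and are not taken from the paper. The 3D CNN, query generation, and cross-attention stages are not shown.

```python
import math
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def depth_to_pseudo_lidar(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_depth: float = 80.0,  # assumed cutoff, not from the paper
) -> List[Point]:
    """Back-project a depth map (rows of depths in metres) to camera-frame points."""
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0 or z > max_depth:
                continue  # skip invalid or too-distant depth estimates
            x = (u - cx) * z / fx  # pinhole model: horizontal offset scaled by depth
            y = (v - cy) * z / fy  # pinhole model: vertical offset scaled by depth
            points.append((x, y, z))
    return points


def voxelize(points: Sequence[Point], voxel_size: float, origin: Point) -> Set[Voxel]:
    """Return the set of integer voxel indices that contain at least one point."""
    occupied: Set[Voxel] = set()
    for x, y, z in points:
        occupied.add(
            (
                math.floor((x - origin[0]) / voxel_size),
                math.floor((y - origin[1]) / voxel_size),
                math.floor((z - origin[2]) / voxel_size),
            )
        )
    return occupied


# Toy 2x2 depth map; the 0.0 entry stands for a pixel with no valid depth.
depth_map = [[2.0, 2.0], [2.0, 0.0]]
points = depth_to_pseudo_lidar(depth_map, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
voxels = voxelize(points, voxel_size=1.0, origin=(-1.0, -1.0, 0.0))
print(points)
print(sorted(voxels))
```

In a pipeline of the kind the abstract describes, an occupancy set like `voxels` would be turned into a dense grid and passed to the 3D CNN. That hand-off is an assumption about the data flow and is left out of this sketch.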
Pages: 141594-141603
Number of pages: 10