Semantic Scene Completion With 2D and 3D Feature Fusion

Authors
Park, Sang-Min [1]
Ha, Jong-Eun [2]
Affiliations
[1] Seoul Natl Univ Sci & Technol, Grad Sch Automot Engn, Seoul 01811, South Korea
[2] Seoul Natl Univ Sci & Technol, Dept Mech & Automot Engn, Seoul 01811, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Research Foundation of Singapore;
Keywords
Three-dimensional displays; Feature extraction; Semantics; Solid modeling; Transformers; Cameras; Estimation; Decoding; Proposals; Predictive models; Semantic scene completion; transformer; 3D scene understanding; occupancy;
DOI
10.1109/ACCESS.2024.3470754
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
3D semantic scene completion (SSC) aims to produce a dense semantic understanding of an environment in 3D. It requires geometric and semantic knowledge of the surrounding environment, as well as the filling of void areas. In this paper, we propose an improved algorithm that modifies VoxFormer. VoxFormer performs 3D semantic scene completion in two stages: it first predicts the occupancy of the environment, and then completes the semantic scene through a masked autoencoder. The two stages must be trained separately, which can cause a disconnect of information between input and output. We propose an improved VoxFormer algorithm that makes end-to-end training possible by integrating occupancy prediction and scene completion. We use pseudo-LiDAR computed from depth estimation as the input to a 3D CNN, which generates queries for cross-attention with 2D features. This connects occupancy prediction and semantic scene completion, making the process end-to-end. Experimental results on SemanticKITTI show that the proposed algorithm improves performance.
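As a rough illustration of the pseudo-LiDAR step mentioned in the abstract, the sketch below back-projects a dense depth map into 3D points and marks the voxels they fall into. The abstract states only that pseudo-LiDAR is computed from depth estimation. The pinhole camera model, the intrinsics `fx`, `fy`, `cx`, `cy`, the function names, and the voxel grid parameters are all assumptions made for this example, and this is not the authors' implementation.

```python
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def depth_to_pseudo_lidar(
    depth: Sequence[Sequence[float]],
    fx: float, fy: float, cx: float, cy: float,
) -> List[Point]:
    """Back-project a depth map into camera-frame 3D points.

    Assumes a pinhole camera (an assumption, not stated in the abstract).
    depth[v][u] is the depth at pixel column u, row v.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # skip pixels without a valid depth estimate
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def voxelize(
    points: Sequence[Point],
    origin: Point,
    voxel_size: float,
    grid_shape: Tuple[int, int, int],
) -> Set[Voxel]:
    """Return the indices of voxels containing at least one point.

    Points outside the (hypothetical) grid are discarded.
    """
    occupied: Set[Voxel] = set()
    for p in points:
        idx = tuple(int((p[i] - origin[i]) // voxel_size) for i in range(3))
        if all(0 <= idx[i] < grid_shape[i] for i in range(3)):
            occupied.add(idx)
    return occupied
```

In a pipeline like the one the abstract describes, an occupancy grid of this kind would be the sort of input a 3D CNN could consume before generating queries. How the paper actually builds its voxel input is not specified here.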
Pages: 141594 / 141603
Number of pages: 10