Semantic Scene Completion With 2D and 3D Feature Fusion

Cited by: 0
Authors
Park, Sang-Min [1]
Ha, Jong-Eun [2]
Affiliations
[1] Seoul Natl Univ Sci & Technol, Grad Sch Automot Engn, Seoul 01811, South Korea
[2] Seoul Natl Univ Sci & Technol, Dept Mech & Automot Engn, Seoul 01811, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Research Foundation of Singapore;
Keywords
Three-dimensional displays; Feature extraction; Semantics; Solid modeling; Transformers; Cameras; Estimation; Decoding; Proposals; Predictive models; Semantic scene completion; transformer; 3D scene understanding; occupancy;
DOI
10.1109/ACCESS.2024.3470754
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
3D semantic scene completion (SSC) aims to obtain a dense semantic understanding of an environment in 3D. It requires geometric and semantic knowledge of the surrounding environment and the filling of void areas. In this paper, we propose an improved algorithm that modifies VoxFormer. VoxFormer performs 3D semantic scene completion in two stages. First, it predicts the occupancy of the environment. Then, it completes the semantic scene through a masked autoencoder. The two stages require separate training, which can cause a disconnect of information from input to output. We propose an improved VoxFormer algorithm that makes end-to-end training possible by integrating occupancy prediction and scene completion. We use pseudo-LiDAR computed by depth estimation as the input to a 3D CNN, which generates queries for cross-attention with 2D features. This makes the process end-to-end by connecting occupancy prediction and semantic scene completion. Experimental results on SemanticKITTI show that the proposed algorithm yields an improvement.
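The pseudo-LiDAR step mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a pinhole camera model, and the function names (`depth_to_pseudo_lidar`, `voxelize`), the intrinsics `fx`, `fy`, `cx`, `cy`, the `max_depth` cutoff, and the voxel size are all hypothetical choices made for the example. The resulting set of occupied voxel indices stands in for the kind of grid a 3D CNN could take as input.

```python
import math
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def depth_to_pseudo_lidar(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_depth: float = 50.0,
) -> List[Point]:
    """Back-project a depth map (metres) into camera-frame 3D points.

    Assumes a pinhole camera; pixels with non-positive depth or depth
    beyond max_depth are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if 0.0 < z <= max_depth:
                x = (u - cx) * z / fx
                y = (v - cy) * z / fy
                points.append((x, y, z))
    return points


def voxelize(
    points: Sequence[Point],
    voxel_size: float,
    origin: Point = (0.0, 0.0, 0.0),
) -> Set[Voxel]:
    """Return the integer indices of voxels containing at least one point."""
    occupied: Set[Voxel] = set()
    for x, y, z in points:
        occupied.add(
            (
                math.floor((x - origin[0]) / voxel_size),
                math.floor((y - origin[1]) / voxel_size),
                math.floor((z - origin[2]) / voxel_size),
            )
        )
    return occupied


# Toy 2x2 depth map: one pixel has no depth, one is beyond max_depth.
toy_depth = [[2.0, 0.0], [4.0, 100.0]]
toy_points = depth_to_pseudo_lidar(toy_depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
toy_voxels = voxelize(toy_points, voxel_size=0.5)
print(sorted(toy_voxels))
```

In a real pipeline the depth map would come from a depth estimation network and the points would usually be transformed into the LiDAR or ego frame before voxelization; both steps are omitted here.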
Pages: 141594-141603
Page count: 10