Semantic Scene Completion With 2D and 3D Feature Fusion

Cited by: 0
Authors
Park, Sang-Min [1]
Ha, Jong-Eun [2]
Affiliations
[1] Seoul Natl Univ Sci & Technol, Grad Sch Automot Engn, Seoul 01811, South Korea
[2] Seoul Natl Univ Sci & Technol, Dept Mech & Automot Engn, Seoul 01811, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Research Foundation, Singapore;
Keywords
Three-dimensional displays; Feature extraction; Semantics; Solid modeling; Transformers; Cameras; Estimation; Decoding; Proposals; Predictive models; Semantic scene completion; transformer; 3D scene understanding; occupancy;
DOI
10.1109/ACCESS.2024.3470754
Chinese Library Classification
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
3D semantic scene completion (SSC) aims to obtain a dense semantic understanding of an environment in 3D. It requires geometric and semantic knowledge of the surrounding environment as well as the filling of void areas. In this paper, we propose an improved algorithm based on VoxFormer. VoxFormer performs 3D semantic scene completion in two stages: it first predicts the occupancy of the environment, then completes the semantic scene through a masked autoencoder. The two stages require separate training, which can cause a disconnect of information from input to output. We propose an improved VoxFormer algorithm that makes end-to-end training possible by integrating occupancy prediction and scene completion. We use pseudo-LiDAR computed from depth estimation as the input to a 3D CNN, which generates queries for cross-attention with 2D features. This connects occupancy prediction and semantic scene completion into a single end-to-end process. Experimental results on SemanticKITTI show an improvement with the proposed algorithm.
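The pseudo-LiDAR step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes a pinhole camera model; the intrinsics `fx`, `fy`, `cx`, `cy`, the function names, and the grid parameters are hypothetical and not taken from the paper. The paper's 3D CNN and cross-attention stages are not reproduced here; the sketch only shows how a depth map might be turned into points and a coarse occupancy grid.

```python
import numpy as np


def depth_to_pseudo_lidar(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (N, 3) point cloud.

    Assumes a pinhole camera (illustrative, not the paper's exact setup).
    Points are expressed in the camera frame: x right, y down, z forward.
    """
    h, w = depth.shape
    # u holds the column index of every pixel, v holds the row index.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    # Drop pixels without a valid (positive) depth estimate.
    return points[points[:, 2] > 0]


def voxelize(points, origin, voxel_size, grid_shape):
    """Mark every voxel that contains at least one point as occupied."""
    idx = np.floor((points - np.asarray(origin)) / voxel_size).astype(int)
    # Keep only points that fall inside the grid bounds.
    inside = np.all((idx >= 0) & (idx < np.asarray(grid_shape)), axis=1)
    grid = np.zeros(grid_shape, dtype=bool)
    grid[tuple(idx[inside].T)] = True
    return grid
```

In a pipeline of the kind the abstract describes, an occupancy grid like this would be the input to a 3D network that proposes voxel queries; that part is left out because the abstract does not give enough detail to sketch it faithfully.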
Pages: 141594 - 141603
Page count: 10
Related papers
50 in total
  • [41] 3D reconstruction based on 2D view feature
    Gao, Wei
    Peng, Qunsheng
    Jisuanji Xuebao/Chinese Journal of Computers, 1999, 22 (05): 481 - 485
  • [42] Deep 3D semantic scene extrapolation
    Ali Abbasi
    Sinan Kalkan
    Yusuf Sahillioğlu
    The Visual Computer, 2019, 35 : 271 - 279
  • [43] Deep 3D semantic scene extrapolation
    Abbasi, Ali
    Kalkan, Sinan
    Sahillioglu, Yusuf
    VISUAL COMPUTER, 2019, 35 (02): 271 - 279
  • [44] 2D compressive sensing and multi-feature fusion for effective 3D shape retrieval
    Zhou, Yan
    Zeng, Fanzhi
    INFORMATION SCIENCES, 2017, 409 : 101 - 120
  • [45] NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized Device Coordinates Space
    Yao, Jiawei
    Li, Chuming
    Sun, Keqiang
    Cai, Yingjie
    Li, Hao
    Ouyang, Wanli
    Li, Hongsheng
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 9421 - 9431
  • [46] H2GFormer: Horizontal-to-Global Voxel Transformer for 3D Semantic Scene Completion
    Wang, Yu
    Tong, Chao
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024: 5722 - 5730
  • [47] Semantic Scene Completion through Multi-Level Feature Fusion
    Fu, Ruochong
    Wu, Hang
    Hao, Mengxiang
    Miao, Yubin
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022: 8399 - 8406
  • [48] VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion
    Li, Yiming
    Yu, Zhiding
    Choy, Christopher
    Xiao, Chaowei
    Alvarez, Jose M.
    Fidler, Sanja
    Feng, Chen
    Anandkumar, Anima
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 9087 - 9098
  • [49] 3D SEMANTIC SCENE COMPLETION FROM A SINGLE DEPTH IMAGE USING ADVERSARIAL TRAINING
    Chen, Yueh-Tung
    Garbade, Martin
    Gall, Juergen
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019: 1835 - 1839
  • [50] Easing 3D Pattern Reasoning with Side-View Features for Semantic Scene Completion
    Huan, Linxi
    Dong, Mingyue
    Yue, Linwei
    Shen, Shuhan
    Zheng, Xianwei
    COMPUTER VISION - ECCV 2024, PT LXIX, 2025, 15127 : 440 - 455