Occlusion-Aware Amodal Depth Estimation for Enhancing 3D Reconstruction From a Single Image

Times Cited: 0
Authors
Jo, Seong-Uk [1 ]
Lee, Du Yeol [2 ]
Rhee, Chae Eun [2 ]
Affiliations
[1] Inha Univ, Dept Elect & Comp Engn, Incheon 22212, South Korea
[2] Hanyang Univ, Dept Elect Engn, Seoul 04763, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Three-dimensional displays; Estimation; Image reconstruction; Transformers; Image restoration; Decoding; Solid modeling; Occlusion; amodal segmentation; depth estimation; 3D-reconstruction;
DOI
10.1109/ACCESS.2024.3436570
Chinese Library Classification
TP [Automation & Computer Technology];
Discipline Code
0812;
Abstract
In various fields, such as robotics navigation, autonomous driving, and augmented reality, the demand for reconstructing three-dimensional (3D) scenes from two-dimensional (2D) images captured by a camera is growing. With advancements in deep learning, monocular depth prediction research has gained momentum, leading to the exploration of 3D reconstruction from a single image. While previous studies have attempted to restore occluded regions by training deep networks on high-resolution 3D data or with jointly learned 3D segmentation, achieving perfect restoration of occluded objects remains challenging. Such 3D mesh generation methods often result in unrealistic interactions with graphic objects, limiting their applicability. To address this, this paper introduces an amodal depth estimation approach to enhance the completeness of 3D reconstruction. By utilizing amodal masks that recover occluded regions, the method predicts the depths of obscured areas. An iterative amodal depth estimation framework allows this approach to work even on scenes containing deep occlusions. Incorporating a spatially-adaptive normalization (SPADE) fusion block within the amodal depth estimation model effectively combines amodal mask features and image features to improve the accuracy of depth estimation for occluded regions. The proposed system outperforms conventional depth inpainting networks on occluded-region depth estimation tasks. Unlike models that explicitly rely on multiple RGB or depth images to handle occlusion, the proposed model implicitly extracts amodal depth information from a single image. Consequently, it significantly enhances the quality of 3D reconstruction even when single images serve as input. The code and data used in the paper are available at https://github.com/Seonguke/Occlusion-aware-Amodal-Depth-Estimation-for-Enhancing-3D-Reconstruction-from-a-Single-Image/ for further research and feedback.
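The SPADE fusion block mentioned in the abstract follows the spatially-adaptive normalization idea: a feature map is normalized without learned affine parameters, then re-scaled and re-shifted per pixel using scale/shift maps predicted from a conditioning input (here, amodal mask features). The NumPy sketch below illustrates only that generic mechanism, not the paper's actual network; the function name `spade_modulate` and the 1x1-convolution-style projections `gamma_w`/`beta_w` are illustrative assumptions.

```python
import numpy as np

def spade_modulate(features, mask_features, gamma_w, beta_w, eps=1e-5):
    """Minimal SPADE-style modulation (illustrative sketch).

    features:      (C, H, W) image feature map
    mask_features: (M, H, W) features derived from an amodal mask
    gamma_w, beta_w: (C, M) per-channel projections standing in for
                     the small convolutions a real SPADE block learns
    """
    # Parameter-free normalization over the spatial dims, per channel.
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normed = (features - mean) / (std + eps)

    # Spatially-varying scale and shift predicted from the mask features,
    # so the modulation differs at every pixel (unlike plain batch norm).
    gamma = np.einsum('cm,mhw->chw', gamma_w, mask_features)
    beta = np.einsum('cm,mhw->chw', beta_w, mask_features)

    # Modulate: mask-derived cues steer the depth features per location.
    return normed * (1 + gamma) + beta

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))       # toy image features
mask_feat = rng.standard_normal((3, 4, 4))  # toy amodal-mask features
out = spade_modulate(feat, mask_feat,
                     rng.standard_normal((8, 3)) * 0.1,
                     rng.standard_normal((8, 3)) * 0.1)
print(out.shape)
```

The key design point is that the normalization's scale and shift depend on the spatial layout of the conditioning signal, which lets occlusion information influence each pixel of the depth features independently.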
Pages: 106524-106536
Page count: 13