Occlusion-Aware Amodal Depth Estimation for Enhancing 3D Reconstruction From a Single Image

Cited by: 0
Authors
Jo, Seong-Uk [1]
Lee, Du Yeol [2]
Rhee, Chae Eun [2]
Affiliations
[1] Inha Univ, Dept Elect & Comp Engn, Incheon 22212, South Korea
[2] Hanyang Univ, Dept Elect Engn, Seoul 04763, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Three-dimensional displays; Estimation; Image reconstruction; Transformers; Image restoration; Decoding; Solid modeling; Occlusion; amodal segmentation; depth estimation; 3D-reconstruction;
DOI
10.1109/ACCESS.2024.3436570
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
In fields such as robot navigation, autonomous driving, and augmented reality, demand is growing for reconstructing three-dimensional (3D) scenes from two-dimensional (2D) images captured by a camera. With advancements in deep learning, monocular depth prediction research has gained momentum, leading to the exploration of 3D reconstruction from a single image. While previous studies have attempted to restore occluded regions by training deep networks on high-resolution 3D data or with jointly learned 3D segmentation, achieving perfect restoration of occluded objects remains challenging. Such 3D mesh generation methods often result in unrealistic interactions with graphic objects, limiting their applicability. To address this, this paper introduces an amodal depth estimation approach to enhance the completeness of 3D reconstruction. By utilizing amodal masks that recover occluded regions, the method predicts the depths of obscured areas. An iterative amodal depth estimation framework allows the approach to work even with scenes containing deep occlusions. A spatially-adaptive normalization (SPADE) fusion block within the amodal depth estimation model combines amodal mask features and image features to improve the accuracy of depth estimation for occluded regions. The proposed system outperforms conventional depth inpainting networks on occluded-region depth estimation tasks. Unlike models that explicitly rely on multiple RGB or depth images to handle instances of occlusion, the proposed model implicitly extracts amodal depth information from a single image. Consequently, it significantly enhances the quality of 3D reconstruction even when a single image serves as input. The code and data used in the paper are available at https://github.com/Seonguke/Occlusion-aware-Amodal-Depth-Estimation-for-Enhancing-3D-Reconstruction-from-a-Single-Image/ for further research and feedback.
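The abstract mentions a SPADE fusion block that combines amodal mask features with image features. The following is a minimal illustrative sketch of SPADE-style modulation in general, not the authors' implementation: image features are normalized per channel and then scaled and shifted at each pixel by parameters derived from the mask. The function names, the per-channel weights, and the use of a per-pixel linear map in place of learned convolutions are all assumptions made for illustration.

```python
import math


def instance_norm(channel, eps=1e-5):
    """Normalize one H x W channel to zero mean and unit variance (no learned parameters)."""
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in row] for row in channel]


def spade_modulate(features, mask, gamma_w, gamma_b, beta_w, beta_b):
    """SPADE-style fusion sketch (hypothetical helper, not from the paper's code).

    features: list of C channels, each an H x W list of floats (image features)
    mask:     H x W list of floats in [0, 1] (amodal mask)
    gamma_w, gamma_b, beta_w, beta_b: per-channel floats; assumed stand-ins
        for the learned convolutions that a real SPADE layer would use.
    """
    fused = []
    for c, channel in enumerate(features):
        normed = instance_norm(channel)
        out_channel = []
        for y, row in enumerate(normed):
            out_row = []
            for x, v in enumerate(row):
                # Spatially varying scale and shift computed from the mask value.
                gamma = gamma_w[c] * mask[y][x] + gamma_b[c]
                beta = beta_w[c] * mask[y][x] + beta_b[c]
                # The (1 + gamma) form leaves features unchanged when gamma is 0.
                out_row.append(v * (1.0 + gamma) + beta)
            out_channel.append(out_row)
        fused.append(out_channel)
    return fused


# Toy example: one channel, 2 x 2 image, mask covering the bottom row only.
features = [[[1.0, 2.0], [3.0, 4.0]]]
mask = [[0.0, 0.0], [1.0, 1.0]]
normed = instance_norm(features[0])
fused = spade_modulate(features, mask, [1.0], [0.0], [0.5], [0.0])
```

In this toy example, pixels outside the mask keep their normalized values, while pixels inside the mask are rescaled and shifted, which is the sense in which the modulation is spatially adaptive.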
Pages: 106524-106536
Number of pages: 13