Digging into the multi-scale structure for a more refined depth map and 3D reconstruction

Cited by: 0
Authors
Yinzhang Ding
Lu Lin
Lianghao Wang
Ming Zhang
Dongxiao Li
Affiliations
[1] Zhejiang University, Institution of Information Science and Electrical Engineering
[2] Zhejiang Provincial Key Laboratory of Information Processing, Communication and Networking
Source
Neural Computing and Applications | 2020 / Volume 32
Keywords
Depth estimation; Multi-scale; 3D reconstruction; Uncertainty estimation
DOI
Not available
Abstract
Extracting dense depth from a single image is an important yet challenging computer vision task. Compared with stereo depth estimation, sensing the depth of a scene from monocular images is far more difficult and ambiguous because epipolar geometry constraints cannot be exploited. Recent developments in deep learning have brought significant progress in monocular depth estimation. This paper explores the effects of multi-scale structures on the performance of monocular depth estimation and further obtains a more refined 3D reconstruction by using the predicted depth and its corresponding uncertainty. First, we explore three multi-scale architectures and compare their qualitative and quantitative results against state-of-the-art approaches. Second, to improve the robustness of the system and quantify the reliability of the predicted depth for subsequent 3D reconstruction, we estimate the uncertainty of noisy data by modeling that uncertainty in a new loss function. Last, the predicted depth map and its corresponding depth uncertainty are incorporated into a monocular reconstruction system. Experiments on monocular depth estimation are performed mainly on the widely used NYU Depth V2 dataset, on which the proposed method achieves state-of-the-art performance. For 3D reconstruction, the proposed framework produces smoother and denser models across various scenes.
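The abstract states that uncertainty is modeled through a new loss function but does not spell out its form. A common formulation for learning per-pixel depth uncertainty is the heteroscedastic regression loss, in which the network predicts a log-variance alongside each depth value; the sketch below is only illustrative of that general technique, and all names are hypothetical rather than taken from the paper.

```python
import numpy as np

def uncertainty_loss(pred_depth, log_var, gt_depth):
    """Depth regression loss attenuated by predicted uncertainty.

    The network outputs log(sigma^2) per pixel alongside depth.
    Pixels with high predicted variance are down-weighted in the
    data term, while the additive log-variance term penalizes
    inflating the uncertainty everywhere.
    """
    residual = (pred_depth - gt_depth) ** 2
    per_pixel = np.exp(-log_var) * residual + log_var
    return per_pixel.mean()
```

Pixels the model flags as unreliable (large `log_var`) can then be discounted or rejected during depth-map fusion, which is consistent with the paper's use of uncertainty to obtain smoother, denser reconstructions.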
Pages: 11217–11228
Page count: 11