Deep learning based multi-view dense matching with joint depth and surface normal estimation

Cited by: 0
Authors
Liu, Jin [1, 2]
Ji, Shunping [1 ]
Affiliations
[1] School of Remote Sensing and Information Engineering, Wuhan University, Wuhan
[2] School of Communication Engineering, Hangzhou Dianzi University, Hangzhou
Source
Cehui Xuebao/Acta Geodaetica et Cartographica Sinica | 2025 / Vol. 53 / No. 12
Funding
National Natural Science Foundation of China
Keywords
3D reconstruction; deep learning; depth estimation; multi-view dense matching; normal estimation;
DOI
10.11947/j.AGCS.2024.20230579
Abstract
In recent years, deep learning-based multi-view stereo matching methods have demonstrated significant potential in 3D reconstruction tasks. However, they still exhibit limitations in recovering the fine geometric details of scenes. In some traditional multi-view stereo matching methods, the surface normal often serves as a crucial geometric constraint that assists finer depth inference. Nevertheless, surface normal information, which encapsulates the geometric information of the scene, has not been fully utilized in modern learning-based methods. This paper introduces a deep learning-based joint depth and surface normal estimation method for multi-view dense matching and 3D scene reconstruction tasks. The proposed method employs a multi-stage pyramid structure to simultaneously infer depth and surface normals from multi-view images and to promote their joint optimization. It consists of a feature extraction module, a normal-assisted depth estimation module, a depth-assisted normal estimation module, and a depth-normal joint optimization module. Specifically, the depth estimation module constructs a geometry-aware cost volume by integrating surface normal information for fine depth estimation. The normal estimation module uses depth constraints to build a local cost volume for inferring fine-grained normal maps. The joint optimization module further enhances the geometric consistency between the depth and normal estimates. Experimental results on the WHU-OMVS dataset demonstrate that the proposed method performs well in both depth and surface normal estimation and outperforms existing methods. Furthermore, the 3D reconstruction results on two different datasets indicate that the proposed method effectively recovers the geometric structures of both local high-curvature areas and global planar regions, contributing to well-structured and high-quality 3D scene models. © 2025 SinoMaps Press. All rights reserved.
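The abstract does not spell out how depth and surface normals are tied together in the network. As a rough illustration of the geometric relationship that such a joint estimation can exploit, the sketch below derives a normal map from a depth map by finite differences and measures its angular disagreement with a second normal map. It is not the paper's method: the pinhole camera model, the intrinsics `fx`, `fy`, `cx`, `cy`, and all function names are assumptions made for this example.

```python
import numpy as np


def backproject(depth, fx, fy, cx, cy):
    """Lift a depth map (H, W) to camera-space points (H, W, 3).

    Assumes a pinhole camera; fx, fy, cx, cy are hypothetical intrinsics.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w, dtype=np.float64),
                       np.arange(h, dtype=np.float64))
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1)


def normals_from_depth(depth, fx, fy, cx, cy):
    """Estimate unit surface normals from a depth map.

    The normal at each pixel is the cross product of the finite-difference
    tangent vectors of the back-projected surface, oriented to face the
    camera (negative z in the camera frame).
    """
    pts = backproject(depth, fx, fy, cx, cy)
    d_du = np.gradient(pts, axis=1)  # tangent along image columns
    d_dv = np.gradient(pts, axis=0)  # tangent along image rows
    n = np.cross(d_du, d_dv)
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    n = n / np.maximum(norm, 1e-12)  # guard against degenerate tangents
    flip = n[..., 2] > 0
    n[flip] *= -1.0
    return n


def angular_error_deg(n_a, n_b):
    """Per-pixel angle in degrees between two unit normal maps."""
    cos = np.clip(np.sum(n_a * n_b, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))
```

For a constant-depth (fronto-parallel) map, `normals_from_depth` returns (0, 0, -1) at every pixel. Comparing its output with an independently predicted normal map through `angular_error_deg` gives one simple measure of depth-normal consistency; a learned pipeline could penalize that disagreement, although whether the paper does so in this form is not stated in the abstract.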
Pages: 2391-2403
Page count: 12