ARAI-MVSNet: A multi-view stereo depth estimation network with adaptive depth range and depth interval

Cited by: 3
Authors
Zhang, Song [1, 2, 3]
Xu, Wenjia [4]
Wei, Zhiwei [1, 2]
Zhang, Lili [1, 2]
Wang, Yang [1, 2]
Liu, Junyi [1, 2]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100190, Peoples R China
[2] Chinese Acad Sci, Inst Elect, Key Lab Network Informat Syst Technol NIST, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 100190, Peoples R China
[4] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
Keywords
Multi-view stereo; Depth estimation; Adaptive range; Adaptive interval;
DOI
10.1016/j.patcog.2023.109885
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-View Stereo (MVS) is a fundamental problem in geometric computer vision that aims to reconstruct a scene from multi-view images with known camera parameters. However, mainstream approaches represent the scene with a fixed all-pixel depth range and an equal depth interval partition, which results in inadequate utilization of depth planes and imprecise depth estimation. In this paper, we present a novel multi-stage coarse-to-fine framework that achieves an adaptive all-pixel depth range and adaptive depth intervals. We predict a coarse depth map in the first stage. In the second stage, we propose an Adaptive Depth Range Prediction module that zooms in on the scene by leveraging the reference image and the depth map obtained in the first stage, and predicts a more accurate all-pixel depth range for the following stages. In the third and fourth stages, we propose an Adaptive Depth Interval Adjustment module that achieves an adaptive variable interval partition of the pixel-wise depth range. The depth interval distribution in this module is normalized by Z-score, which allocates dense depth hypothesis planes around the potential ground truth depth value and sparse planes farther from it, yielding more accurate depth estimation. Extensive experiments on four widely used benchmark datasets (DTU, TnT, BlendedMVS, ETH 3D) demonstrate that our model achieves state-of-the-art performance and yields competitive generalization ability. In particular, our method achieves the highest Acc and Overall on the DTU dataset, while attaining the highest Recall and F1-score on the Tanks and Temples intermediate and advanced datasets. Moreover, our method achieves the lowest e1 and e3 on the BlendedMVS dataset and the highest Acc and F1-score on the ETH 3D dataset, surpassing all listed methods. Project website: https://github.com/zs670980918/ARAI-MVSNet
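The idea of Z-score-based interval allocation can be illustrated with a small sketch. This is not the authors' implementation: the function name `adaptive_depth_planes`, the weighting rule `1 + |z|`, and the use of a per-pixel probability distribution from a previous stage are all assumptions made for illustration. It only shows how normalizing by Z-score can place depth hypothesis planes densely near the expected depth and sparsely elsewhere, in contrast to an equal-interval partition.

```python
import math


def adaptive_depth_planes(d_min, d_max, depths, probs, num_planes):
    """Place num_planes hypotheses in [d_min, d_max] for one pixel.

    depths/probs: previous-stage hypotheses and their probabilities
    (hypothetical inputs). Intervals grow with the Z-score distance from
    the expected depth, so planes are dense near it and sparse far away.
    """
    total = sum(probs)
    probs = [p / total for p in probs]

    # Probability-weighted mean and standard deviation of the depth.
    mu = sum(p * d for p, d in zip(probs, depths))
    var = sum(p * (d - mu) ** 2 for p, d in zip(probs, depths))
    sigma = max(math.sqrt(var), 1e-6)  # guard against a degenerate distribution

    # Start from an equal-interval partition and look at interval midpoints.
    num_intervals = num_planes - 1
    step = (d_max - d_min) / num_intervals
    mids = [d_min + (k + 0.5) * step for k in range(num_intervals)]

    # Assumed weighting rule: interval width proportional to 1 + |z|.
    weights = [1.0 + abs((m - mu) / sigma) for m in mids]
    scale = (d_max - d_min) / sum(weights)

    planes = [d_min]
    for w in weights:
        planes.append(planes[-1] + w * scale)
    planes[-1] = d_max  # remove accumulated rounding error at the far end
    return planes


# Example: a distribution peaked at depth 3.0 within the range [1.0, 5.0].
planes = adaptive_depth_planes(
    1.0, 5.0,
    depths=[1.0, 2.0, 3.0, 4.0, 5.0],
    probs=[0.05, 0.1, 0.7, 0.1, 0.05],
    num_planes=9,
)
gaps = [b - a for a, b in zip(planes, planes[1:])]
```

In this example the gaps next to the expected depth are the smallest and the gaps at the ends of the range are the largest, while the total range covered stays the same as in the equal-interval case.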
Pages: 10