ARAI-MVSNet: A multi-view stereo depth estimation network with adaptive depth range and depth interval

Cited by: 3

Authors:

Zhang, Song [1,2,3]

Xu, Wenjia [4]

Wei, Zhiwei [1,2]

Zhang, Lili [1,2]

Wang, Yang [1,2]

Liu, Junyi [1,2]

Affiliations:

[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100190, Peoples R China

[2] Chinese Acad Sci, Inst Elect, Key Lab Network Informat Syst Technol NIST, Beijing 100190, Peoples R China

[3] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 100190, Peoples R China

[4] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China

Source:

PATTERN RECOGNITION | 2023 / Vol. 144

Keywords:

Multi-view stereo; Depth estimation; Adaptive range; Adaptive interval;

DOI:

10.1016/j.patcog.2023.109885

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Multi-View Stereo (MVS) is a fundamental problem in geometric computer vision that aims to reconstruct a scene from multi-view images with known camera parameters. However, mainstream approaches represent the scene with a fixed all-pixel depth range and an equal depth interval partition, which results in inadequate utilization of depth planes and imprecise depth estimation. In this paper, we present a novel multi-stage coarse-to-fine framework that achieves an adaptive all-pixel depth range and adaptive depth intervals. We predict a coarse depth map in the first stage. In the second stage, an Adaptive Depth Range Prediction module zooms in on the scene by leveraging the reference image and the depth map obtained in the first stage, and predicts a more accurate all-pixel depth range for the following stages. In the third and fourth stages, we propose an Adaptive Depth Interval Adjustment module that achieves an adaptive variable-interval partition of the pixel-wise depth range. The depth interval distribution in this module is normalized by Z-score, which allocates dense depth hypothesis planes around the potential ground-truth depth value and sparse planes elsewhere, yielding more accurate depth estimation. Extensive experiments on four widely used benchmark datasets (DTU, TnT, BlendedMVS, ETH 3D) demonstrate that our model achieves state-of-the-art performance and competitive generalization ability. In particular, our method achieves the highest Acc and Overall on the DTU dataset, while attaining the highest Recall and F1-score on the Tanks and Temples intermediate and advanced datasets. Moreover, our method also achieves the lowest e1 and e3 on the BlendedMVS dataset and the highest Acc and F1-score on the ETH 3D dataset, surpassing all listed methods. Project website: https://github.com/zs670980918/ARAI-MVSNet
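The contrast the abstract draws between an equal-interval partition and a Z-score-normalized variable-interval partition can be illustrated with a small sketch. This is not the authors' Adaptive Depth Interval Adjustment module, and the abstract does not give its formula. The function names, the per-bin confidence values `probs`, the depth range, and the `exp(-z)` mapping from Z-scores to interval widths are all assumptions chosen only to show how denser hypothesis planes might be placed where the true depth is more likely.

```python
import math
from statistics import mean, pstdev


def uniform_planes(d_min, d_max, n):
    """Equal-interval depth hypotheses (the fixed partition the abstract criticizes)."""
    step = (d_max - d_min) / (n - 1)
    return [d_min + i * step for i in range(n)]


def adaptive_planes(d_min, d_max, probs):
    """Variable-interval hypotheses: likelier bins receive narrower intervals.

    probs[i] is a hypothetical per-interval confidence from an earlier stage,
    so len(probs) + 1 planes are returned.
    """
    mu = mean(probs)
    sigma = pstdev(probs)
    # Z-score normalization of the per-interval confidences.
    if sigma == 0:
        z = [0.0] * len(probs)
    else:
        z = [(p - mu) / sigma for p in probs]
    # Assumption: exp(-z) turns a Z-score into a positive relative width, so an
    # interval that is more likely than average (z > 0) becomes shorter.
    raw = [math.exp(-zi) for zi in z]
    scale = (d_max - d_min) / sum(raw)
    planes = [d_min]
    for r in raw:
        planes.append(planes[-1] + r * scale)
    planes[-1] = d_max  # guard against floating-point drift
    return planes


# Illustrative values only; not taken from the paper.
probs = [0.05, 0.10, 0.60, 0.20, 0.05]
planes = adaptive_planes(400.0, 900.0, probs)
widths = [b - a for a, b in zip(planes, planes[1:])]
print(uniform_planes(400.0, 900.0, 6))
print([round(p, 1) for p in planes])
```

With these inputs, the third interval (confidence 0.60) ends up the narrowest, so the hypothesis planes cluster around it while the total range stays the same as in the uniform case.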


Pages: 10

