Context-Aware Multi-view Stereo Network for Efficient Edge-Preserving Depth Estimation

Cited by: 0
Authors
Su, Wanjuan [1]
Tao, Wenbing [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Natl Key Lab Sci & Technol Multispectral Informat, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-view stereo; Depth estimation; Depth refinement; 3D dense reconstruction; Correspondence matching;
DOI
10.1007/s11263-024-02337-8
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Learning-based multi-view stereo methods have made great progress in recent years by employing the coarse-to-fine depth estimation framework. However, existing methods still have difficulty recovering depth in featureless areas, at object boundaries, and on thin structures. This is mainly due to the poor distinguishability of matching cues in low-textured regions, the inherently smoothing behavior of the 3D convolutional neural networks used for cost volume regularization, and the information loss in the coarsest-scale features. To address these issues, we propose a Context-Aware multi-view stereo Network (CANet) that leverages contextual cues in images to achieve efficient edge-preserving depth estimation. The introduced self-similarity attended cost aggregation module exploits the structural self-similarity information in the reference view to model long-range dependencies in the cost volume, which boosts the matchability of featureless regions. The context information in the reference view is then used to progressively refine the multi-scale depth estimates through the proposed hierarchical edge-preserving residual learning module, yielding fine-grained depth estimation at edges. To enrich the features at the coarsest scale by making them focus more on fine-detail areas, a focal selection module is presented, which enhances the recovery of the initial depth with finer details such as thin structures. By integrating the strategies above into a well-designed lightweight cascade framework, CANet achieves a superior trade-off between performance and efficiency. Extensive experiments show that the proposed method achieves state-of-the-art performance with fast inference speed and low memory usage. Notably, CANet ranks first on the challenging Tanks and Temples advanced dataset and the ETH3D high-res benchmark among all published learning-based methods.
Pages: 25
Related papers
50 in total
  • [1] Efficient Edge-Preserving Multi-View Stereo Network for Depth Estimation
    Su, Wanjuan
    Tao, Wenbing
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 2, 2023, : 2348 - 2356
  • [2] TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers
    Ding, Yikang
    Yuan, Wentao
    Zhu, Qingtian
    Zhang, Haotian
    Liu, Xiangyue
    Wang, Yuanjiang
    Liu, Xiao
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 8575 - 8584
  • [3] Edge-Aware Spatial Propagation Network for Multi-view Depth Estimation
    Siyuan Xu
    Qingshan Xu
    Wanjuan Su
    Wenbing Tao
    Neural Processing Letters, 2023, 55 : 10905 - 10923
  • [4] Edge-Aware Spatial Propagation Network for Multi-view Depth Estimation
    Xu, Siyuan
    Xu, Qingshan
    Su, Wanjuan
    Tao, Wenbing
    NEURAL PROCESSING LETTERS, 2023, 55 (08) : 10905 - 10923
  • [5] Uncertainty Guided Multi-View Stereo Network for Depth Estimation
    Su, Wanjuan
    Xu, Qingshan
    Tao, Wenbing
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (11) : 7796 - 7808
  • [6] Continuous Depth Estimation for Multi-view Stereo
    Liu, Yebin
    Cao, Xun
    Dai, Qionghai
    Xu, Wenli
    CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4, 2009, : 2121 - 2128
  • [7] Efficient Edge-Preserving Stereo Matching
    Cigla, Cevahir
    Alatan, A. Aydin
    2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCV WORKSHOPS), 2011,
  • [8] Multi-view learning for context-aware extractive summarization
    Yang, Zhenyu
    Yang, Jie
    Yecies, Brian
    Li, Wanqing
    2020 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2020, : 1762 - 1769
  • [9] Context-Aware Multi-View Summarization Network for Image-Text Matching
    Qu, Leigang
    Liu, Meng
    Cao, Da
    Nie, Liqiang
    Tian, Qi
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 1047 - 1055
  • [10] Unsupervised multi-view stereo network based on multi-stage depth estimation
    Qi, Shuai
    Sang, Xinzhu
    Yan, Binbin
    Wang, Peng
    Chen, Duo
    Wang, Huachun
    Ye, Xiaoqian
    IMAGE AND VISION COMPUTING, 2022, 122