UANet: uncertainty-aware cost volume aggregation-based multi-view stereo for 3D reconstruction

Cited by: 1
Authors
Lu, Ping [1]
Cai, Youcheng [2]
Yang, Jiale [3]
Wang, Dong [4]
Wu, Tingting [5]
Affiliations
[1] State Key Lab Mobile Network & Mobile Multimedia T, Shenzhen, Peoples R China
[2] Univ Sci & Technol China, Hefei, Peoples R China
[3] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[4] Anhui Jianzhu Univ, Hefei, Peoples R China
[5] Anhui Agr Univ, Hefei, Peoples R China
Keywords
Multi-view stereo; Uncertainty; Group-wise correlation; Cost volume aggregation; NETWORK;
DOI
10.1007/s00371-024-03678-8
Chinese Library Classification (CLC) number
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
Multi-view stereo (MVS), which aims to reconstruct a 3D point cloud model from multi-view images, plays a vital role in 3D reconstruction. Recently, learning-based MVS methods have demonstrated excellent performance compared with traditional MVS methods. Almost all current learning-based MVS methods focus on improving the accuracy and completeness of the reconstruction results. However, scalability remains a major limitation because of memory constraints. In this paper, a cascaded network with uncertainty-aware cost volume aggregation, named UANet, is proposed for efficient and effective dense 3D reconstruction. In particular, we present a novel uncertainty-aware cost volume aggregation approach that takes pair-wise uncertainty maps as guidance to adaptively aggregate cost volumes. Instead of applying 3D convolutional neural networks (CNNs), we compute the uncertainty maps from feature differences with a shallow 2D CNN, which keeps the approach both efficient and effective. Moreover, we adopt a coarse-to-fine strategy and use group-wise correlation to construct lightweight cost volumes, thus reducing memory consumption and enabling high-resolution depth map inference. Finally, an uncertainty loss is designed to construct the uncertainty map, which further boosts performance. Experimental results show that UANet outperforms previous state-of-the-art methods on three benchmarks: the DTU dataset, the Tanks and Temples dataset, and the BlendedMVS dataset. In addition, the runtime and memory requirements validate the effectiveness of UANet.
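The two building blocks named in the abstract, group-wise correlation and uncertainty-guided aggregation of pair-wise costs, can be illustrated with a minimal sketch for a single pixel at a single depth hypothesis. This is not the authors' implementation: the function names are hypothetical, and the exp(-uncertainty) weighting is an assumption chosen only to show how a higher uncertainty could reduce a view's contribution. The abstract does not specify the actual weighting or the network that predicts the uncertainty.

```python
import math


def groupwise_correlation(ref_feat, src_feat, groups):
    """Split C channels into `groups` groups and return one similarity per group.

    Each similarity is the inner product of the reference and (warped) source
    features within the group, divided by the number of channels per group.
    """
    channels = len(ref_feat)
    if channels != len(src_feat) or channels % groups != 0:
        raise ValueError("feature lengths must match and be divisible by groups")
    size = channels // groups
    return [
        sum(r * s for r, s in zip(ref_feat[g * size:(g + 1) * size],
                                  src_feat[g * size:(g + 1) * size])) / size
        for g in range(groups)
    ]


def aggregate_costs(pair_costs, uncertainties):
    """Weighted average of per-view costs.

    ASSUMPTION: weights are exp(-uncertainty), normalized over views, so that
    uncertain views contribute less. The paper's real weighting may differ.
    """
    weights = [math.exp(-u) for u in uncertainties]
    total = sum(weights)
    n_groups = len(pair_costs[0])
    return [
        sum(w * cost[g] for w, cost in zip(weights, pair_costs)) / total
        for g in range(n_groups)
    ]


# Usage: one reference feature, two source views, 4 channels in 2 groups.
ref = [1.0, 2.0, 3.0, 4.0]
sources = [[1.0, 1.0, 2.0, 2.0], [0.0, 1.0, 0.0, 1.0]]
costs = [groupwise_correlation(ref, s, groups=2) for s in sources]
fused = aggregate_costs(costs, uncertainties=[0.1, 2.0])
```

Reducing C channels to a few group similarities is what makes the cost volume lightweight: the volume stores one value per group rather than one per channel.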
Pages: 4567-4580
Page count: 14
Related papers
54 in total
  • [1] Large-Scale Data for Multiple-View Stereopsis
    Aanaes, Henrik
    Jensen, Rasmus Ramsbol
    Vogiatzis, George
    Tola, Engin
    Dahl, Anders Bjorholm
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2016, 120 (02) : 153 - 168
  • [2] Campbell NDF, 2008, LECT NOTES COMPUT SC, V5302, P766, DOI 10.1007/978-3-540-88682-2_58
  • [3] EI-MVSNet: Epipolar-Guided Multi-View Stereo Network With Interval-Aware Label
    Chang, Jiahao
    He, Jianfeng
    Zhang, Tianzhu
    Yu, Jiyang
    Wu, Feng
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 753 - 766
  • [4] Class-discriminative focal loss for extreme imbalanced multiclass object detection towards autonomous driving
    Chen, Guancheng
    Qin, Huabiao
    [J]. VISUAL COMPUTER, 2022, 38 (03) : 1051 - 1063
  • [5] MVSNet++: Learning Depth-Based Attention Pyramid Features for Multi-View Stereo
    Chen, Po-Heng
    Yang, Hsiao-Chien
    Chen, Kuan-Wen
    Chen, Yong-Sheng
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 7261 - 7273
  • [6] Chen R., 2022, Vis. Comput., P1
  • [7] Point-Based Multi-View Stereo Network
    Chen, Rui
    Han, Songfang
    Xu, Jing
    Su, Hao
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 1538 - 1547
  • [8] Deep Stereo using Adaptive Thin Volume Representation with Uncertainty Awareness
    Cheng, Shuo
    Xu, Zexiang
    Zhu, Shilin
    Li, Zhuwen
    Li, Li Erran
    Ramamoorthi, Ravi
    Su, Hao
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 2521 - 2531
  • [9] VRCAT: VR collision alarming technique for user safety
    Chung, SeungJeh
    Lee, TaeHun
    Jeong, BoRa
    Jeong, JongWook
    Kang, HyeongYeop
    [J]. VISUAL COMPUTER, 2023, 39 (07) : 3145 - 3159
  • [10] TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers
    Ding, Yikang
    Yuan, Wentao
    Zhu, Qingtian
    Zhang, Haotian
    Liu, Xiangyue
    Wang, Yuanjiang
    Liu, Xiao
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 8575 - 8584