Learnable Cost Metric-Based Multi-View Stereo for Point Cloud Reconstruction

Cited by: 6
Authors
Yang, Guidong [1 ]
Zhou, Xunkuai [1 ]
Gao, Chuanxiang [1 ]
Chen, Xi [1 ]
Chen, Ben M. [1 ]
Affiliations
[1] Chinese Univ Hong Kong, Dept Mech & Automat Engn, Shatin, Hong Kong, Peoples R China
Keywords
Defect inspection; depth estimation; diagnosis and monitoring; intelligent system; multi-view stereo (MVS); reconstruction; unmanned aerial vehicle (UAV)
DOI
10.1109/TIE.2023.3337697
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Discipline Code
0812
Abstract
3-D reconstruction is essential to defect localization. This article proposes LCM-MVSNet, a novel multi-view stereo (MVS) network with a learnable cost metric (LCM) for more accurate and complete dense point cloud reconstruction. To adapt to scene variation and improve reconstruction quality in non-Lambertian, low-textured scenes, we propose the LCM to adaptively aggregate multi-view matching similarity into the 3-D cost volume by leveraging sparse point hints. The proposed LCM benefits MVS approaches in four ways: it enhances depth estimation, improves reconstruction quality, reduces memory footprint, and alleviates computational burden, allowing depth inference on high-resolution images for more accurate and complete reconstruction. In addition, we improve depth estimation by enhancing shallow feature propagation via a bottom-up pathway, and we strengthen end-to-end supervision by adapting the focal loss to reduce ambiguity caused by sample imbalance. Extensive experiments on three benchmark datasets show that our method achieves state-of-the-art performance on the DTU and BlendedMVS datasets and exhibits strong generalization ability with competitive performance on the Tanks and Temples benchmark. Furthermore, we deploy LCM-MVSNet in our UAV-based infrastructure defect inspection framework for infrastructure reconstruction and defect localization, demonstrating the effectiveness and efficiency of our method. More experimental results can be found in the Appendix.
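The sketch below is a minimal illustration (assuming PyTorch; it is not the authors' released code) of the two ideas the abstract highlights: a learnable, per-view weighting that adaptively aggregates multi-view matching similarity into a single 3-D cost volume, and a focal loss over depth hypotheses to counter sample imbalance. The module name LearnableCostAggregation, the layer sizes, and the tensor shapes are illustrative assumptions; the sparse-point-hint mechanism described in the abstract is not modeled here.

```python
# Illustrative sketch only (assumed PyTorch implementation, not the authors' code).
import torch
import torch.nn as nn


class LearnableCostAggregation(nn.Module):
    """Aggregate per-source-view similarity volumes with learned, per-view weights."""

    def __init__(self, feat_channels: int = 8):
        super().__init__()
        # Small 3-D CNN that predicts one scalar weight per pixel, depth
        # hypothesis, and source view from the similarity volume itself.
        self.weight_net = nn.Sequential(
            nn.Conv3d(feat_channels, 4, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(4, 1, kernel_size=1),
        )

    def forward(self, ref_feat: torch.Tensor, warped_feats: torch.Tensor) -> torch.Tensor:
        # ref_feat:     (B, C, D, H, W)    reference features repeated over D depth hypotheses
        # warped_feats: (B, V, C, D, H, W) source-view features warped to each hypothesis
        B, V, C, D, H, W = warped_feats.shape
        similarity = ref_feat.unsqueeze(1) * warped_feats           # (B, V, C, D, H, W)
        weights = self.weight_net(similarity.flatten(0, 1))         # (B*V, 1, D, H, W)
        weights = weights.view(B, V, 1, D, H, W).softmax(dim=1)     # normalize over views
        # Weighted sum over source views -> one aggregated cost volume (B, C, D, H, W)
        return (weights * similarity).sum(dim=1)


def depth_focal_loss(prob_volume: torch.Tensor, gt_index: torch.Tensor,
                     gamma: float = 2.0) -> torch.Tensor:
    """Focal loss over D depth hypotheses.

    prob_volume: (B, D, H, W) softmax probabilities over depth hypotheses.
    gt_index:    (B, H, W) long tensor holding the ground-truth hypothesis index.
    """
    pt = prob_volume.gather(1, gt_index.unsqueeze(1)).squeeze(1).clamp_min(1e-6)
    return (-(1.0 - pt) ** gamma * pt.log()).mean()
```

In a pipeline of this kind, such an aggregation module would stand in for a fixed, variance-based cost metric, and the focal-loss term would replace or complement a plain cross-entropy over depth hypotheses; both choices here are assumptions consistent with the abstract rather than a description of the published architecture.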
Pages: 11519 - 11528
Number of pages: 10