Learnable Cost Metric-Based Multi-View Stereo for Point Cloud Reconstruction

Cited by: 6
Authors
Yang, Guidong [1 ]
Zhou, Xunkuai [1 ]
Gao, Chuanxiang [1 ]
Chen, Xi [1 ]
Chen, Ben M. [1 ]
Affiliations
[1] Chinese Univ Hong Kong, Dept Mech & Automat Engn, Shatin, Hong Kong, Peoples R China
Keywords
Defect inspection; depth estimation; diagnosis and monitoring; intelligent system; multi-view stereo (MVS); reconstruction; unmanned aerial vehicle (UAV)
DOI
10.1109/TIE.2023.3337697
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Discipline Code
0812
Abstract
3-D reconstruction is essential to defect localization. This article proposes LCM-MVSNet, a novel multi-view stereo (MVS) network with a learnable cost metric (LCM) for more accurate and complete dense point cloud reconstruction. To adapt to scene variation and improve reconstruction quality in non-Lambertian, low-textured scenes, we propose the LCM to adaptively aggregate multi-view matching similarity into the 3-D cost volume by leveraging sparse point hints. The proposed LCM benefits MVS approaches in four ways: it enhances depth estimation, improves reconstruction quality, reduces memory footprint, and alleviates computational burden, allowing depth inference on high-resolution images for more accurate and complete reconstruction. In addition, we improve depth estimation by enhancing shallow feature propagation via a bottom-up pathway, and we strengthen end-to-end supervision by adapting the focal loss to reduce ambiguity caused by sample imbalance. Extensive experiments on three benchmark datasets show that our method achieves state-of-the-art performance on the DTU and BlendedMVS datasets and exhibits strong generalization ability with competitive performance on the Tanks and Temples benchmark. Furthermore, we deploy LCM-MVSNet in our UAV-based infrastructure defect inspection framework for infrastructure reconstruction and defect localization, demonstrating the effectiveness and efficiency of our method. More experimental results can be found in the Appendix.
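The sketch below is a minimal illustration (assuming PyTorch; it is not the authors' released code) of the two ideas the abstract highlights: a learnable, per-view weighting that adaptively aggregates multi-view matching similarity into a single 3-D cost volume, and a focal loss over depth hypotheses to counter sample imbalance. The module name LearnableCostAggregation, the layer sizes, and the tensor shapes are illustrative assumptions; the sparse-point-hint mechanism described in the abstract is not modeled here.

```python
# Illustrative sketch only (assumed PyTorch implementation, not the authors' code).
import torch
import torch.nn as nn


class LearnableCostAggregation(nn.Module):
    """Aggregate per-source-view similarity volumes with learned, per-view weights."""

    def __init__(self, feat_channels: int = 8):
        super().__init__()
        # Small 3-D CNN that predicts one scalar weight per pixel, depth
        # hypothesis, and source view from the similarity volume itself.
        self.weight_net = nn.Sequential(
            nn.Conv3d(feat_channels, 4, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(4, 1, kernel_size=1),
        )

    def forward(self, ref_feat: torch.Tensor, warped_feats: torch.Tensor) -> torch.Tensor:
        # ref_feat:     (B, C, D, H, W)    reference features repeated over D depth hypotheses
        # warped_feats: (B, V, C, D, H, W) source-view features warped to each hypothesis
        B, V, C, D, H, W = warped_feats.shape
        similarity = ref_feat.unsqueeze(1) * warped_feats           # (B, V, C, D, H, W)
        weights = self.weight_net(similarity.flatten(0, 1))         # (B*V, 1, D, H, W)
        weights = weights.view(B, V, 1, D, H, W).softmax(dim=1)     # normalize over views
        # Weighted sum over source views -> one aggregated cost volume (B, C, D, H, W)
        return (weights * similarity).sum(dim=1)


def depth_focal_loss(prob_volume: torch.Tensor, gt_index: torch.Tensor,
                     gamma: float = 2.0) -> torch.Tensor:
    """Focal loss over D depth hypotheses.

    prob_volume: (B, D, H, W) softmax probabilities over depth hypotheses.
    gt_index:    (B, H, W) long tensor holding the ground-truth hypothesis index.
    """
    pt = prob_volume.gather(1, gt_index.unsqueeze(1)).squeeze(1).clamp_min(1e-6)
    return (-(1.0 - pt) ** gamma * pt.log()).mean()
```

In a pipeline of this kind, such an aggregation module would stand in for a fixed, variance-based cost metric, and the focal-loss term would replace or complement a plain cross-entropy over depth hypotheses; both choices here are assumptions consistent with the abstract rather than a description of the published architecture.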
Pages: 11519 - 11528
Number of pages: 10