Multi-View Stereo with Learnable Cost Metric

Cited by: 1
Authors
Yang, Guidong [1]
Zhou, Xunkuai [1,2]
Gao, Chuanxiang [1]
Zhao, Benyun [1]
Zhang, Jihan [1]
Chen, Yizhou [1]
Chen, Xi [1]
Chen, Ben M. [1]
Affiliations
[1] Chinese Univ Hong Kong, Dept Mech & Automat Engn, Shatin, Hong Kong, Peoples R China
[2] Tongji Univ, Sch Elect & Informat Engn, Shanghai, Peoples R China
Source
2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, IROS | 2023
Keywords
depth estimation; cost volume aggregation; multi-view stereo; 3D reconstruction; UAV;
DOI
10.1109/IROS55552.2023.10341606
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present LCM-MVSNet, a novel multi-view stereo (MVS) network with a learnable cost metric (LCM) for more accurate and complete depth estimation and dense point cloud reconstruction. To adapt to scene variation and improve reconstruction quality in non-Lambertian, low-textured scenes, we propose LCM to adaptively aggregate multi-view matching similarity into the 3D cost volume by leveraging sparse point hints. The proposed LCM benefits MVS approaches in four ways: depth estimation enhancement, reconstruction quality improvement, memory footprint reduction, and computational burden alleviation. Together these allow depth inference on high-resolution images, yielding more accurate and complete reconstruction. Moreover, we improve depth estimation by enhancing the propagation of shallow features via a bottom-up path, and we strengthen end-to-end supervision by adapting the focal loss to reduce the ambiguity caused by sample imbalance. Extensive experiments on two benchmark datasets show that our network achieves state-of-the-art performance on the DTU dataset and exhibits strong generalization ability, with competitive performance on the Tanks and Temples benchmark. Furthermore, we deploy LCM-MVSNet in a real-world application for large-scale 3D reconstruction based on multi-view aerial images collected by a self-developed UAV, demonstrating the robustness and scalability of our method. More detailed results are available in the Appendix.
Pages: 3017-3024
Number of pages: 8