Adaptive Patch Deformation for Textureless-Resilient Multi-View Stereo

Cited: 22
Authors
Wang, Yuesong [1 ]
Zeng, Zhaojie [1 ]
Guan, Tao [1 ]
Yang, Wei [1 ]
Chen, Zhuo [1 ]
Liu, Wenkai [1 ]
Xu, Luoyuan [1 ]
Luo, Yawei [2 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China
[2] Zhejiang Univ, Sch Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023
Funding
National Key Research and Development Program of China;
Keywords
DOI
10.1109/CVPR52729.2023.00162
CLC Classification Code
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, deep learning-based approaches have shown great strength in multi-view stereo because of their outstanding ability to extract robust visual features. However, when dealing with large-scale textureless regions, most learning-based methods need to build a cost volume and enormously enlarge the receptive field to obtain a satisfactory result, which leads to prohibitive memory consumption. To be both memory-friendly and textureless-resilient, we innovatively transplant the spirit of deformable convolution from deep learning into the traditional PatchMatch-based method. Specifically, for each pixel with matching ambiguity (termed an unreliable pixel), we adaptively deform the patch centered on it to extend the receptive field until it covers enough correlative reliable pixels (those without matching ambiguity) that serve as anchors. When performing PatchMatch, constrained by these anchor pixels, the matching cost of an unreliable pixel is guaranteed to reach the global minimum at the correct depth, which significantly increases the robustness of multi-view stereo. To detect more anchor pixels and thus ensure better adaptive patch deformation, we propose to evaluate the matching ambiguity of a pixel by checking whether its estimated depth converges as the optimization proceeds. As a result, our method achieves state-of-the-art performance on ETH3D and Tanks and Temples while preserving low memory consumption.
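The abstract describes two mechanisms: flagging unreliable pixels by checking whether their estimated depths have converged across PatchMatch iterations, and growing the patch around each unreliable pixel until it covers enough reliable anchor pixels. The NumPy snippet below is a minimal illustrative sketch of those two ideas only; the function names (is_reliable, deform_patch), the convergence threshold, and the square-window expansion strategy are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def is_reliable(depth_history, rel_tol=0.01):
    """Mark pixels whose depth estimates have stopped changing over the
    last few PatchMatch iterations (hypothetical convergence test:
    peak-to-peak variation below rel_tol times the mean depth)."""
    # depth_history: array of shape (T, H, W), one depth map per iteration
    variation = depth_history.max(axis=0) - depth_history.min(axis=0)
    return variation < rel_tol * depth_history.mean(axis=0)

def deform_patch(center, reliable_mask, base_radius=5, max_radius=25, min_anchors=8):
    """Grow a square window around an unreliable pixel until it contains
    at least `min_anchors` reliable pixels; return their coordinates.
    The expansion strategy is illustrative, not the paper's deformation."""
    y, x = center
    h, w = reliable_mask.shape
    for r in range(base_radius, max_radius + 1, 2):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        x0, x1 = max(0, x - r), min(w, x + r + 1)
        ys, xs = np.nonzero(reliable_mask[y0:y1, x0:x1])
        if ys.size >= min_anchors:
            return np.stack([ys + y0, xs + x0], axis=1)
    return np.empty((0, 2), dtype=int)  # no anchors found within max_radius

# Toy usage: 4 iterations of 64x64 depth maps, query the pixel at (32, 32).
depth_history = np.random.rand(4, 64, 64)
anchors = deform_patch((32, 32), is_reliable(depth_history))
```

In the paper's setting, the anchors would then constrain the matching cost computed during PatchMatch propagation; the sketch stops at selecting them.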
Pages: 1621 - 1630
Page count: 10