Adaptive Patch Deformation for Textureless-Resilient Multi-View Stereo

Cited by: 22
Authors
Wang, Yuesong [1]
Zeng, Zhaojie [1]
Guan, Tao [1]
Yang, Wei [1]
Chen, Zhuo [1]
Liu, Wenkai [1]
Xu, Luoyuan [1]
Luo, Yawei [2]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China
[2] Zhejiang Univ, Sch Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023
Funding
National Key Research and Development Program;
Keywords
DOI
10.1109/CVPR52729.2023.00162
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, deep learning-based approaches have shown great strength in multi-view stereo because of their outstanding ability to extract robust visual features. However, to get satisfactory results in large-scale textureless regions, most learning-based methods need to build a cost volume and enormously increase the receptive field, which leads to prohibitive memory consumption. To be both memory-friendly and textureless-resilient, we transplant the spirit of deformable convolution from deep learning into the traditional PatchMatch-based method. Specifically, for each pixel with matching ambiguity (termed an unreliable pixel), we adaptively deform the patch centered on it to extend the receptive field until it covers enough correlative reliable pixels (those without matching ambiguity), which serve as anchors. When performing PatchMatch, constrained by the anchor pixels, the matching cost of an unreliable pixel is guaranteed to reach the global minimum at the correct depth, which significantly increases the robustness of multi-view stereo. To detect more anchor pixels and thus ensure better adaptive patch deformation, we propose to evaluate the matching ambiguity of a pixel by checking the convergence of its estimated depth as optimization proceeds. As a result, our method achieves state-of-the-art performance on ETH3D and Tanks and Temples while preserving low memory consumption.
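The two ideas in the abstract (flagging unreliable pixels by depth convergence, then widening the support region until it contains enough reliable anchors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the relative tolerance `rel_tol`, the square window, and the radius-doubling schedule are all assumptions made for illustration, and the sketch ignores the "correlative" criterion that the paper applies when selecting anchors.

```python
def reliable_mask(depth_history, rel_tol=0.01):
    """Mark pixels whose depth estimate has converged.

    depth_history: list of H x W depth maps (nested lists), one per
    optimization iteration. A pixel counts as reliable (assumption) when
    every earlier estimate stays within rel_tol of the final estimate.
    """
    final = depth_history[-1]
    height, width = len(final), len(final[0])
    mask = [[False] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Largest deviation of any iteration from the final depth.
            spread = max(abs(d[r][c] - final[r][c]) for d in depth_history)
            mask[r][c] = spread <= rel_tol * abs(final[r][c])
    return mask


def find_anchors(mask, row, col, min_anchors=4, max_radius=32):
    """Grow a square window around (row, col) until it holds enough
    reliable pixels, then return their coordinates as anchors.

    Doubling the radius is a hypothetical stand-in for the adaptive
    patch deformation described in the abstract.
    """
    height, width = len(mask), len(mask[0])
    radius = 1
    while radius <= max_radius:
        anchors = [
            (r, c)
            for r in range(max(0, row - radius), min(height, row + radius + 1))
            for c in range(max(0, col - radius), min(width, col + radius + 1))
            if mask[r][c]
        ]
        if len(anchors) >= min_anchors:
            return anchors
        radius *= 2
    return []  # no sufficient support found within max_radius
```

In this sketch, a pixel whose depth keeps oscillating between iterations is treated as ambiguous, and its matching would then be constrained by the returned anchor coordinates rather than by a fixed square patch.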
Pages: 1621-1630
Number of pages: 10
Related papers
47 in total
[1]   Large-Scale Data for Multiple-View Stereopsis [J].
Aanaes, Henrik ;
Jensen, Rasmus Ramsbol ;
Vogiatzis, George ;
Tola, Engin ;
Dahl, Anders Bjorholm .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2016, 120 (02) :153-168
[2]  
[Anonymous], J OPT NETW
[3]   PatchMatch Stereo - Stereo Matching with Slanted Support Windows [J].
Bleyer, Michael ;
Rhemann, Christoph ;
Rother, Carsten .
PROCEEDINGS OF THE BRITISH MACHINE VISION CONFERENCE 2011, 2011,
[4]   Deformable Convolutional Networks [J].
Dai, Jifeng ;
Qi, Haozhi ;
Xiong, Yuwen ;
Li, Yi ;
Zhang, Guodong ;
Hu, Han ;
Wei, Yichen .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :764-773
[5]   KD-MVS: Knowledge Distillation Based Self-supervised Learning for Multi-view Stereo [J].
Ding, Yikang ;
Zhu, Qingtian ;
Liu, Xiangyue ;
Yuan, Wentao ;
Zhang, Haotian ;
Zhang, Chi .
COMPUTER VISION, ECCV 2022, PT XXXI, 2022, 13691 :630-646
[6]   Accurate, Dense, and Robust Multiview Stereopsis [J].
Furukawa, Yasutaka ;
Ponce, Jean .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2010, 32 (08) :1362-1376
[7]   Massively Parallel Multiview Stereopsis by Surface Normal Diffusion [J].
Galliani, Silvano ;
Lasinger, Katrin ;
Schindler, Konrad .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :873-881
[8]   Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching [J].
Gu, Xiaodong ;
Fan, Zhiwen ;
Zhu, Siyu ;
Dai, Zuozhuo ;
Tan, Feitong ;
Tan, Ping .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, :2492-2501
[9]   Tanks and Temples: Benchmarking Large-Scale Scene Reconstruction [J].
Knapitsch, Arno ;
Park, Jaesik ;
Zhou, Qian-Yi ;
Koltun, Vladlen .
ACM TRANSACTIONS ON GRAPHICS, 2017, 36 (04)
[10]   Plane Completion and Filtering for Multi-View Stereo Reconstruction [J].
Kuhn, Andreas ;
Lin, Shan ;
Erdler, Oliver .
PATTERN RECOGNITION, DAGM GCPR 2019, 2019, 11824 :18-32