Visibility-Aware Point-Based Multi-View Stereo Network

Cited by: 38
Authors
Chen, Rui [1]
Han, Songfang [2]
Xu, Jing [1]
Su, Hao [3]
Affiliations
[1] Tsinghua Univ, State Key Lab Tribol, Beijing Key Lab Precis Ultraprecis Mfg Equipment, Dept Mech Engn, Beijing 100084, Peoples R China
[2] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[3] Univ Calif San Diego, Dept Comp Sci & Engn, San Diego, CA 92093 USA
Funding
National Natural Science Foundation of China;
Keywords
Three-dimensional displays; Image reconstruction; Geometry; Two dimensional displays; Task analysis; Aggregates; Surface reconstruction; Multi-view stereo; 3D deep learning; GRAPH-CUTS;
DOI
10.1109/TPAMI.2020.2988729
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce VA-Point-MVSNet, a novel visibility-aware point-based deep framework for multi-view stereo (MVS). Distinct from existing cost volume approaches, our method directly processes the target scene as point clouds. More specifically, our method predicts the depth in a coarse-to-fine manner. We first generate a coarse depth map, convert it into a point cloud, and refine the point cloud iteratively by estimating the residual between the depth of the current iteration and that of the ground truth. Our network leverages 3D geometry priors and 2D texture information jointly and effectively by fusing them into a feature-augmented point cloud, and processes the point cloud to estimate the 3D flow for each point. This point-based architecture allows higher accuracy, greater computational efficiency, and more flexibility than cost-volume-based counterparts. Furthermore, our visibility-aware multi-view feature aggregation allows the network to aggregate multi-view appearance cues while taking visibility into account. Experimental results show that our approach achieves a significant improvement in reconstruction quality compared with state-of-the-art methods on the DTU and Tanks and Temples datasets. The code of VA-Point-MVSNet proposed in this work will be released at https://github.com/callmeray/PointMVSNet.
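
The "depth map to point cloud, then refine by a depth residual" step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simple pinhole camera, the function names and the intrinsics (`fx`, `fy`, `cx`, `cy`) are hypothetical, and the residual is a fixed placeholder standing in for what the network would predict per point.

```python
def unproject_depth_map(depth, fx, fy, cx, cy):
    """Lift each pixel (u, v) with depth d to a 3D point in the camera frame.

    Assumes a pinhole camera; fx, fy, cx, cy are illustrative intrinsics.
    """
    points = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d <= 0:  # skip pixels without a valid depth
                continue
            x = (u - cx) * d / fx
            y = (v - cy) * d / fy
            points.append((x, y, d))
    return points


def refine_along_ray(point, residual):
    """Move a point along its viewing ray so that its depth changes by `residual`."""
    x, y, z = point
    scale = (z + residual) / z
    return (x * scale, y * scale, z + residual)


# Toy 2x2 coarse depth map; 0.0 marks an invalid pixel.
depth = [[2.0, 2.0],
         [0.0, 4.0]]
cloud = unproject_depth_map(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)

# Placeholder residual: in the described method this would be estimated
# per point by the network at each refinement iteration.
refined = [refine_along_ray(p, 0.5) for p in cloud]
```

Scaling all three coordinates by the same factor keeps each point on its original pixel ray, so only the depth estimate changes between iterations.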
Pages: 3695-3708
Page count: 14