StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction

Cited by: 246
Authors
Khamis, Sameh [1]
Fanello, Sean [1]
Rhemann, Christoph [1]
Kowdle, Adarsh [1]
Valentin, Julien [1]
Izadi, Shahram [1]
Affiliations
[1] Google Inc, Mountain View, CA 94043 USA
Source
COMPUTER VISION - ECCV 2018, PT 15 | 2018 / Vol. 11219
Keywords
Stereo matching; Depth estimation; Edge-aware refinement; Cost volume filtering; Deep learning; Belief propagation
DOI
10.1007/978-3-030-01267-0_35
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents StereoNet, the first end-to-end deep architecture for real-time stereo matching that runs at 60 fps on an NVIDIA Titan X, producing high-quality, edge-preserved, quantization-free disparity maps. A key insight of this paper is that the network achieves a sub-pixel matching precision that is an order of magnitude higher than that of traditional stereo matching approaches. This allows us to achieve real-time performance by using a very low resolution cost volume that encodes all the information needed to achieve high disparity precision. Spatial precision is achieved by employing a learned edge-aware upsampling function. Our model uses a Siamese network to extract features from the left and right images. A first estimate of the disparity is computed in a very low resolution cost volume; the model then hierarchically re-introduces high-frequency details through a learned upsampling function that uses compact pixel-to-pixel refinement networks. Leveraging color input as a guide, this function is capable of producing high-quality edge-aware output. We achieve compelling results on multiple benchmarks, showing how the proposed method offers extreme flexibility at an acceptable computational budget.
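The idea of reading a sub-pixel disparity out of a coarse cost volume can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the matching cost or the regression step, so the L1 feature distance, the soft arg-min readout, and the helper names `cost_volume` and `soft_argmin` are all assumptions made for illustration. Random arrays stand in for the learned Siamese features.

```python
import numpy as np

def cost_volume(left_feat, right_feat, max_disp):
    """Build an (H, W, max_disp) volume of L1 feature distances.

    left_feat, right_feat: (H, W, C) feature maps (assumed already
    extracted at low resolution by a shared-weight network).
    """
    h, w, _ = left_feat.shape
    # Large cost where the shifted right pixel falls outside the image.
    cost = np.full((h, w, max_disp), 1e3, dtype=np.float64)
    for d in range(max_disp):
        # Left pixel x is compared with right pixel x - d.
        diff = np.abs(left_feat[:, d:, :] - right_feat[:, : w - d, :])
        cost[:, d:, d] = diff.sum(axis=-1)
    return cost

def soft_argmin(cost):
    """Softmax-weighted mean of disparity candidates (assumed readout)."""
    logits = -cost
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    disps = np.arange(cost.shape[-1])
    return (weights * disps).sum(axis=-1)  # continuous-valued disparity

# Toy data: the left features are the right features shifted by 3 pixels.
rng = np.random.default_rng(0)
right = rng.normal(scale=10.0, size=(4, 16, 8))
left = np.roll(right, 3, axis=1)
disparity = soft_argmin(cost_volume(left, right, max_disp=8))
```

Because the readout is a weighted average rather than a hard minimum, the result is continuous-valued, which is one way a coarse volume can still yield fractional disparities. The edge-aware upsampling stage described in the abstract is not covered by this sketch.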
Pages: 596-613
Page count: 18