A Deep Visual Correspondence Embedding Model for Stereo Matching Costs

Cited by: 142
Authors
Chen, Zhuoyuan [1]
Sun, Xun [1]
Wang, Liang [1]
Yu, Yinan [2]
Huang, Chang [2]
Affiliations
[1] Baidu Research, Institute of Deep Learning, Beijing, People's Republic of China
[2] Horizon Robotics, Beijing, People's Republic of China
Source
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) | 2015
Keywords
LOCAL STEREO
DOI
10.1109/ICCV.2015.117
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a data-driven matching cost for stereo matching. A novel deep visual correspondence embedding model is trained via Convolutional Neural Network on a large set of stereo images with ground truth disparities. This deep embedding model leverages appearance data to learn visual similarity relationships between corresponding image patches, and explicitly maps intensity values into an embedding feature space to measure pixel dissimilarities. Experimental results on KITTI and Middlebury data sets demonstrate the effectiveness of our model. First, we prove that the new measure of pixel dissimilarity outperforms traditional matching costs. Furthermore, when integrated with a global stereo framework, our method ranks top 3 among all two-frame algorithms on the KITTI benchmark. Finally, cross-validation results show that our model is able to make correct predictions for unseen data which are outside of its labeled training set.
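The idea of measuring pixel dissimilarity in an embedding space can be illustrated with a minimal sketch. It is not the authors' network: the `embed` function below is a hypothetical stand-in that uses a random, untrained linear projection in place of the learned CNN, and the cost definition (one minus cosine similarity), the patch radius `r`, and the synthetic image pair are all assumptions made for illustration.

```python
import numpy as np

def embed(patch, weights):
    """Hypothetical stand-in for the learned CNN: linear projection + L2 normalisation."""
    v = weights @ patch.ravel()
    return v / (np.linalg.norm(v) + 1e-12)

def matching_cost(left, right, y, x, d, weights, r=2):
    """Dissimilarity between the left patch at (y, x) and the right patch at (y, x - d)."""
    pl = left[y - r:y + r + 1, x - r:x + r + 1]
    pr = right[y - r:y + r + 1, x - d - r:x - d + r + 1]
    # Assumed cost: one minus the cosine similarity of the two embeddings.
    return 1.0 - float(embed(pl, weights) @ embed(pr, weights))

# Synthetic stereo pair with a known, constant disparity (illustrative data only).
rng = np.random.default_rng(0)
left = rng.random((20, 40))
true_d = 3
right = np.roll(left, -true_d, axis=1)   # right[y, x - true_d] == left[y, x]

weights = rng.standard_normal((16, 25))  # random projection, NOT trained weights

# Evaluate the cost over candidate disparities and pick the lowest (winner-takes-all).
costs = [matching_cost(left, right, 10, 20, d, weights) for d in range(8)]
best_d = int(np.argmin(costs))
```

In this toy setting the lowest cost falls at the true disparity because the two patches are identical there. In the paper's setting the embedding is learned from stereo pairs with ground-truth disparities, and the resulting costs are passed to a global stereo framework rather than a per-pixel minimum.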
Pages: 972-980
Page count: 9