Semisupervised learning-based depth estimation with semantic inference guidance

Cited by: 8
Authors
Zhang Yan [1]
Fan XiaoPeng [1]
Zhao DeBin [1]
Affiliations
[1] Harbin Inst Technol, Dept Comp Sci & Technol, Harbin 150001, Peoples R China
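Funding
National High Technology Research and Development Program of China (863 Program); National Natural Science Foundation of China
Keywords
depth estimation; semisupervised learning; semantic information; neural networks
DOI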
10.1007/s11431-021-1948-3
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Depth estimation is a fundamental computer vision problem that infers the three-dimensional (3D) structure of a given scene. Because the problem is ill-posed, traditional methods generally require large amounts of annotated data to fit the projection function from the scene to its 3D structure. Such pixel-level annotation is labor-intensive, especially for reflective surfaces such as mirrors or water. The widespread application of deep learning further intensifies the demand for annotated data, so a framework that reduces the amount of annotated data required is needed. In this paper, we propose a novel semisupervised learning framework to infer the 3D structure of a given scene. First, semantic information is employed to make the depth inference more accurate. Second, we design both the depth estimation and the semantic segmentation as coarse-to-fine frameworks, so that the depth estimation can be gradually guided by the semantic segmentation. We compare our model with state-of-the-art methods. The experimental results show that our method outperforms many supervised learning-based methods, which demonstrates its effectiveness.
Pages: 1098-1106
Number of pages: 9