Depth estimation of supervised monocular images based on semantic segmentation

Cited by: 14

Authors:

Wang, Qi [1]

Piao, Yan [1]

Affiliations:

[1] Changchun Univ Sci & Technol, Coll Elect & Informat Engn, Changchun 130022, Peoples R China

Source:

JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION | 2023 / Vol. 90

Keywords:

Monocular depth estimation; Semantic segmentation; Shared parameters; Multi-scale feature fusion;

DOI:

10.1016/j.jvcir.2023.103753

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

In recent years, depth estimation from images using Convolutional Neural Networks (CNNs) has been widely recognized in the fields of artificial intelligence, scene understanding and three-dimensional (3D) reconstruction. Fusing semantic segmentation information with depth estimation can further improve the quality of the acquired depth images. However, how to deeply combine image semantic information with image depth information, and how to use image edge information more accurately to improve the accuracy of the depth image, remain urgent open problems. To this end, we propose a novel depth estimation model based on semantic segmentation to estimate the depth of monocular images. First, a shared-parameter model for semantic segmentation and depth estimation is built, in which the semantic segmentation information guides depth acquisition in an auxiliary way. Then, a multi-scale feature fusion module fuses the feature information from different layers of the network, so that local and global feature information are used effectively to generate high-resolution feature maps, improving the quality of the depth image by optimizing the semantic segmentation model. The experimental results show that the model can fully extract and combine image feature information, which improves the quality of monocular depth estimation. Compared with other advanced models, our model has certain advantages.

References

Pages: 9

61 entries in total

[1] Bai L., 2022, J JILIN U ENG TECHNO, DOI 10.13229/j.cnki.jdxbgxb20220126

[2] Bian D., 2022, J CHENGDU TECHNOLOGI, P2095, DOI j.Cnki.51-1747/tn.2022.01.004

[3] BKP, 1970, HORN SHAP SHAD METH, DOI 10.1016/0734-189x(85)90010-6

[4] Optimal disparity estimation in natural stereo images [J].

Burge, Johannes; Geisler, Wilson S.

JOURNAL OF VISION, 2014, 14 (02) :1-18

[5] Scale-aware attention network for weakly supervised semantic segmentation [J].

Cao, Zhiyuan; Gao, Yufei; Zhang, Jiacai.

NEUROCOMPUTING, 2022, 492 :34-49

[6] Chen W, 2016, ADV NEUR IN, V29

[7] Eigen D, 2014, ADV NEUR IN, V27

[8] Deep Ordinal Regression Network for Monocular Depth Estimation [J].

Fu, Huan; Gong, Mingming; Wang, Chaohui; Batmanghelich, Kayhan; Tao, Dacheng.

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :2002-2011

[9] Monocular Depth Estimation with Affinity, Vertical Pooling, and Label Enhancement [J].

Gan, Yukang; Xu, Xiangyu; Sun, Wenxiu; Lin, Liang.

COMPUTER VISION - ECCV 2018, PT III, 2018, 11207 :232-247

[10] Geiger A, 2012, PROC CVPR IEEE, P3354, DOI 10.1109/CVPR.2012.6248074
