Depth Estimation From a Single RGB Image Using Fine-Tuned Generative Adversarial Network

Cited by: 10

Authors:

Ul Islam, Naeem [1]

Park, Jaebyung [1,2]

Affiliations:

[1] Jeonbuk Natl Univ, Core Res Inst Intelligent Robots, Jeonju 54896, South Korea

[2] Jeonbuk Natl Univ, Div Elect & Informat Engn, Jeonju 54896, South Korea

Source:

IEEE ACCESS | 2021 / Volume 9 / Issue 09

Funding:

National Research Foundation, Singapore;

Keywords:

Estimation; Generators; Training; Shape; Robots; Generative adversarial networks; Three-dimensional displays; Generative adversarial network; convolutional neural network; image translation; auto-encoders; TRANSLATION;

DOI:

10.1109/ACCESS.2021.3060435

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812 ;

Abstract:

Estimating the depth map from a single RGB image is important to understand the nature of the terrain in robot navigation and has attracted considerable attention in the past decade. The existing approaches can accurately estimate the depth from a single RGB image, considering a highly structured environment. The problem becomes more challenging when the terrain is highly dynamic. We propose a fine-tuned generative adversarial network to estimate the depth map effectively for a given single RGB image. The proposed network is composed of a fine-tuned generator and a global discriminator. The encoder part of the generator takes input RGB images and depth maps and generates their joint distribution in the latent space. Subsequently, the decoder part of the generator decodes the depth map from the joint distribution. The discriminator takes real and fake pairs in three different configurations and then guides the generator to estimate the depth map from the given RGB image accordingly. Finally, we conducted extensive experiments with a highly dynamic environment dataset for verifying the effectiveness and feasibility of the proposed approach. The proposed approach could decode the depth map from the joint distribution more effectively and accurately than the existing approaches.

Pages: 32781-32794

Page count: 14
