Practical Depth Estimation with Image Segmentation and Serial U-Nets

Cited by: 8
Authors
Cantrell, Kyle J. [1 ]
Miller, Craig D. [1 ]
Morato, Carlos W. [1 ]
Affiliations
[1] Worcester Polytechnic Institute, Robotics Engineering Department, 100 Institute Rd, Worcester, MA 01609, USA
Source
PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON VEHICLE TECHNOLOGY AND INTELLIGENT TRANSPORT SYSTEMS (VEHITS) | 2020
Keywords
Autonomous Vehicles; Depth Estimation; Ensemble Neural Networks; Intelligent Transport Systems; Semantic Segmentation; U-Net; Vehicle Perception; VSLAM;
DOI
10.5220/0009781804060414
Chinese Library Classification (CLC)
U [Transportation];
Subject Classification
08; 0823;
Abstract
Knowledge of environmental depth is required for successful autonomous vehicle navigation and VSLAM. Current autonomous vehicles utilize range-finding solutions such as LIDAR, RADAR, and SONAR that suffer drawbacks in both cost and accuracy. Vision-based systems offer the promise of cost-effective, accurate, and passive depth estimation to compete with existing sensor technologies. Existing research has shown that it is possible to estimate depth from 2D monocular vision cameras using convolutional neural networks. Recent advances suggest that depth estimation accuracy can be improved when networks used for supplementary tasks such as semantic segmentation are incorporated into the network architecture. A novel Serial U-Net (NU-Net) architecture is introduced as a modular ensembling technique for combining the learned features from N-many U-Nets into a single pixel-by-pixel output. Serial U-Nets are proposed to combine the benefits of semantic segmentation and transfer learning for improved depth estimation accuracy. The performance of Serial U-Net architectures is characterized by evaluation on the NYU Depth V2 benchmark dataset and by measuring depth inference times. Autonomous vehicle navigation can benefit substantially from leveraging the latest advances in depth estimation and deep learning.
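The abstract describes the core idea of the Serial U-Net: chaining U-Nets so that a later stage's pixel-wise prediction builds on an earlier stage's output (e.g., segmentation feeding depth). The abstract does not pin down the exact wiring, so the following is a minimal PyTorch sketch under the assumption of a two-stage chain in which the depth U-Net receives the original RGB image concatenated with the segmentation stage's logits; all names (MiniUNet, SerialUNet, base_ch, n_classes=13) are hypothetical illustrations, not the authors' implementation.

```python
# Hypothetical sketch of the Serial U-Net idea: U-Nets chained in series,
# each later stage receiving the previous stage's per-pixel output
# concatenated with the original RGB input. Not the authors' code.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions that preserve spatial resolution.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class MiniUNet(nn.Module):
    """One tiny encoder-decoder stage with a single skip connection."""
    def __init__(self, in_ch, out_ch, base_ch=32):
        super().__init__()
        self.enc = conv_block(in_ch, base_ch)
        self.down = nn.MaxPool2d(2)
        self.mid = conv_block(base_ch, base_ch * 2)
        self.up = nn.ConvTranspose2d(base_ch * 2, base_ch, 2, stride=2)
        self.dec = conv_block(base_ch * 2, base_ch)  # skip + upsampled features
        self.head = nn.Conv2d(base_ch, out_ch, 1)    # pixel-by-pixel output

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.down(e))
        d = self.dec(torch.cat([e, self.up(m)], dim=1))
        return self.head(d)

class SerialUNet(nn.Module):
    """Chains two U-Nets: the depth stage sees RGB plus segmentation logits."""
    def __init__(self, n_classes=13, base_ch=32):
        super().__init__()
        self.seg = MiniUNet(3, n_classes, base_ch)        # segmentation stage
        self.depth = MiniUNet(3 + n_classes, 1, base_ch)  # depth stage

    def forward(self, rgb):
        seg_logits = self.seg(rgb)
        depth = self.depth(torch.cat([rgb, seg_logits], dim=1))
        return depth, seg_logits

if __name__ == "__main__":
    model = SerialUNet()
    rgb = torch.randn(1, 3, 240, 320)  # NYU Depth V2-like resolution
    depth, seg = model(rgb)
    print(depth.shape, seg.shape)      # (1, 1, 240, 320), (1, 13, 240, 320)
```

The serial wiring reflects the intuition stated in the abstract: segmentation boundaries often coincide with depth discontinuities, so feeding segmentation features forward gives the depth stage structural cues it would otherwise have to relearn, and each stage in the chain can be pretrained independently for transfer learning.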
Pages: 406-414
Number of pages: 9