Monocular Depth Estimation Based on Dilated Convolutions and Feature Fusion

Cited by: 0
Authors
Li, Hang [1 ]
Liu, Shuai [1 ,2 ]
Wang, Bin [1 ]
Wu, Yuanhao [1 ]
Affiliations
[1] Chinese Acad Sci, Changchun Inst Opt Fine Mech & Phys, Dong Nanhu Rd 3888, Changchun 130033, Peoples R China
[2] Dalian Maritime Univ, Nav Coll, Linghai Rd 1, Dalian 116026, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2024, Vol. 14, Issue 13
Keywords
monocular depth estimation; convolutional neural networks; feature fusion; end-to-end
DOI
10.3390/app14135833
Chinese Library Classification (CLC)
O6 [Chemistry]
Discipline Classification Code
0703
Abstract
Depth estimation is a central research topic in computer vision. Existing methods based on LiDAR (Light Detection and Ranging) typically yield only sparse depth data and incur high hardware costs, while multi-view image-matching techniques require prior knowledge of the camera's intrinsic parameters and often suffer from depth inconsistency, loss of detail, and blurred edges. To address these challenges, this study proposes a monocular depth estimation approach based on an end-to-end convolutional neural network. Specifically, a DNET backbone is developed that incorporates dilated convolution and feature fusion mechanisms into the network architecture. By integrating semantic information from multiple receptive fields and levels, the model's feature extraction capacity is strengthened, improving its sensitivity to subtle depth variations in the image. In addition, a loss function optimization algorithm is introduced to counteract class imbalance and further improve predictive accuracy. Training and validation on the NYU Depth-v2 (New York University Depth Dataset Version 2) and KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) datasets show that the proposed approach outperforms competing algorithms across multiple evaluation metrics.
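To give a rough sense of the multi-receptive-field idea described in the abstract, the following PyTorch sketch fuses parallel dilated convolutions by concatenation and a 1x1 projection. This is not the authors' DNET implementation; the module name DilatedFusionBlock, the dilation rates, and all channel sizes are illustrative assumptions only.

# Minimal sketch of feature fusion across several receptive fields using
# dilated convolutions, in the spirit of the abstract. NOT the authors'
# DNET code: module name, dilation rates, and channel counts are assumptions.
import torch
import torch.nn as nn


class DilatedFusionBlock(nn.Module):
    """Run parallel 3x3 convolutions with different dilation rates and
    fuse the resulting feature maps via concatenation + 1x1 projection."""

    def __init__(self, in_ch: int, out_ch: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                # padding = dilation keeps the spatial size unchanged for a 3x3 kernel
                nn.Conv2d(in_ch, out_ch, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        # 1x1 convolution fuses the concatenated branch outputs back to out_ch
        self.fuse = nn.Conv2d(out_ch * len(dilations), out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [branch(x) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1))


if __name__ == "__main__":
    block = DilatedFusionBlock(in_ch=64, out_ch=64)
    x = torch.randn(1, 64, 120, 160)   # e.g. a downsampled indoor RGB feature map
    print(block(x).shape)              # torch.Size([1, 64, 120, 160])

Branches with larger dilation rates see wider context at the same resolution, so concatenating them mixes fine local detail with broader scene structure before the depth head; the paper's actual fusion strategy and loss-balancing scheme are not reproduced here.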
Pages: 19