Monocular Depth Estimation Based on Dilated Convolutions and Feature Fusion

Cited by: 0
Authors
Li, Hang [1 ]
Liu, Shuai [1 ,2 ]
Wang, Bin [1 ]
Wu, Yuanhao [1 ]
Affiliations
[1] Chinese Acad Sci, Changchun Inst Opt Fine Mech & Phys, Dong Nanhu Rd 3888, Changchun 130033, Peoples R China
[2] Dalian Maritime Univ, Nav Coll, Linghai Rd 1, Dalian 116026, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2024, Vol. 14, Issue 13
Keywords
monocular depth estimation; convolutional neural networks; feature fusion; end-to-end
DOI
10.3390/app14135833
Chinese Library Classification (CLC)
O6 [Chemistry]
Discipline Classification Code
0703
Abstract
Depth estimation is a central research topic in computer vision. Existing methods based on LiDAR (Light Detection and Ranging) typically yield only sparse depth data and incur high hardware costs, while multi-view image-matching techniques require prior knowledge of the camera's intrinsic parameters and often suffer from depth inconsistency, loss of detail, and blurred edges. To address these challenges, this study proposes a monocular depth estimation approach based on an end-to-end convolutional neural network. Specifically, a DNET backbone is developed that incorporates dilated convolution and feature fusion mechanisms into the network architecture. By integrating semantic information from multiple receptive fields and levels, the model's feature extraction capacity is strengthened, improving its sensitivity to subtle depth variations in the image. In addition, a loss function optimization algorithm is introduced to counteract class imbalance and further improve predictive accuracy. Training and validation on the NYU Depth-v2 (New York University Depth Dataset Version 2) and KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) datasets show that the proposed approach outperforms competing algorithms across multiple evaluation metrics.
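To give a rough sense of the multi-receptive-field idea described in the abstract, the following PyTorch sketch fuses parallel dilated convolutions by concatenation and a 1x1 projection. This is not the authors' DNET implementation; the module name DilatedFusionBlock, the dilation rates, and all channel sizes are illustrative assumptions only.

# Minimal sketch of feature fusion across several receptive fields using
# dilated convolutions, in the spirit of the abstract. NOT the authors'
# DNET code: module name, dilation rates, and channel counts are assumptions.
import torch
import torch.nn as nn


class DilatedFusionBlock(nn.Module):
    """Run parallel 3x3 convolutions with different dilation rates and
    fuse the resulting feature maps via concatenation + 1x1 projection."""

    def __init__(self, in_ch: int, out_ch: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                # padding = dilation keeps the spatial size unchanged for a 3x3 kernel
                nn.Conv2d(in_ch, out_ch, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        # 1x1 convolution fuses the concatenated branch outputs back to out_ch
        self.fuse = nn.Conv2d(out_ch * len(dilations), out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [branch(x) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1))


if __name__ == "__main__":
    block = DilatedFusionBlock(in_ch=64, out_ch=64)
    x = torch.randn(1, 64, 120, 160)   # e.g. a downsampled indoor RGB feature map
    print(block(x).shape)              # torch.Size([1, 64, 120, 160])

Branches with larger dilation rates see wider context at the same resolution, so concatenating them mixes fine local detail with broader scene structure before the depth head; the paper's actual fusion strategy and loss-balancing scheme are not reproduced here.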
Pages: 19