Dynamic Semantically Guided Monocular Depth Estimation for UAV Environment Perception

Cited by: 1
Authors
Miclea, Vlad-Cristian [1]
Nedevschi, Sergiu [1]
Affiliations
[1] Tech Univ Cluj Napoca, Dept Comp Sci, Cluj Napoca 400114, Romania
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Keywords
3-D reconstruction; CNN; monocular depth estimation (MDE); ordinal regression; semantic segmentation; UAV;
DOI
10.1109/TGRS.2023.3345475
Chinese Library Classification number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
Monocular depth estimation (MDE) is one of the most difficult tasks in computer vision. The problem becomes even more complicated in the case of aerial images, owing to the high complexity and lack of structure in such scenes. State-of-the-art MDE methods can cope with such environments only by using large amounts of resources. In this work, we provide a more resource-aware alternative by dynamically inserting scene priors into the network through semantic features. To this end, we first propose a novel dynamic semantic-aware module that combines features extracted from the RGB image with a dynamically weighted semantic map. The weights are gradually modified according to the iteration number within the training process. With this methodology, the network first predicts depth for the larger objects in the scene. As training progresses, smaller objects are increasingly emphasized, which increases confidence at object boundaries. We show that this dynamic adaptation mechanism leads to better and faster convergence. The adaptation technique is applied to several learning frameworks by introducing three novel semantic-aware losses for depth-based regression, classification, and ordinal regression. We report results on a large set of synthetic and real-life aerial images captured in various scenarios, demonstrating the effectiveness of our approach in terms of convergence time, depth accuracy, and prediction time.
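The iteration-dependent weighting described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the linear schedule, the use of per-class pixel fractions as a proxy for object size, and the names `dynamic_class_weights` and `semantic_weighted_l1` are all assumptions made for illustration only.

```python
def _normalize(values):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def dynamic_class_weights(pixel_fraction, iteration, total_iterations):
    """Hypothetical schedule: per-class weights that shift over training.

    pixel_fraction[c] is the share of image pixels covered by semantic
    class c (assumed proxy for object size). Early in training the weights
    follow the pixel fractions (large objects dominate); late in training
    they follow the inverse fractions (small objects dominate).
    """
    # Training progress in [0, 1]; the linear ramp is an assumption.
    t = min(max(iteration / total_iterations, 0.0), 1.0)
    large_first = _normalize(pixel_fraction)
    small_later = _normalize([1.0 / max(f, 1e-6) for f in pixel_fraction])
    return [(1.0 - t) * a + t * b for a, b in zip(large_first, small_later)]


def semantic_weighted_l1(pred, target, labels, weights):
    """Weighted mean absolute depth error.

    Each pixel's error is scaled by the weight of its semantic class.
    pred, target: flat lists of depths; labels: flat list of class indices.
    """
    weighted_error = 0.0
    weight_sum = 0.0
    for p, t, c in zip(pred, target, labels):
        weighted_error += weights[c] * abs(p - t)
        weight_sum += weights[c]
    return weighted_error / weight_sum


if __name__ == "__main__":
    fractions = [0.8, 0.2]  # class 0 is large (e.g. ground), class 1 is small
    for it in (0, 5000, 10000):
        w = dynamic_class_weights(fractions, it, 10000)
        loss = semantic_weighted_l1([10.0, 5.0], [12.0, 4.0], [0, 1], w)
        print(it, w, loss)
```

Under these assumptions, the same prediction error on a small object contributes more to the loss late in training than it does early on, which mirrors the coarse-to-fine emphasis the abstract describes.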
Pages: 1 / 11
Page count: 11