Geometric Pretraining for Monocular Depth Estimation

Authors
Wang, Kaixuan [1]
Chen, Yao [2]
Guo, Hengkai [2]
Wen, Linfu [2]
Shen, Shaojie [1]
Affiliations
[1] HKUST, Dept Elect & Comp Engn, Hong Kong, Peoples R China
[2] ByteDance AI Lab, Beijing, Peoples R China
DOI
10.1109/icra40945.2020.9196847
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
ImageNet-pretrained networks have been widely used in transfer learning for monocular depth estimation. These networks are trained with classification losses, which exploit only semantic information and ignore spatial information. However, both semantic and spatial information are important for per-pixel depth estimation. In this paper, we design a novel self-supervised geometric pretraining task that is tailored for monocular depth estimation using uncalibrated videos. The designed task decouples the structure information from input videos by a simple yet effective conditional autoencoder-decoder structure. Using almost unlimited videos from the internet, networks are pretrained to capture a variety of scene structures and can be easily transferred to depth estimation tasks using calibrated images. Extensive experiments demonstrate that the proposed geometric-pretrained networks perform better than ImageNet-pretrained networks in terms of accuracy, few-shot learning, and generalization ability. Using existing learning methods, geometric-transferred networks achieve new state-of-the-art results by a large margin. The pretrained networks will be open-sourced soon.
Pages: 4782-4788
Page count: 7
Related papers
  • [31] Monocular depth estimation using self-supervised learning with more effective geometric constraints
    Xiong, Mingkang
    Zhang, Zhenghong
    Liu, Jiyuan
    Zhang, Tao
    Xiong, Huilin
    Engineering Applications of Artificial Intelligence, 2024, 128
  • [32] Towards Explainability in Monocular Depth Estimation
    Arampatzakis, Vasileios
    Pavlidis, George
    Pantoglou, Kyriakos
    Mitianoudis, Nikolaos
    Papamarkos, Nikos
    MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2023, PT II, 2025, 2134 : 412 - 419
  • [33] Unsupervised Monocular Depth Estimation for Monocular Visual SLAM Systems
    Liu, Feng
    Huang, Ming
    Ge, Hongyu
    Tao, Dan
    Gao, Ruipeng
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 13
  • [34] Self-Supervised Pretraining With Monocular Height Estimation for Semantic Segmentation
    Xiong, Zhitong
    Chen, Sining
    Shi, Yilei
    Zhu, Xiao Xiang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [35] Monocular Human Depth Estimation Via Pose Estimation
    Jun, Jinyoung
    Lee, Jae-Han
    Lee, Chul
    Kim, Chang-Su
    IEEE ACCESS, 2021, 9 : 151444 - 151457
  • [36] Enhancing Self-supervised Monocular Depth Estimation via Piece-Wise Pose Estimation and Geometric Constraints
    Shyam, Pranjay
    Okon, Alexandre
    Yoo, HyunJin
    2024 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS, WACVW 2024, 2024: 221 - 231
  • [37] MONOCULAR SEGMENT-WISE DEPTH: MONOCULAR DEPTH ESTIMATION BASED ON A SEMANTIC SEGMENTATION PRIOR
    Atapour-Abarghouei, Amir
    Breckon, Toby P.
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 4295 - 4299
  • [38] Unsupervised depth estimation from monocular videos with hybrid geometric-refined loss and contextual attention
    Zhang, Mingliang
    Ye, Xinchen
    Fan, Xin
    Zhong, Wei
    NEUROCOMPUTING, 2020, 379 (379) : 250 - 261
  • [39] The Constraints between Edge Depth and Uncertainty for Monocular Depth Estimation
    Wu, Shouying
    Li, Wei
    Liang, Binbin
    Huang, Guoxin
    ELECTRONICS, 2021, 10 (24)
  • [40] DTTNet: Depth Transverse Transformer Network for Monocular Depth Estimation
    Kamath, Shreyas K. M.
    Rajeev, Srijith
    Panetta, Karen
    Agaian, Sos S.
    MULTIMODAL IMAGE EXPLOITATION AND LEARNING 2022, 2022, 12100