Self-Supervised Learning of Domain Invariant Features for Depth Estimation

Cited by: 12
Authors
Akada, Hiroyasu [1, 2]
Bhat, Shariq Farooq [1 ]
Alhashim, Ibraheem [3 ]
Wonka, Peter [1 ]
Affiliations
[1] KAUST, Thuwal, Saudi Arabia
[2] Keio University, Tokyo, Japan
[3] Saudi Data & Artificial Intelligence Authority (SDAIA), National Center for Artificial Intelligence (NCAI), Riyadh, Saudi Arabia
Source
2022 IEEE Winter Conference on Applications of Computer Vision (WACV 2022) | 2022
DOI
10.1109/WACV51458.2022.00107
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We tackle the problem of unsupervised synthetic-to-real domain adaptation for single-image depth estimation. An essential building block of single-image depth estimation is an encoder-decoder task network that takes RGB images as input and produces depth maps as output. In this paper, we propose a novel training strategy that forces the task network to learn domain-invariant representations in a self-supervised manner. Specifically, we extend self-supervised learning from traditional representation learning, which works on images from a single domain, to domain-invariant representation learning, which works on images from two different domains by utilizing an image-to-image translation network. First, we use an image-to-image translation network to transfer domain-specific styles between the synthetic and real domains. This style transfer operation gives us images with similar content in both domains. Second, we jointly train our task network and a Siamese network on these content-matched images from the two domains so that the task network becomes domain invariant. Finally, we fine-tune the task network using labeled synthetic and unlabeled real-world data. Our training strategy yields improved generalization capability in the real-world domain. We carry out an extensive evaluation on two popular datasets for depth estimation, KITTI and Make3D. The results demonstrate that our proposed method outperforms the state of the art on all metrics, e.g., by 14.7% on Sq Rel on KITTI. The source code and model weights will be made available.
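The abstract reports its headline result on Sq Rel, the squared relative error commonly used in depth estimation benchmarks. As a minimal sketch of how that metric and its companion Abs Rel are usually computed, the functions below average the per-pixel error over pixels with valid (positive) ground-truth depth. The function names, the flat-list input format, and the toy values are assumptions made for illustration; this record does not describe the paper's evaluation protocol (depth caps, crops), and `relative_improvement` assumes the quoted 14.7% is a relative reduction, which the record does not state.

```python
def _valid_pairs(pred, gt):
    """Keep only pixels whose ground-truth depth is positive (i.e. valid)."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no pixels with valid ground-truth depth")
    return pairs


def abs_rel(pred, gt):
    """Abs Rel: mean of |pred - gt| / gt over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def sq_rel(pred, gt):
    """Sq Rel: mean of (pred - gt)^2 / gt over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum((p - g) ** 2 / g for p, g in pairs) / len(pairs)


def relative_improvement(baseline, ours):
    """Percentage reduction of an error metric relative to a baseline
    (hypothetical helper; lower error is better)."""
    return 100.0 * (baseline - ours) / baseline


# Toy depths in meters; the third pixel has no ground truth and is skipped.
predicted = [2.0, 3.0, 7.5]
ground_truth = [2.0, 5.0, 0.0]
print(abs_rel(predicted, ground_truth))
print(sq_rel(predicted, ground_truth))
```

Because Sq Rel squares the residual before dividing by the true depth, it penalizes large errors more heavily than Abs Rel does.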
Pages: 997-1007
Number of pages: 11