Deep Digging into the Generalization of Self-Supervised Monocular Depth Estimation

Cited by: 0
Authors
Bae, Jinwoo [1]
Moon, Sungho [1]
Im, Sunghoon [1]
Affiliations
[1] DGIST, Dept Elect Engn & Comp Sci, Daegu, South Korea
Funding
National Research Foundation, Singapore;
Keywords
VISION;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Self-supervised monocular depth estimation has been widely studied in recent years. Most of this work has focused on improving performance on benchmark datasets such as KITTI, but has offered few experiments on generalization performance. In this paper, we investigate backbone networks (e.g., CNNs, Transformers, and CNN-Transformer hybrid models) with respect to the generalization of monocular depth estimation. We first evaluate state-of-the-art models on diverse public datasets that were never seen during network training. Next, we investigate the effects of texture-biased and shape-biased representations using various texture-shifted datasets that we generated. We observe that Transformers exhibit a strong shape bias, whereas CNNs exhibit a strong texture bias. We also find that shape-biased models generalize better for monocular depth estimation than texture-biased models. Based on these observations, we design a new CNN-Transformer hybrid network with a multi-level adaptive feature fusion module, called MonoFormer. The design intuition behind MonoFormer is to increase shape bias by employing Transformers while compensating for their weak locality bias by adaptively fusing multi-level representations. Extensive experiments show that the proposed method achieves state-of-the-art performance on various public datasets. Our method also shows the best generalization ability among the competing methods.
Pages: 187-196
Page count: 10