MonoDiffusion: Self-Supervised Monocular Depth Estimation Using Diffusion Model

Cited by: 0
Authors
Shao, Shuwei [1 ,2 ]
Pei, Zhongcai [1 ]
Chen, Weihai [1 ,2 ]
Sun, Dingchi [1 ]
Chen, Peter C. Y. [3 ]
Li, Zhengguo [4 ]
Affiliations
[1] Beihang Univ, Sch Automat Sci & Elect Engn, Beijing 100191, Peoples R China
[2] Beihang Univ, Hangzhou Innovat Inst, Hangzhou 310052, Zhejiang, Peoples R China
[3] Natl Univ Singapore, Dept Mech Engn, Singapore 117575, Singapore
[4] ASTAR, Inst Infocomm Res, Dept 6, Singapore 138632, Singapore
Funding
National Natural Science Foundation of China
Keywords
Diffusion models; Noise reduction; Circuits and systems; Training; Accuracy; Standards; Diffusion processes; Visualization; Transformers; Three-dimensional displays; Monocular depth estimation; conditional diffusion; self-supervised learning;
DOI
10.1109/TCSVT.2024.3509619
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification codes
0808; 0809
Abstract
Over the past few years, self-supervised monocular depth estimation has received widespread attention. Most efforts focus on designing new network architectures and loss functions or on handling edge cases such as occlusion and dynamic objects. In this work, we take a different path and propose a novel conditional diffusion-based generative framework for self-supervised monocular depth estimation, dubbed MonoDiffusion. Because ground-truth depth is unavailable in a self-supervised setting, we develop a new pseudo ground-truth diffusion process to assist diffusion training. Instead of diffusing at a fixed high resolution, we perform diffusion in a coarse-to-fine manner, which allows faster inference without sacrificing accuracy and can even improve it. Furthermore, we develop a simple yet effective contrastive depth reconstruction mechanism to enhance the denoising ability of the model. Notably, MonoDiffusion naturally provides depth uncertainty, which is essential in safety-critical applications. Extensive experiments on the KITTI, Make3D and DIML datasets indicate that MonoDiffusion outperforms prior state-of-the-art self-supervised competitors. The source code will be made publicly available upon acceptance.
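The abstract does not give the equations of the pseudo ground-truth diffusion process. As a rough illustration only, the sketch below applies the standard DDPM forward noising step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, to a small depth map that stands in for the pseudo ground truth. The linear beta schedule, the 1000-step horizon, the function names and the toy depth values are all assumptions made for this example, not details taken from the paper.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (a common DDPM default, assumed here)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def diffuse(pseudo_depth, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0), with x_0 the pseudo ground-truth depth."""
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    return [[signal * d + noise * rng.gauss(0.0, 1.0) for d in row]
            for row in pseudo_depth]


# Hypothetical 2x3 pseudo ground-truth depth map, normalised to [0, 1].
pseudo_depth = [[0.10, 0.25, 0.40],
                [0.55, 0.70, 0.85]]

betas = linear_beta_schedule(num_steps=1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)  # fixed seed so the sketch is reproducible

slightly_noisy = diffuse(pseudo_depth, t=0, alpha_bars=alpha_bars, rng=rng)
mostly_noise = diffuse(pseudo_depth, t=999, alpha_bars=alpha_bars, rng=rng)
```

At t = 0 the sample stays close to the input depth, while at the final step almost no signal remains. A denoising network trained to invert such a chain would be conditioned on the input image; that part is not sketched here.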
Pages: 3664-3678
Page count: 15
Related papers
  • [1] Digging Into Self-Supervised Monocular Depth Estimation
    Godard, Clement
    Mac Aodha, Oisin
    Firman, Michael
    Brostow, Gabriel
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 3827 - 3837
  • [2] Self-supervised monocular depth estimation in fog
    Tao, Bo
    Hu, Jiaxin
    Jiang, Du
    Li, Gongfa
    Chen, Baojia
    Qian, Xinbo
    OPTICAL ENGINEERING, 2023, 62 (03)
  • [3] On the uncertainty of self-supervised monocular depth estimation
    Poggi, Matteo
    Aleotti, Filippo
    Tosi, Fabio
    Mattoccia, Stefano
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 3224 - 3234
  • [4] Revisiting Self-supervised Monocular Depth Estimation
    Kim, Ue-Hwan
    Lee, Gyeong-Min
    Kim, Jong-Hwan
    ROBOT INTELLIGENCE TECHNOLOGY AND APPLICATIONS 6, 2022, 429 : 336 - 350
  • [5] Adapting Segment Anything Model for self-supervised monocular depth estimation
    Zhang, Dongdong
    Wang, Chunping
    Fu, Qiang
    JOURNAL OF ELECTRONIC IMAGING, 2024, 33 (06)
  • [6] Semantically guided self-supervised monocular depth estimation
    Lu, Xiao
    Sun, Haoran
    Wang, Xiuling
    Zhang, Zhiguo
    Wang, Haixia
    IET IMAGE PROCESSING, 2022, 16 (05) : 1293 - 1304
  • [7] Self-Supervised Monocular Scene Decomposition and Depth Estimation
    Safadoust, Sadra
    Guney, Fatma
    2021 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2021), 2021, : 627 - 636
  • [8] Joint Self-Supervised Monocular Depth Estimation and SLAM
    Xing, Xiaoxia
    Cai, Yinghao
    Lu, Tao
    Yang, Yiping
    Wen, Dayong
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 4030 - 4036
  • [9] Learn to Adapt for Self-Supervised Monocular Depth Estimation
    Sun, Qiyu
    Yen, Gary G.
    Tang, Yang
    Zhao, Chaoqiang
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (11) : 15647 - 15659