Self-supervised Monocular Depth Estimation on Unseen Synthetic Cameras

Citations: 0
Authors
Diana-Albelda, Cecilia [1]
Bravo Perez-Villar, Juan Ignacio [1, 2]
Montalvo, Javier [1]
Garcia-Martin, Alvaro [1]
Bescos Cano, Jesus [1]
Affiliations
[1] Univ Autonoma Madrid, Video Proc & Understanding Lab, Madrid 28049, Spain
[2] Deimos Space, Madrid 28760, Spain
Source
PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2023, PT I | 2024 / Vol. 14469
Keywords
Monocular Depth Estimation; Computer Vision; Self-Supervised Learning; Camera Generalization; Custom Synthetic Dataset; Adversarial Training;
DOI
10.1007/978-3-031-49018-7_32
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Monocular depth estimation is a critical task in computer vision, and self-supervised deep learning methods have achieved remarkable results in recent years. However, these models often struggle with camera generalization, i.e., on sequences captured by unseen cameras. To address this challenge, we present a new public custom dataset created using the CARLA simulator [4], consisting of three video sequences recorded by five different cameras with varying focal lengths. The dataset was created because no public dataset contains identical sequences captured by different cameras. Additionally, this paper proposes the use of adversarial training to improve the models' robustness to changes in intrinsic camera parameters, enabling accurate depth estimation regardless of the recording camera. The results of the proposed architecture are compared with those of a baseline model to evaluate the effectiveness of adversarial training and to demonstrate its potential benefits, both on our synthetic dataset and on the KITTI benchmark [8], the reference dataset for evaluating depth estimation.
Pages: 449-463
Page count: 15
Related papers
31 entries in total
[1]   Depth Estimation with Light Field and Photometric Stereo Data Using Energy Minimization [J].
Antensteiner, Doris ;
Stolc, Svorad ;
Huber-Moerk, Reinhold .
PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2016, 2017, 10125 :175-183
[2]  
Cadena C, 2016, 2016 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2016), P4150, DOI 10.1109/IROS.2016.7759611
[3]  
Diana C., 2023, UNSYN-MF dataset: unified synthetic multiple FOV
[4]  
Dosovitskiy A, 2017, PR MACH LEARN RES, V78
[5]  
Eigen D, 2014, ADV NEUR IN, V27
[6]   CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth [J].
Facil, Jose M. ;
Ummenhofer, Benjamin ;
Zhou, Huizhong ;
Montesano, Luis ;
Brox, Thomas ;
Civera, Javier .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :11818-11827
[7]  
Games E., 2019, Unreal engine 4
[8]  
Geiger A, 2012, PROC CVPR IEEE, P3354, DOI 10.1109/CVPR.2012.6248074
[9]   Digging Into Self-Supervised Monocular Depth Estimation [J].
Godard, Clement ;
Mac Aodha, Oisin ;
Firman, Michael ;
Brostow, Gabriel .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :3827-3837
[10]   Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras [J].
Gordon, Ariel ;
Li, Hanhan ;
Jonschkowski, Rico ;
Angelova, Anelia .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :8976-8985