UFODepth: Unsupervised learning with flow-based odometry optimization for metric depth estimation

Cited by: 2

Authors:

Licaret, Vlad ^{[1]}

Robu, Victor ^{[2]}

Marcu, Alina ^{[1,2]}

Costea, Dragos ^{[1,2]}

Slusanschi, Emil ^{[1]}

Sukthankar, Rahul ^{[3]}

Leordeanu, Marius ^{[1,2]}

Affiliations:

[1] Univ Politehn Bucuresti, Bucharest, Romania

[2] Romanian Acad, Inst Math, Bucharest, Romania

[3] Google Res, Mountain View, CA USA

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2022 | 2022

Keywords:

SEMANTIC SEGMENTATION;

DOI:

10.1109/ICRA46639.2022.9812374

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

We propose an efficient method for unsupervised learning of metric depth estimation from a single image in the context of unconstrained videos captured from UAVs. We combine the accuracy of an analytical solution based on odometry with the power of deep learning. First, we show how to correct the noisy odometric measurements by optimizing the alignment between the derotated optical flow and the projected linear speed in the image. Then, we detail an analytical depth estimation method based on optical flow and corrected camera velocities. Subsequently, the improved depth and camera velocities obtained analytically are used, as additional cost terms, for training our novel unsupervised learning architecture for metric depth estimation. We extensively test on a recent UAV dataset, which we significantly extend by adding completely novel scenes. We outperform by significant margins different kinds of state-of-the-art approaches, ranging from analytical and unsupervised solutions to transformer-based architectures that require heavy computation and pre-training. The resulting algorithm could be deployed on embedded devices, being a good candidate for practical robotics use cases, such as obstacle avoidance and safe landing for UAVs.
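The analytical step mentioned in the abstract (depth from optical flow and camera velocities) can be illustrated with the classical pinhole motion-field model. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a static scene, normalized image coordinates (focal length 1), flow expressed as image velocity, and known camera linear velocity `t` and angular velocity `omega`. All function names are hypothetical. The rotational component of the flow does not depend on depth, so it is subtracted first ("derotation"); the remaining translational flow is then matched to the projected linear velocity by per-pixel least squares.

```python
import numpy as np

def rotational_flow(x, y, omega):
    """Depth-independent flow induced by camera rotation (normalized coords)."""
    wx, wy, wz = omega
    u = wx * x * y - wy * (1.0 + x**2) + wz * y
    v = wx * (1.0 + y**2) - wy * x * y - wz * x
    return u, v

def projected_linear_velocity(x, y, t):
    """Translational flow direction; the actual flow is this divided by depth."""
    tx, ty, tz = t
    return x * tz - tx, y * tz - ty

def analytical_depth(flow_u, flow_v, x, y, t, omega, eps=1e-9):
    """Per-pixel least-squares depth from derotated flow (assumed model)."""
    rot_u, rot_v = rotational_flow(x, y, omega)
    der_u, der_v = flow_u - rot_u, flow_v - rot_v      # derotated flow
    a_u, a_v = projected_linear_velocity(x, y, t)
    num = a_u * a_u + a_v * a_v                        # |a|^2
    den = a_u * der_u + a_v * der_v                    # a . derotated flow
    # Depth is undefined where the flow carries no translational signal.
    return np.where(den > eps, num / np.maximum(den, eps), np.nan)

# Usage on synthetic data: build a flow field from a known depth map,
# then recover the depth from that flow.
x, y = np.meshgrid(np.linspace(-0.5, 0.5, 5), np.linspace(-0.4, 0.4, 4))
true_depth = 10.0 + 2.0 * x
t, omega = (1.0, 0.2, 0.5), (0.01, -0.02, 0.03)
a_u, a_v = projected_linear_velocity(x, y, t)
r_u, r_v = rotational_flow(x, y, omega)
depth = analytical_depth(a_u / true_depth + r_u, a_v / true_depth + r_v,
                         x, y, t, omega)
```

Because the depth estimate scales with the magnitude of `t`, a metric linear velocity yields metric depth in this model, which is why errors in the odometry translate directly into depth errors.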


Pages: 6526-6532

Page count: 7

