MonoNav: MAV Navigation via Monocular Depth Estimation and Reconstruction

Times Cited: 0
Authors
Simon, Nathaniel [1 ]
Majumdar, Anirudha [1 ]
Affiliations
[1] Princeton Univ, Dept Mech & Aerosp Engn, Princeton, NJ 08544 USA
Source
EXPERIMENTAL ROBOTICS, ISER 2023 | 2024, Vol. 30
Keywords
MAV; monocular depth estimation; 3D reconstruction; collision avoidance;
DOI
10.1007/978-3-031-63596-0_37
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
A major challenge in deploying the smallest Micro Aerial Vehicle (MAV) platforms (<= 100 g) is their inability to carry sensors that provide high-resolution metric depth information (e.g., LiDAR or stereo cameras). Current systems rely on end-to-end learning or heuristic approaches that map images directly to control inputs, and struggle to fly fast in unknown environments. In this work, we ask the following question: using only a monocular camera, optical odometry, and off-board computation, can we create metrically accurate maps that let us leverage the powerful path planning and navigation approaches employed by larger state-of-the-art robotic systems, and thereby achieve robust autonomy in unknown environments? We present MonoNav: a fast 3D reconstruction and navigation stack for MAVs that leverages recent advances in depth prediction neural networks to enable metrically accurate 3D scene reconstruction from a stream of monocular images and poses. MonoNav uses off-the-shelf pre-trained monocular depth estimation and fusion techniques to construct a map, then searches over motion primitives to plan a collision-free trajectory to the goal. In extensive hardware experiments, we demonstrate how MonoNav enables the Crazyflie (a 37 g MAV) to navigate quickly (0.5 m/s) in cluttered indoor environments. We evaluate MonoNav against a state-of-the-art end-to-end approach and find that it significantly reduces the collision rate in navigation (by a factor of 4). This increased safety comes at the cost of conservatism: a 22% reduction in goal completion.
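The planning step described in the abstract, searching over a fixed set of motion primitives and keeping the collision-free one that ends closest to the goal, can be illustrated with a minimal 2D sketch. This is not MonoNav's implementation: the occupancy grid, the three hand-coded primitives, and the endpoint-to-goal scoring are simplified assumptions (the real system plans against a fused 3D reconstruction).

```python
import numpy as np

def plan_primitive(occupancy, pose, goal, primitives):
    """Choose the motion primitive whose endpoint gets closest to the goal
    while staying inside the map and out of occupied cells.

    occupancy:  2D bool array (True = occupied cell)
    pose, goal: (row, col) grid coordinates
    primitives: list of (N, 2) integer arrays of waypoints relative to pose
    Returns the index of the best primitive, or None if all collide.
    """
    pose, goal = np.asarray(pose), np.asarray(goal)
    best_idx, best_dist = None, np.inf
    for i, prim in enumerate(primitives):
        pts = np.asarray(prim) + pose  # waypoints in map coordinates
        # Discard primitives that leave the map or touch an occupied cell.
        if np.any(pts < 0) or np.any(pts >= occupancy.shape):
            continue
        if occupancy[pts[:, 0], pts[:, 1]].any():
            continue
        dist = np.linalg.norm(pts[-1] - goal)  # score: endpoint-to-goal distance
        if dist < best_dist:
            best_idx, best_dist = i, dist
    return best_idx

# Toy example: a 10x10 grid with one obstacle blocking the straight primitive,
# so the planner falls back to the upward arc.
occ = np.zeros((10, 10), dtype=bool)
occ[5, 2] = True
primitives = [
    np.array([(0, 1), (0, 2), (0, 3)]),     # straight ahead
    np.array([(-1, 1), (-2, 2), (-2, 3)]),  # arc up
    np.array([(1, 1), (2, 2), (2, 3)]),     # arc down
]
choice = plan_primitive(occ, pose=(5, 0), goal=(4, 9), primitives=primitives)
```

In the real pipeline this selection would run in a receding-horizon loop: execute the chosen primitive, fuse new depth predictions into the map, and replan.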
Pages: 415-426 (12 pages)
Related Papers (21 total)
  • [1] Bhat SF, 2023, arXiv, DOI arXiv:2302.12288
  • [2] Chaplot DS, 2020, Proc. IEEE CVPR, p. 12872, DOI 10.1109/CVPR42600.2020.01289
  • [3] Chi C, 2024, arXiv, DOI arXiv:2303.04137
  • [4] Dong W, Lao Y, Kaess M, Koltun V. ASH: A Modern Framework for Parallel Spatial Hashing in 3D Perception. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(5): 5417-5435
  • [5] Eigen D, 2014, Advances in Neural Information Processing Systems, Vol. 27
  • [6] Gervet T, Chintala S, Batra D, Malik J, Chaplot DS. Navigating to objects in the real world. Science Robotics, 2023, 8(79)
  • [7] Kang K, Belkhale S, Kahn G, Abbeel P, Levine S. Generalization through Simulation: Integrating Simulated and Real Data into Deep Reinforcement Learning for Vision-Based Autonomous Flight. 2019 International Conference on Robotics and Automation (ICRA), 2019: 6008-6014
  • [8] Loquercio A, Kaufmann E, Ranftl R, Mueller M, Koltun V, Scaramuzza D. Learning high-speed flight in the wild. Science Robotics, 2021, 6(59)
  • [9] Majumdar A. Introduction to Robotics at Princeton
  • [10] Newcombe RA, Izadi S, Hilliges O, Molyneaux D, Kim D, Davison AJ, Kohli P, Shotton J, Hodges S, Fitzgibbon A. KinectFusion: Real-Time Dense Surface Mapping and Tracking. 2011 10th IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2011: 127-136