Scale-Aware Visual-Inertial Depth Estimation and Odometry Using Monocular Self-Supervised Learning

Cited by: 4
Authors
Lee, Chungkeun [1 ]
Kim, Changhyeon [2 ]
Kim, Pyojin [3 ]
Lee, Hyeonbeom [4 ]
Kim, H. Jin [5 ]
Affiliations
[1] Seoul Natl Univ, Inst Adv Aerosp Technol, Seoul 08826, South Korea
[2] Seoul Natl Univ, Automation & Syst Res Inst, Seoul 08826, South Korea
[3] Sookmyung Womens Univ, Dept Mech Syst Engn, Seoul 04312, South Korea
[4] Kyungpook Natl Univ, Sch Elect & Elect Engn, Daegu 37224, South Korea
[5] Seoul Natl Univ, Dept Mech & Aerosp Engn, Seoul 08826, South Korea
Funding
National Research Foundation of Singapore
Keywords
Odometry; Deep learning; Loss measurement; Depth measurement; Cameras; Self-supervised learning; Coordinate measuring machines; monocular depth estimation; self-supervised learning; visual-inertial odometry;
DOI
10.1109/ACCESS.2023.3252884
Chinese Library Classification (CLC)
TP [Automation Technology; Computer Technology]
Discipline code
0812
Abstract
Scale ambiguity is a key issue for real-world applications that rely on a single monocular camera. Because self-supervised, data-driven approaches cannot resolve the scale ambiguity without additional data containing scale information, state-of-the-art deep-learning-based methods address this issue by learning the scale from additional sensor measurements. In this regard, the inertial measurement unit (IMU) is a popular sensor for various mobile platforms because it is lightweight and inexpensive. However, unlike supervised learning, which can learn the scale from ground-truth information, learning the scale from an IMU in a self-supervised setting is challenging. We propose a scale-aware monocular visual-inertial depth estimation and odometry method with end-to-end training. To learn the scale from IMU measurements with end-to-end training in the monocular self-supervised setup, we propose a new loss function, named the preintegration loss, which trains scale-aware ego-motion by comparing the ego-motion integrated from IMU measurements with the predicted ego-motion. Since the gravity and the bias must be compensated to obtain the ego-motion by integrating IMU measurements, we design a network that predicts the gravity and the bias in addition to the ego-motion and the depth map. The overall performance of the proposed method is compared with state-of-the-art methods on a popular outdoor driving dataset (KITTI) and an author-collected indoor driving dataset. On the KITTI dataset, the proposed method shows competitive performance against state-of-the-art monocular depth estimation and odometry methods: a root-mean-square error of 5.435 m on the KITTI Eigen split, and absolute trajectory errors of 22.46 m and 0.2975 degrees on the KITTI odometry 09 sequence. Unlike other up-to-scale monocular methods, the proposed method can estimate metric-scaled depth and camera poses.
Additional experiments on the author-collected indoor driving dataset qualitatively confirm the accuracy of the metric-depth and metric-pose estimates.
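The preintegration loss described in the abstract compares the ego-motion obtained by integrating IMU measurements (with the network-predicted gravity and bias removed) against the network-predicted ego-motion. The following is a minimal NumPy sketch of that idea only, not the paper's actual formulation: the function names, the simple Euler integration, and the Frobenius rotation distance are all illustrative assumptions.

```python
import numpy as np

def rotvec_to_mat(theta):
    """Rodrigues' formula: rotation vector -> rotation matrix."""
    angle = np.linalg.norm(theta)
    if angle < 1e-12:
        return np.eye(3)
    k = theta / angle
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def preintegrate_imu(accels, gyros, dt, gravity, acc_bias, gyro_bias):
    """Integrate raw IMU samples into a relative pose (R, p).

    accels, gyros: (N, 3) raw accelerometer / gyroscope samples.
    gravity, acc_bias, gyro_bias: (3,) vectors; in the paper these are
    predicted by the network (hypothetical interface here).
    Here `gravity` is the specific force a stationary, upright IMU reads.
    """
    R = np.eye(3)      # accumulated rotation
    v = np.zeros(3)    # accumulated velocity
    p = np.zeros(3)    # accumulated translation
    for a, w in zip(accels, gyros):
        # world-frame acceleration with bias and gravity compensated
        a_world = R @ (a - acc_bias) - gravity
        p = p + v * dt + 0.5 * a_world * dt**2
        v = v + a_world * dt
        # small rotation update from the bias-corrected gyro sample
        R = R @ rotvec_to_mat((w - gyro_bias) * dt)
    return R, p

def preintegration_loss(R_pred, t_pred, R_imu, t_imu):
    """Penalize disagreement between predicted and IMU-integrated ego-motion."""
    rot_err = np.linalg.norm(R_pred - R_imu)   # Frobenius distance (simplified)
    trans_err = np.linalg.norm(t_pred - t_imu)
    return rot_err + trans_err
```

Because the IMU-integrated translation is metric, minimizing this loss forces the predicted ego-motion (and hence the jointly trained depth) toward metric scale, which is the mechanism the abstract describes.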
Pages: 24087-24102 (16 pages)