Deep Direct Visual Odometry

Times Cited: 23
Authors
Zhao, Chaoqiang [1 ]
Tang, Yang [1 ]
Sun, Qiyu [1 ]
Vasilakos, Athanasios V. [2 ,3 ,4 ]
Affiliations
[1] East China Univ Sci & Technol, Key Lab Smart Mfg Energy Chem Proc, Minist Educ, Shanghai 200237, Peoples R China
[2] Univ Technol Sydney, Sch Elect & Data Engn, Sydney, NSW, Australia
[3] Fuzhou Univ, Coll Math & Comp Sci, Fuzhou 350116, Peoples R China
[4] Lulea Univ Technol, Dept Comp Sci Elect & Space Engn, S-97187 Lulea, Sweden
Funding
National Natural Science Foundation of China;
Keywords
Visual odometry; direct methods; pose estimation; deep learning; unsupervised learning; VISION; SYSTEM; POSE;
DOI
10.1109/TITS.2021.3071886
Chinese Library Classification
TU [Architecture Science];
Discipline Code
0813;
Abstract
Traditional monocular direct visual odometry (DVO) is one of the best-known methods for simultaneously estimating the ego-motion of robots and mapping environments from images. However, DVO relies heavily on high-quality images and accurate initial pose estimates during tracking. Building on the strong performance of deep learning, previous works have shown that deep neural networks can effectively learn 6-DoF (Degree of Freedom) poses between frames from monocular image sequences in an unsupervised manner. However, these unsupervised deep learning-based frameworks cannot accurately recover the full trajectory of a long monocular video because of the scale inconsistency between the estimated poses. To address this problem, we use several geometric constraints to improve the scale consistency of the pose network, including an improved version of the previous loss function and a novel scale-to-trajectory constraint for unsupervised training. We call the pose network trained with the proposed constraints TrajNet. In addition, a new DVO architecture, called deep direct sparse odometry (DDSO), is proposed to overcome the drawbacks of the previous direct sparse odometry (DSO) framework by embedding TrajNet. Extensive experiments on the KITTI dataset show that the proposed constraints effectively improve the scale consistency of TrajNet compared with previous unsupervised monocular methods, and that integrating TrajNet makes the initialization and tracking of DSO more robust and accurate.
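The record does not detail the scale-to-trajectory constraint itself, so the following is only a minimal sketch of how scale-consistency terms for an unsupervised pose network might be written in a PyTorch-style setup. The function names (scale_consistency_loss, trajectory_scale_penalty) and the specific formulas are illustrative assumptions, not the authors' formulation; in an actual training loop such terms would be added to the photometric reprojection loss commonly used by unsupervised depth-and-pose frameworks.

```python
import torch

def scale_consistency_loss(depth_a, depth_b_warped, eps=1e-6):
    # Hypothetical geometry/scale consistency term: penalize normalized
    # differences between the depth of one frame and the depth of a
    # neighboring frame warped into the same view, encouraging both
    # predictions to share a common scale.
    diff = (depth_a - depth_b_warped).abs() / (depth_a + depth_b_warped).clamp(min=eps)
    return diff.mean()

def trajectory_scale_penalty(rel_translations):
    # Hypothetical "scale-to-trajectory"-style term: keep the norms of the
    # relative translations along a training snippet close to their mean,
    # so the accumulated trajectory does not drift in scale.
    norms = torch.stack([t.norm() for t in rel_translations])
    return (norms - norms.mean()).abs().mean()

# Toy usage with random tensors standing in for network outputs.
depth_a = torch.rand(1, 1, 64, 64) + 0.1
depth_b_warped = torch.rand(1, 1, 64, 64) + 0.1
rel_translations = [torch.rand(3) for _ in range(4)]

total = scale_consistency_loss(depth_a, depth_b_warped) \
        + 0.1 * trajectory_scale_penalty(rel_translations)
print(float(total))
```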
Pages: 7733 - 7742
Page count: 10