SWformer-VO: A Monocular Visual Odometry Model Based on Swin Transformer

Cited by: 3
Authors
Wu, Zhigang [1]
Zhu, Yaohui [1]
Affiliations
[1] Jiangxi University of Science and Technology, School of Energy and Mechanical Engineering, Nanchang 330013, China
Keywords
Deep learning; monocular visual odometry; transformer; depth
DOI
10.1109/LRA.2024.3384911
Chinese Library Classification
TP24 [Robotics]
Discipline Codes
080202; 1405
Abstract
This letter introduces SWformer-VO, a monocular visual odometry network that uses the Swin Transformer as its backbone. Trained end-to-end on a modest volume of image-sequence data, it directly estimates the six-degree-of-freedom camera pose from a monocular camera. SWformer-VO introduces an embedding module called "Mixture Embed", which fuses each pair of consecutive images into a single frame and converts it into tokens that are passed to the backbone; this replaces traditional temporal-sequence schemes by addressing the problem at the image level. Building on this design, the backbone's parameters are further tuned, and experiments examine how the number of layers and the depth of the backbone affect accuracy. On the KITTI dataset, SWformer-VO achieves higher accuracy than common deep learning-based methods introduced in recent years, including SfMLearner, DeepVO, TSformer-VO, Depth-VO-Feat, GeoNet, and Masked GANs. Its effectiveness is also validated on a self-collected dataset of nine indoor corridor routes for visual odometry.
Pages: 4766-4773
Number of pages: 8
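
The abstract describes the core mechanism only at a high level. Below is a minimal PyTorch sketch of that idea, assuming the "Mixture Embed" pair fusion is channel-wise concatenation of the two RGB frames and substituting a plain TransformerEncoder for the Swin backbone; the class names (MixtureEmbed, PoseRegressor) and all hyperparameters are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of the SWformer-VO idea from the abstract: fuse two consecutive
# frames into one input ("Mixture Embed"), tokenize it, run a transformer
# backbone, and regress a 6-DoF relative pose. Fusion by channel concatenation
# and the plain TransformerEncoder stand-in for Swin are assumptions.
import torch
import torch.nn as nn


class MixtureEmbed(nn.Module):
    """Fuse an image pair into patch tokens (assumed channel concatenation)."""

    def __init__(self, patch_size=16, embed_dim=96):
        super().__init__()
        # 6 input channels: two stacked RGB frames.
        self.proj = nn.Conv2d(6, embed_dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, frame_t, frame_t1):
        x = torch.cat([frame_t, frame_t1], dim=1)   # (B, 6, H, W)
        x = self.proj(x)                            # (B, C, H/p, W/p)
        return x.flatten(2).transpose(1, 2)         # (B, N_tokens, C)


class PoseRegressor(nn.Module):
    """Tokens -> transformer backbone -> 6-DoF pose (3 translation + 3 rotation)."""

    def __init__(self, embed_dim=96, depth=4, heads=4):
        super().__init__()
        self.embed = MixtureEmbed(embed_dim=embed_dim)
        # Stand-in for the Swin Transformer backbone used in the paper.
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=heads,
                                           batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, 6)

    def forward(self, frame_t, frame_t1):
        tokens = self.backbone(self.embed(frame_t, frame_t1))
        return self.head(tokens.mean(dim=1))        # (B, 6)


if __name__ == "__main__":
    model = PoseRegressor()
    a, b = torch.randn(2, 3, 224, 224), torch.randn(2, 3, 224, 224)
    print(model(a, b).shape)  # torch.Size([2, 6])
```

Because the pair is fused before tokenization, the backbone sees inter-frame motion cues in a single forward pass, which is how the abstract's "image-level" treatment can replace a recurrent temporal-sequence scheme.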