TransFusionOdom: Transformer-Based LiDAR-Inertial Fusion Odometry Estimation

Cited by: 14
Authors
Sun, Leyuan [1 ,2 ]
Ding, Guanqun [3 ]
Qiu, Yue [2 ]
Yoshiyasu, Yusuke [2 ]
Kanehiro, Fumio [1 ,4 ]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, CNRS AIST Joint Robot Lab JRL, IRL, Tsukuba 3058560, Japan
[2] Natl Inst Adv Ind Sci & Technol, Comp Vis Res Team, Artificial Intelligence Res Ctr AIRC, Tsukuba 3058560, Japan
[3] Natl Inst Adv Ind Sci & Technol, Digital Architecture Res Ctr DigiARC, Tokyo 1350064, Japan
[4] Univ Tsukuba, Grad Sch Sci & Technol, Dept Intelligent & Mech Interact Syst, Tsukuba 3050006, Japan
Funding
Japan Society for the Promotion of Science (JSPS);
Keywords
Attention mechanisms; LiDAR-inertial odometry (LIO); multimodal learning; sensor data fusion; transformer; ROBUST; DEPTH; CNN;
DOI
10.1109/JSEN.2023.3302401
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline codes
0808; 0809;
Abstract
Multimodal sensor fusion is a common approach to enhancing odometry estimation, a fundamental module for mobile robots. Recently, learning-based approaches have garnered attention in this field due to their robust, non-handcrafted designs. However, how to perform fusion among different modalities in a supervised sensor-fusion odometry estimation task remains a challenging open question. Simple operations, such as elementwise summation and concatenation, cannot assign adaptive attentional weights to incorporate different modalities efficiently, which makes it difficult to achieve competitive odometry results. Meanwhile, the Transformer architecture has shown potential for multimodal fusion tasks, particularly in vision-language domains. In this work, we propose an end-to-end supervised Transformer-based LiDAR-inertial fusion framework (TransFusionOdom) for odometry estimation. The multiattention fusion module demonstrates different fusion approaches for homogeneous and heterogeneous modalities to address the overfitting that can arise from blindly increasing model complexity. Additionally, to interpret the learning process of the Transformer-based multimodal interactions, a general visualization approach is introduced to illustrate the interactions between modalities. Moreover, exhaustive ablation studies evaluate different multimodal fusion strategies to verify the performance of the proposed fusion strategy. A synthetic multimodal dataset is made public to validate the generalization ability of the proposed fusion strategy, which also works for other combinations of modalities. Quantitative and qualitative odometry evaluations on the KITTI dataset verify that the proposed TransFusionOdom achieves superior performance compared with other learning-based related works.
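The adaptive attentional weighting that the abstract contrasts with elementwise summation and concatenation can be illustrated with a minimal cross-attention sketch. This is an illustrative NumPy example only, not the paper's implementation; the function name, token counts, and feature dimensions are assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(lidar_feats, imu_feats):
    """Fuse heterogeneous modality features with scaled dot-product
    cross-attention: each LiDAR token queries the IMU sequence and
    receives an adaptive weighting over IMU features, instead of a
    fixed summation or concatenation."""
    d = lidar_feats.shape[-1]
    scores = lidar_feats @ imu_feats.T / np.sqrt(d)  # (N_lidar, N_imu)
    weights = softmax(scores, axis=-1)               # rows sum to 1
    attended = weights @ imu_feats                   # (N_lidar, d)
    return lidar_feats + attended                    # residual fusion

# Toy example: 4 LiDAR tokens and 10 IMU tokens, feature dim 8.
rng = np.random.default_rng(0)
lidar = rng.standard_normal((4, 8))
imu = rng.standard_normal((10, 8))
fused = cross_attention_fuse(lidar, imu)
print(fused.shape)  # (4, 8)
```

Unlike concatenation, which fixes how much each modality contributes, the attention weights here are recomputed per input, which is the property the abstract argues is needed for competitive odometry results.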
Pages: 22064-22079
Page count: 16
Related papers
50 records in total
  • [31] Hierarchical Distribution-Based Tightly-Coupled LiDAR Inertial Odometry
    Wang, Chengpeng
    Cao, Zhiqiang
    Li, Jianjie
    Yu, Junzhi
    Wang, Shuo
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01): : 1423 - 1435
  • [32] Adaptive Keyframe Generation based LiDAR Inertial Odometry for Complex Underground Environments
    Kim, Boseong
    Jung, Chanyoung
    Shim, D. Hyunchul
    Agha-mohammadi, Ali-Akbar
    2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA, 2023, : 3332 - 3338
  • [33] Novel Transformer-Based Fusion Models for Aero-Engine Remaining Useful Life Estimation
    Hu, Qiankun
    Zhao, Yongping
    Ren, Lihua
    IEEE ACCESS, 2023, 11 : 52668 - 52685
  • [34] A Transformer-Based Fusion Recommendation Model For IPTV Applications
    Li, Heng
    Lei, Hang
    Yang, Maolin
    Zeng, Jinghong
    Zhu, Di
    Fu, Shouwei
    2020 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA (ICAIBD 2020), 2020, : 177 - 182
  • [35] Pose Estimation Based on Bidirectional Visual-Inertial Odometry with 3D LiDAR (BV-LIO)
    Peng, Gang
    Gao, Qiang
    Xu, Yue
    Li, Jianfeng
    Deng, Zhang
    Li, Cong
    REMOTE SENSING, 2024, 16 (16)
  • [36] A Transformer-Based Channel Estimation Method for OTFS Systems
    Sun, Teng
    Lv, Jiebiao
    Zhou, Tao
    ENTROPY, 2023, 25 (10)
  • [37] Cross-Fusion Transformer-Based Infrared and Visible Image Fusion Method
    Yin, Haitao
    Zhou, Changsheng
    LASER & OPTOELECTRONICS PROGRESS, 2025, 62 (06)
  • [38] TUFusion: A Transformer-Based Universal Fusion Algorithm for Multimodal Images
    Zhao, Yangyang
    Zheng, Qingchun
    Zhu, Peihao
    Zhang, Xu
    Ma, Wenpeng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (03) : 1712 - 1725
  • [39] FormerUnify: Transformer-Based Unified Fusion for Efficient Image Matting
    Wang, Jiaquan
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VIII, 2025, 15038 : 412 - 425
  • [40] A Temporal Transformer-Based Fusion Framework for Morphological Arrhythmia Classification
    Anjum, Nafisa
    Sathi, Khaleda Akhter
    Hossain, Md. Azad
    Dewan, M. Ali Akber
    COMPUTERS, 2023, 12 (03)