TransFusionOdom: Transformer-Based LiDAR-Inertial Fusion Odometry Estimation

Cited by: 14
Authors
Sun, Leyuan [1 ,2 ]
Ding, Guanqun [3 ]
Qiu, Yue [2 ]
Yoshiyasu, Yusuke [2 ]
Kanehiro, Fumio [1 ,4 ]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, CNRS AIST Joint Robot Lab JRL, IRL, Tsukuba 3058560, Japan
[2] Natl Inst Adv Ind Sci & Technol, Comp Vis Res Team, Artificial Intelligence Res Ctr AIRC, Tsukuba 3058560, Japan
[3] Natl Inst Adv Ind Sci & Technol, Digital Architecture Res Ctr DigiARC, Tokyo 1350064, Japan
[4] Univ Tsukuba, Grad Sch Sci & Technol, Dept Intelligent & Mech Interact Syst, Tsukuba 3050006, Japan
Funding
Japan Society for the Promotion of Science (JSPS);
Keywords
Attention mechanisms; LiDAR-inertial odometry (LIO); multimodal learning; sensor data fusion; transformer; ROBUST; DEPTH; CNN;
DOI
10.1109/JSEN.2023.3302401
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline codes
0808; 0809;
Abstract
Multimodal sensor fusion is a common approach to enhancing odometry estimation, a fundamental module for mobile robots. Recently, learning-based approaches have garnered attention in this field due to their robust, non-handcrafted designs. However, how to perform fusion among different modalities in a supervised sensor-fusion odometry estimation task remains a challenging open question. Simple operations, such as elementwise summation and concatenation, cannot assign adaptive attentional weights to incorporate different modalities efficiently, which makes it difficult to achieve competitive odometry results. Meanwhile, the Transformer architecture has shown potential for multimodal fusion tasks, particularly in vision-language domains. In this work, we propose an end-to-end supervised Transformer-based LiDAR-inertial fusion framework (TransFusionOdom) for odometry estimation. The multiattention fusion module demonstrates different fusion approaches for homogeneous and heterogeneous modalities to address the overfitting that can arise from blindly increasing model complexity. Additionally, to interpret the learning process of the Transformer-based multimodal interactions, a general visualization approach is introduced to illustrate the interactions between modalities. Moreover, exhaustive ablation studies evaluate different multimodal fusion strategies to verify the performance of the proposed fusion strategy. A synthetic multimodal dataset is made public to validate the generalization ability of the proposed fusion strategy, which also works for other combinations of modalities. Quantitative and qualitative odometry evaluations on the KITTI dataset verify that the proposed TransFusionOdom achieves superior performance compared with other learning-based related works.
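The adaptive attentional weighting that the abstract contrasts with elementwise summation and concatenation can be illustrated with a minimal cross-attention sketch. This is an illustrative NumPy example only, not the paper's implementation; the function name, token counts, and feature dimensions are assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(lidar_feats, imu_feats):
    """Fuse heterogeneous modality features with scaled dot-product
    cross-attention: each LiDAR token queries the IMU sequence and
    receives an adaptive weighting over IMU features, instead of a
    fixed summation or concatenation."""
    d = lidar_feats.shape[-1]
    scores = lidar_feats @ imu_feats.T / np.sqrt(d)  # (N_lidar, N_imu)
    weights = softmax(scores, axis=-1)               # rows sum to 1
    attended = weights @ imu_feats                   # (N_lidar, d)
    return lidar_feats + attended                    # residual fusion

# Toy example: 4 LiDAR tokens and 10 IMU tokens, feature dim 8.
rng = np.random.default_rng(0)
lidar = rng.standard_normal((4, 8))
imu = rng.standard_normal((10, 8))
fused = cross_attention_fuse(lidar, imu)
print(fused.shape)  # (4, 8)
```

Unlike concatenation, which fixes how much each modality contributes, the attention weights here are recomputed per input, which is the property the abstract argues is needed for competitive odometry results.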
Pages: 22064-22079
Page count: 16
Related papers
50 records in total
  • [31] Hierarchical Distribution-Based Tightly-Coupled LiDAR Inertial Odometry
    Wang, Chengpeng
    Cao, Zhiqiang
    Li, Jianjie
    Yu, Junzhi
    Wang, Shuo
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01): : 1423 - 1435
  • [32] Adaptive Keyframe Generation based LiDAR Inertial Odometry for Complex Underground Environments
    Kim, Boseong
    Jung, Chanyoung
    Shim, D. Hyunchul
    Agha-mohammadi, Ali-Akbar
    2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA, 2023, : 3332 - 3338
  • [33] Novel Transformer-Based Fusion Models for Aero-Engine Remaining Useful Life Estimation
    Hu, Qiankun
    Zhao, Yongping
    Ren, Lihua
    IEEE ACCESS, 2023, 11 : 52668 - 52685
  • [34] A Transformer-Based Fusion Recommendation Model For IPTV Applications
    Li, Heng
    Lei, Hang
    Yang, Maolin
    Zeng, Jinghong
    Zhu, Di
    Fu, Shouwei
    2020 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA (ICAIBD 2020), 2020, : 177 - 182
  • [35] Pose Estimation Based on Bidirectional Visual-Inertial Odometry with 3D LiDAR (BV-LIO)
    Peng, Gang
    Gao, Qiang
    Xu, Yue
    Li, Jianfeng
    Deng, Zhang
    Li, Cong
    REMOTE SENSING, 2024, 16 (16)
  • [36] A Transformer-Based Channel Estimation Method for OTFS Systems
    Sun, Teng
    Lv, Jiebiao
    Zhou, Tao
    ENTROPY, 2023, 25 (10)
  • [37] Cross-Fusion Transformer-Based Infrared and Visible Image Fusion Method
    Yin, Haitao
    Zhou, Changsheng
    LASER & OPTOELECTRONICS PROGRESS, 2025, 62 (06)
  • [38] TUFusion: A Transformer-Based Universal Fusion Algorithm for Multimodal Images
    Zhao, Yangyang
    Zheng, Qingchun
    Zhu, Peihao
    Zhang, Xu
    Ma, Wenpeng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (03) : 1712 - 1725
  • [39] FormerUnify: Transformer-Based Unified Fusion for Efficient Image Matting
    Wang, Jiaquan
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VIII, 2025, 15038 : 412 - 425
  • [40] A Temporal Transformer-Based Fusion Framework for Morphological Arrhythmia Classification
    Anjum, Nafisa
    Sathi, Khaleda Akhter
    Hossain, Md. Azad
    Dewan, M. Ali Akber
    COMPUTERS, 2023, 12 (03)