CrossFuser: Multi-Modal Feature Fusion for End-to-End Autonomous Driving Under Unseen Weather Conditions

Cited by: 17
Authors
Wu, Weishang [1 ,2 ]
Deng, Xiaoheng [1 ,2 ]
Jiang, Ping [1 ,2 ]
Wan, Shaohua [3 ]
Guo, Yuanxiong [4 ]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China
[2] Shenzhen Res Inst, Shenzhen 518000, Peoples R China
[3] Univ Elect Sci & Technol China, Shenzhen Inst Adv Study, Shenzhen 518110, Peoples R China
[4] Univ Texas San Antonio, Dept Informat Syst & Cyber Secur, San Antonio, TX 78249 USA
Funding
National Natural Science Foundation of China;
Keywords
End-to-end autonomous driving; multimodal sensor fusion; out-of-training distribution; imitation learning;
DOI
10.1109/TITS.2023.3307589
CLC Classification
TU [Architectural Science];
Subject Classification Code
0813;
Abstract
Multi-modal fusion is a promising approach to boosting autonomous driving performance and has received considerable attention. Meanwhile, to increase driving reliability across distinct scenarios, it is important for autonomous driving algorithms to handle weather conditions unseen in the training dataset, known as the Out-Of-Distribution (OOD) problem. In this paper, we address both aspects and propose an end-to-end multi-modal domain-enhanced framework, namely CrossFuser, to meet safety-oriented driving requirements. CrossFuser first integrates the image and lidar modalities to generate a robust environmental representation through conjoint mapping, elastic disentanglement, and an attention mechanism. The resulting perception embedding is then used to predict waypoints with a waypoint prediction network consisting of Gated Recurrent Units (GRUs). Finally, control commands are computed by low-level control functions. We conduct experiments on the Car Learning to Act (CARLA) driving simulator involving complex weather conditions in urban scenarios; the results show that CrossFuser outperforms the state of the art.
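The abstract describes a GRU-based waypoint head that rolls out future waypoints from the fused perception embedding, a design common to end-to-end driving stacks of this kind. The sketch below is a minimal, hypothetical illustration of that idea, not the paper's implementation: a single NumPy GRU cell is driven autoregressively, with the fused embedding as the initial hidden state and each step emitting an offset from the previous 2-D waypoint. All sizes (`EMBED_DIM`, `N_WAYPOINTS`) and the random weights are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 64    # assumed size of the fused image+lidar embedding
N_WAYPOINTS = 4   # assumed number of predicted future waypoints

# GRU cell parameters: input is the previous 2-D waypoint,
# hidden state carries the perception embedding forward.
Wz, Uz = rng.standard_normal((EMBED_DIM, 2)) * 0.1, rng.standard_normal((EMBED_DIM, EMBED_DIM)) * 0.1
Wr, Ur = rng.standard_normal((EMBED_DIM, 2)) * 0.1, rng.standard_normal((EMBED_DIM, EMBED_DIM)) * 0.1
Wh, Uh = rng.standard_normal((EMBED_DIM, 2)) * 0.1, rng.standard_normal((EMBED_DIM, EMBED_DIM)) * 0.1
Wout = rng.standard_normal((2, EMBED_DIM)) * 0.1   # hidden -> waypoint offset

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h):
    """Standard GRU update: gates z, r and candidate state h_tilde."""
    z = sigmoid(Wz @ x + Uz @ h)
    r = sigmoid(Wr @ x + Ur @ h)
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))
    return (1.0 - z) * h + z * h_tilde

def predict_waypoints(embedding):
    """Autoregressively roll out N_WAYPOINTS 2-D waypoints in the ego frame."""
    h = embedding            # initialize hidden state with the fused embedding
    wp = np.zeros(2)         # start at the ego position (0, 0)
    out = []
    for _ in range(N_WAYPOINTS):
        h = gru_cell(wp, h)
        wp = wp + Wout @ h   # each step predicts an offset from the last waypoint
        out.append(wp.copy())
    return np.stack(out)

emb = rng.standard_normal(EMBED_DIM)
wps = predict_waypoints(emb)
print(wps.shape)  # (4, 2): four (x, y) waypoints
```

In practice the predicted waypoints would then be converted into steering and throttle by the low-level controllers the abstract mentions; that stage is omitted here.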
Pages: 14378-14392 (15 pages)
Related Papers
50 records total
  • [1] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
    Prakash, Aditya
    Chitta, Kashyap
    Geiger, Andreas
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 7073 - 7083
  • [2] Multi-modal policy fusion for end-to-end autonomous driving
    Huang, Zhenbo
    Sun, Shiliang
    Zhao, Jing
    Mao, Liang
    INFORMATION FUSION, 2023, 98
  • [3] Multi-modal information fusion for multi-task end-to-end behavior prediction in autonomous driving
    Guo, Baicang
    Liu, Hao
    Yang, Xiao
    Cao, Yuan
    Jin, Lisheng
    Wang, Yinlin
    NEUROCOMPUTING, 2025, 634
  • [4] Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models
    Wang, Tsun-Hsuan
    Maalouf, Alaa
    Xia, Wei
    Bao, Yutong
    Amini, Alexander
    Rosman, Guy
    Karaman, Sertac
    Rus, Daniela
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024, : 6687 - 6694
  • [5] Multi-Modal Sensor Fusion-Based Deep Neural Network for End-to-End Autonomous Driving With Scene Understanding
    Huang, Zhiyu
    Lv, Chen
    Xing, Yang
    Wu, Jingda
    IEEE SENSORS JOURNAL, 2021, 21 (10) : 11781 - 11790
  • [6] Multi-Modal Fusion for End-to-End RGB-T Tracking
    Zhang, Lichao
    Danelljan, Martin
    Gonzalez-Garcia, Abel
    van de Weijer, Joost
    Khan, Fahad Shahbaz
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 2252 - 2261
  • [7] MMFN: Multi-Modal-Fusion-Net for End-to-End Driving
    Zhang, Qingwen
    Tang, Mingkai
    Geng, Ruoyu
    Chen, Feiyi
    Xin, Ren
    Wang, Lujia
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 8638 - 8643
  • [8] SymmetricNet: end-to-end mesoscale eddy detection with multi-modal data fusion
    Zhao, Yuxiao
    Fan, Zhenlin
    Li, Haitao
    Zhang, Rui
    Xiang, Wei
    Wang, Shengke
    Zhong, Guoqiang
    FRONTIERS IN MARINE SCIENCE, 2023, 10
  • [9] Multi-Modal Data Augmentation for End-to-End ASR
    Renduchintala, Adithya
    Ding, Shuoyang
    Wiesner, Matthew
    Watanabe, Shinji
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2394 - 2398
  • [10] End-to-end Knowledge Retrieval with Multi-modal Queries
    Luo, Man
    Fang, Zhiyuan
    Gokhale, Tejas
    Yang, Yezhou
    Baral, Chitta
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 8573 - 8589