V2I-BEVF: Multi-modal Fusion Based on BEV Representation for Vehicle-Infrastructure Perception

Cited by: 0
Authors
Xiang, Chao [1 ,3 ]
Xie, Xiaopo [1 ]
Feng, Chen [1 ,2 ]
Bai, Zhen
Niu, Zhendong [3 ]
Yang, Mingchuan [1 ]
Affiliations
[1] China Telecom Res Inst, Beijing 102209, Peoples R China
[2] China Telecom Corp Ltd, Technol Innovat Dept, Beijing 100032, Peoples R China
[3] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China
Source
2023 IEEE 26TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, ITSC | 2023
Keywords
VOXELNET;
DOI
10.1109/ITSC57777.2023.10421963
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
As one of the core modules of autonomous driving technology, environment perception has become a hot research topic in industry and academia in recent years. However, self-driving vehicles face safety challenges because of perceptual blind spots and a lack of remote sensing capability. This paper proposes a multi-modal fusion method based on BEV representation for vehicle-infrastructure perception, referred to as V2I-BEVF. It mainly contains two branch networks that extract features from 2D images and 3D point clouds and transform them into BEV features; a Deformable Attention Transformer then fuses and decodes these features to achieve high-precision, real-time perception of road traffic participants. The V2I-BEVF algorithm is experimentally verified on the open-source roadside DAIR-V2X-I dataset from Tsinghua University and Baidu. The experimental results show that, compared with several benchmark algorithms provided with the DAIR-V2X-I dataset, V2I-BEVF achieves a large improvement in pedestrian detection accuracy. We also verified the effectiveness of the proposed method on our own dataset collected from roadside sensor devices. The V2I-BEVF algorithm can be combined with 5G/V2X communication technology and applied to V2I collaborative perception scenarios to take full advantage of the wide field of view and small blind area of roadside perception.
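The idea of moving sensor data into a shared BEV grid before fusion can be illustrated with a minimal sketch. This is not the V2I-BEVF implementation: the function names, grid ranges, and cell size are hypothetical, the point-cloud branch is reduced to a per-cell maximum-height raster, and plain channel stacking stands in for the Deformable Attention Transformer fusion named in the abstract.

```python
import math

def points_to_bev(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0), cell=1.0):
    """Rasterize (x, y, z) points into a BEV grid of per-cell maximum height.

    Hypothetical helper for illustration only. Cells start at 0.0, so
    heights below zero are clamped to 0.0 by the max() update.
    """
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    bev = [[0.0] * ny for _ in range(nx)]
    for x, y, z in points:
        ix = math.floor((x - x_range[0]) / cell)
        iy = math.floor((y - y_range[0]) / cell)
        # Points that fall outside the grid are ignored.
        if 0 <= ix < nx and 0 <= iy < ny:
            bev[ix][iy] = max(bev[ix][iy], z)
    return bev

def fuse_bev(camera_channels, lidar_bev):
    """Stack the LiDAR BEV map as an extra channel after the camera channels.

    Simple concatenation, used here as a stand-in for attention-based fusion.
    """
    return [*camera_channels, lidar_bev]

# Two points share the first cell; the third lies outside the grid.
points = [(0.5, -19.5, 1.2), (0.7, -19.2, 2.0), (100.0, 0.0, 5.0)]
lidar_bev = points_to_bev(points)
camera_channels = [[[1.0] * 40 for _ in range(40)]]  # one dummy camera BEV channel
fused = fuse_bev(camera_channels, lidar_bev)
```

Because both modalities share the same grid after this step, any per-cell fusion operator can be swapped in where `fuse_bev` concatenates channels.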
Pages: 5292-5299
Page count: 8