MFSA-Net: Semantic Segmentation With Camera-LiDAR Cross-Attention Fusion Based on Fast Neighbor Feature Aggregation

Cited by: 0
Authors
Duan, Yijian [1]
Meng, Liwen [1]
Meng, Yanmei [1]
Zhu, Jihong [2]
Zhang, Jiacheng [1]
Zhang, Jinlai [3]
Liu, Xin [1]
Affiliations
[1] Guangxi Univ, Coll Mech Engn, Nanning 530004, Peoples R China
[2] Tsinghua Univ, Dept Precis Instrument, Beijing 100000, Peoples R China
[3] Changsha Univ Sci & Technol, Coll Automot & Mech Engn, Changsha 410114, Peoples R China
Keywords
Cross-attention; LiDAR point clouds; multimodal; semantic segmentation; NETWORK
DOI
10.1109/JSTARS.2024.3472751
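Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809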
Abstract
Given the inherent limitations of camera-only and LiDAR-only methods for semantic segmentation in large-scale complex environments, multimodal information fusion has become a focal point of current research. However, significant modal disparities often leave existing fusion-based methods with low segmentation accuracy and limited efficiency in such environments. To address these challenges, we propose MFSA-Net, a semantic segmentation network with camera-LiDAR cross-attention fusion based on fast neighbor feature aggregation, which is better suited for large-scale semantic segmentation in complex environments. First, we propose a dual-distance attention feature aggregation module based on rapid 3-D nearest neighbor search. This module applies a sliding window to the perspective projection of the point cloud for fast neighbor search, and efficiently combines feature distance and Euclidean distance information to learn more distinctive local features. This improves segmentation accuracy while preserving computational efficiency. Furthermore, we propose a residual-based two-stream cross-attention fusion network, which integrates camera information into the LiDAR data stream more effectively, enhancing both accuracy and robustness. Extensive experiments on the large-scale point cloud datasets SemanticKITTI and nuScenes demonstrate that the proposed algorithm outperforms similar algorithms in semantic segmentation performance in large-scale complex environments.
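The sliding-window neighbor search and the dual-distance weighting described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `window_knn` and `dual_distance_weights`, the index-image layout (`proj_idx` holding one point index per pixel, or -1 for empty pixels), and the fixed softmax over summed distances (standing in for the learned attention) are all assumptions made for illustration.

```python
import numpy as np

def window_knn(points, proj_idx, k=4, window=5):
    """Approximate k-NN by searching only a window around each projected pixel.

    points:   (N, 3) array of 3-D coordinates.
    proj_idx: (H, W) int array; each pixel holds a point index or -1 if empty
              (hypothetical layout of the perspective projection).
    Returns an (N, k) array of neighbor indices, padded with -1.
    """
    H, W = proj_idx.shape
    r = window // 2
    # Pad with -1 so that windows at the image border stay in bounds.
    padded = np.pad(proj_idx, r, mode="constant", constant_values=-1)
    neighbors = np.full((len(points), k), -1, dtype=np.int64)
    for v in range(H):
        for u in range(W):
            i = proj_idx[v, u]
            if i < 0:
                continue
            cand = padded[v:v + window, u:u + window].ravel()
            cand = cand[(cand >= 0) & (cand != i)]
            if cand.size == 0:
                continue
            # Rank the window candidates by true 3-D Euclidean distance.
            d = np.linalg.norm(points[cand] - points[i], axis=1)
            order = np.argsort(d)[:k]
            neighbors[i, :order.size] = cand[order]
    return neighbors

def dual_distance_weights(points, feats, i, nbrs):
    """Softmax weights from Euclidean plus feature distance.

    Assumption: a fixed softmax replaces the learned attention of the paper.
    """
    nbrs = nbrs[nbrs >= 0]
    if nbrs.size == 0:
        return nbrs, np.empty(0)
    d_xyz = np.linalg.norm(points[nbrs] - points[i], axis=1)
    d_feat = np.linalg.norm(feats[nbrs] - feats[i], axis=1)
    logits = -(d_xyz + d_feat)
    w = np.exp(logits - logits.max())
    return nbrs, w / w.sum()
```

Restricting candidates to a fixed window makes the cost per point depend on the window size rather than on the total number of points, which is the efficiency argument the abstract makes; the trade-off is that true nearest neighbors falling outside the window are missed.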
Pages: 19627-19639
Number of pages: 13