VPA-Net: A visual perception assistance network for 3D LiDAR semantic segmentation

Cited by: 0
Authors
Lin, Fangfang [1, 2]
Lin, Tianliang [1, 2]
Yao, Yu [3]
Ren, Haoling [1, 2]
Wu, Jiangdong [1, 2]
Cai, Qipeng [1, 2]
Affiliations
[1] Huaqiao Univ, Coll Mech Engn & Automat, Xiamen 361021, Peoples R China
[2] Fujian Key Lab Green Intelligent Drive & Transmiss, Xiamen 361021, Peoples R China
[3] Beihang Univ, Sch Mech Engn & Automat, Beijing 102206, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-sensor fusion; Semantic segmentation; 3D point cloud; Autonomous driving; Intelligent perception; Dataset;
DOI
10.1016/j.patcog.2024.111014
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Semantic segmentation of 3D point clouds is central to visual perception tasks in autonomous driving, including obstacle avoidance, decision control, path planning, and map construction. Multi-sensor fusion is a rational and pivotal technique for LiDAR semantic segmentation, but effectively fusing and utilizing multi-source data remains challenging. In this work, we introduce a novel visual perception-assisted point cloud segmentation network, termed VPA-Net. The network uses a dual-branch design that processes multi-modal data from point clouds and RGB images, combining spatial information from the point clouds with visual cues from the images to improve 3D LiDAR semantic segmentation. The intermediate features from the two branches are merged by the proposed attention-based feature fusion module. Furthermore, to address the challenge of precise boundary prediction in large-scale point cloud scene segmentation, we introduce a refinement module based on 3D sparse convolution that enhances the spatial information of the LiDAR point clouds. The method is validated on SemanticKITTI and on a more challenging 3D semantic segmentation dataset. Experimental results show a significant improvement on SemanticKITTI, where our approach surpasses the state-of-the-art method, achieving a 7.1% higher mIoU than SalsaNext.
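The abstract reports results as mIoU (mean intersection-over-union). As a reading aid, the following is a minimal sketch of how that metric is commonly computed from per-point predictions and labels. The function name `mean_iou`, the `ignore_label` parameter, and the choice to leave classes that never occur out of the mean are assumptions made for illustration; they are not taken from the paper or from any benchmark toolkit.

```python
def mean_iou(pred, target, num_classes, ignore_label=None):
    """Mean intersection-over-union over classes that occur in pred or target.

    Illustrative sketch only: names and the handling of absent classes
    are assumptions, not the authors' evaluation code.
    """
    # confusion[t][p] counts points with ground-truth class t predicted as class p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # skip unlabeled points
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]                                    # true positives
        fp = sum(confusion[t][c] for t in range(num_classes)) - tp  # false positives
        fn = sum(confusion[c]) - tp                             # false negatives
        denom = tp + fp + fn
        if denom > 0:  # a class absent from both pred and target is left out
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six points.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.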
Pages: 10