VPA-Net: A visual perception assistance network for 3D LiDAR semantic segmentation

Cited by: 0

Authors:

Lin, Fangfang [1,2]

Lin, Tianliang [1,2]

Yao, Yu [3]

Ren, Haoling [1,2]

Wu, Jiangdong [1,2]

Cai, Qipeng [1,2]

Affiliations:

[1] Huaqiao Univ, Coll Mech Engn & Automat, Xiamen 361021, Peoples R China

[2] Fujian Key Lab Green Intelligent Drive & Transmiss, Xiamen 361021, Peoples R China

[3] Beihang Univ, Sch Mech Engn & Automat, Beijing 102206, Peoples R China

Source:

PATTERN RECOGNITION | 2025 / Vol. 158

Funding:

National Natural Science Foundation of China;

Keywords:

Multi-sensor fusion; Semantic segmentation; 3D point cloud; Autonomous driving; Intelligent perception; Dataset;

DOI:

10.1016/j.patcog.2024.111014

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Semantic segmentation of 3D point clouds is central to visual perception tasks in autonomous driving, including obstacle avoidance, decision control, path planning, and map construction. Multi-sensor fusion is a rational and pivotal technique for LiDAR semantic segmentation, but effectively fusing and using multi-source data remains challenging. In this work, we introduce a novel visual perception-assisted point cloud segmentation network, termed VPA-Net. The network uses a dual-branch design that combines spatial information from point clouds with visual cues from images to improve 3D LiDAR semantic segmentation. More specifically, the two branches process multi-modal data from point clouds and RGB images, and their intermediate features are merged by the proposed attention-based feature fusion module. Furthermore, to address the challenge of precise boundary prediction in large-scale point cloud scene segmentation, we introduce a refinement module based on 3D sparse convolution that enhances the spatial information of the LiDAR point clouds. The method is validated on SemanticKITTI and on a more challenging 3D semantic segmentation dataset. Experimental results show a significant improvement on SemanticKITTI: our approach surpasses the state-of-the-art method, achieving a 7.1% higher mIoU than SalsaNext.
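The abstract reports its result in mIoU (mean intersection-over-union). As a point of reference only, the following is a minimal sketch of how mIoU is commonly computed from per-point class labels. It is not the authors' evaluation code; the function name `mean_iou`, its parameters, and the toy labels are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def mean_iou(pred, target, num_classes, ignore_label=None):
    """Mean IoU over classes, given per-point predicted and true labels.

    Hypothetical helper for illustration; not taken from the paper.
    """
    inter = Counter()         # points where prediction and truth agree, per class
    pred_count = Counter()    # points predicted as each class
    target_count = Counter()  # points whose true label is each class
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # skip points whose true label is excluded from scoring
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            inter[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - inter[c]
        if union == 0:
            continue  # class absent from both prediction and truth
        ious.append(inter[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six points, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoU values are 1/3, 2/3, and 1/2, so their mean is 0.5. Classes that appear in neither the prediction nor the ground truth are skipped here; benchmark toolkits may treat that case differently.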


Pages: 10

