PVI-Net: Point-Voxel-Image Fusion for Semantic Segmentation of Point Clouds in Large-Scale Autonomous Driving Scenarios

Citations: 3
Authors
Wang, Zongshun [1]
Li, Ce [1]
Ma, Jialin [1]
Feng, Zhiqiang [1]
Xiao, Limei [1]
Affiliations
[1] Lanzhou Univ Technol, Sch Elect Engn & Informat Engn, Lanzhou 730050, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
semantic segmentation; multi-perspective; cross-attention; LiDAR point clouds
DOI
10.3390/info15030148
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this study, we introduce PVI-Net, a novel framework for the semantic segmentation of point clouds in autonomous driving scenarios. The framework integrates three data perspectives (point clouds, voxels, and distance maps) and extracts features through three parallel branches. Within these branches, we design a point cloud-voxel cross-attention mechanism and a multi-perspective feature fusion strategy for point images. These strategies enable information to interact across the feature dimensions of the different perspectives, which improves the fusion of information from the various viewpoints and enhances the overall performance of the model. The network employs a U-Net structure with residual connections to merge and encode information, improving the precision and efficiency of semantic segmentation. We validated PVI-Net on the SemanticKITTI and nuScenes datasets. The results show that PVI-Net surpasses most previous methods across various performance metrics.
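As a rough illustration of the three data perspectives named in the abstract, the sketch below maps a single LiDAR point to a voxel index and to a pixel of a spherical range image. It is not taken from the paper. It assumes that the "distance maps" are range images produced by the spherical projection commonly used for rotating LiDAR sensors, and the function names, voxel size, image resolution, and vertical field of view are hypothetical values chosen only for the example.

```python
import math


def voxel_index(point, voxel_size=0.5):
    """Return the integer (i, j, k) index of the voxel containing the point.

    voxel_size is an assumed edge length in metres, not a value from the paper.
    """
    return tuple(math.floor(c / voxel_size) for c in point)


def range_pixel(point, width=2048, height=64,
                fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (x, y, z) point onto a spherical range image.

    Returns (row, col, range). The image size and vertical field of view
    are assumptions typical of a 64-beam sensor, not values from the paper.
    """
    x, y, z = point
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("the sensor origin has no defined direction")

    yaw = math.atan2(y, x)      # horizontal angle in [-pi, pi]
    pitch = math.asin(z / r)    # vertical angle

    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down

    # Normalise both angles to [0, 1], then scale to image coordinates.
    u = 0.5 * (1.0 - yaw / math.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height

    # Clamp so points on or outside the field-of-view edge stay in the image.
    col = min(width - 1, max(0, math.floor(u)))
    row = min(height - 1, max(0, math.floor(v)))
    return row, col, r


if __name__ == "__main__":
    p = (10.0, 0.0, 0.0)  # a point 10 m straight ahead of the sensor
    print("voxel:", voxel_index(p))
    print("range pixel:", range_pixel(p))
```

Under these assumptions, each raw point keeps its own coordinates for the point branch, while the voxel index and the range-image pixel give the locations at which a voxel branch and an image branch would aggregate or look up features for that same point.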
Pages: 16
Related papers
36 in total
  • [1] CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding
    Afham, Mohamed
    Dissanayake, Isuru
    Dissanayake, Dinithi
    Dharmasiri, Amaya
    Thilakarathna, Kanchana
    Rodrigo, Ranga
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 9892 - 9902
  • [2] DGCNN: A convolutional neural network over large-scale labeled graphs
    Anh Viet Phan
    Minh Le Nguyen
    Yen Lam Hoang Nguyen
    Lam Thu Bui
    [J]. NEURAL NETWORKS, 2018, 108 : 533 - 543
  • [3] Semantic labeling of lidar point clouds for UAV applications
    Axelsson, Maria
    Holmberg, Max
    Serra, Sabina
    Ovren, Hannes
    Tulldahl, Michael
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 4309 - 4316
  • [4] PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection
    Chen, Anthony
    Zhang, Kevin
    Zhang, Renrui
    Wang, Zihan
    Lu, Yuheng
    Guo, Yandong
    Zhang, Shanghang
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 5291 - 5301
  • [5] Chenfeng Xu, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12373), P1, DOI 10.1007/978-3-030-58604-1_1
  • [6] (AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network
    Cheng, Ran
    Razani, Ryan
    Taghavi, Ehsan
    Li, Enxu
    Liu, Bingbing
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 12542 - 12551
  • [7] 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks
    Choy, Christopher
    Gwak, JunYoung
    Savarese, Silvio
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 3070 - 3079
  • [8] Cortinhal Tiago, 2020, Advances in Visual Computing. 15th International Symposium, ISVC 2020. Proceedings. Lecture Notes in Computer Science (LNCS 12510), P207, DOI 10.1007/978-3-030-64559-5_16
  • [9] Cui MY, 2023, AAAI CONF ARTIF INTE, P470
  • [10] Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis
    Dai, Angela
    Qi, Charles Ruizhongtai
    Niessner, Matthias
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 6545 - 6554