PVI-Net: Point-Voxel-Image Fusion for Semantic Segmentation of Point Clouds in Large-Scale Autonomous Driving Scenarios

Cited by: 3
Authors
Wang, Zongshun [1]
Li, Ce [1]
Ma, Jialin [1]
Feng, Zhiqiang [1]
Xiao, Limei [1]
Affiliations
[1] Lanzhou Univ Technol, Sch Elect Engn & Informat Engn, Lanzhou 730050, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
semantic segmentation; multi-perspective; cross-attention; LiDAR point clouds
DOI
10.3390/info15030148
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this study, we introduce PVI-Net, a framework for the semantic segmentation of point clouds in autonomous driving scenarios. The framework integrates three data perspectives (point clouds, voxels, and distance maps) and extracts features through three parallel branches. Within these branches, we design a point cloud-voxel cross-attention mechanism and a multi-perspective point-image feature fusion strategy. These strategies enable information interaction across the feature dimensions of the different perspectives, improving both the fusion of information from the various viewpoints and the overall performance of the model. The network employs a U-Net structure with residual connections to merge and encode information, improving the precision and efficiency of semantic segmentation. We validated PVI-Net on the SemanticKITTI and nuScenes datasets. The results show that PVI-Net surpasses most previous methods on various performance metrics.
Pages: 16