PVI-Net: Point-Voxel-Image Fusion for Semantic Segmentation of Point Clouds in Large-Scale Autonomous Driving Scenarios

Cited by: 3

Authors:

Wang, Zongshun [1]

Li, Ce [1]

Ma, Jialin [1]

Feng, Zhiqiang [1]

Xiao, Limei [1]

Affiliations:

[1] Lanzhou Univ Technol, Sch Elect Engn & Informat Engn, Lanzhou 730050, Peoples R China

Source:

INFORMATION | 2024 / Volume 15 / Issue 3

Funding:

National Natural Science Foundation of China;

Keywords:

semantic segmentation; multi-perspective; cross-attention; LiDAR point clouds;

DOI:

10.3390/info15030148

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

In this study, we introduce PVI-Net, a novel framework for the semantic segmentation of point clouds in autonomous driving scenarios. The framework integrates three data perspectives (point clouds, voxels, and distance maps) and extracts features through three parallel branches. Within this process, we design a point cloud-voxel cross-attention mechanism and a multi-perspective feature fusion strategy for point images. These strategies enable information interaction across the feature dimensions of the different perspectives, which improves both the fusion of information from the various viewpoints and the overall performance of the model. The network employs a U-Net structure with residual connections to merge and encode information, improving the precision and efficiency of semantic segmentation. We validated PVI-Net on the SemanticKITTI and nuScenes datasets. The results show that PVI-Net surpasses most previous methods on various performance metrics.
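
As a rough illustration of what the three perspectives mean for a single LiDAR point, the sketch below maps raw points to a voxel index and to a pixel of a spherical distance (range) map. It is not PVI-Net's implementation: the function names `voxel_index` and `range_pixel`, the 0.5 m voxel size, the 64 x 2048 image size, and the +3/-25 degree vertical field of view are all assumptions chosen for the example, and the abstract does not state which projection the authors use.

```python
import math


def voxel_index(point, voxel_size=0.5):
    """Voxel view: integer grid cell containing the point (voxel_size is assumed)."""
    x, y, z = point
    return (
        math.floor(x / voxel_size),
        math.floor(y / voxel_size),
        math.floor(z / voxel_size),
    )


def range_pixel(point, width=2048, height=64, fov_up_deg=3.0, fov_down_deg=-25.0):
    """Image view: (row, col, depth) from a spherical projection.

    Image size and field of view are assumed values, not taken from the paper.
    The point must not lie at the sensor origin (depth would be zero).
    """
    x, y, z = point
    depth = math.sqrt(x * x + y * y + z * z)
    yaw = math.atan2(y, x)        # horizontal angle in [-pi, pi]
    pitch = math.asin(z / depth)  # vertical angle

    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down

    # Normalise the angles to image coordinates, then clamp to the image bounds.
    u = 0.5 * (1.0 - yaw / math.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height
    col = min(width - 1, max(0, math.floor(u)))
    row = min(height - 1, max(0, math.floor(v)))
    return row, col, depth


# Point view: the raw coordinates themselves (made-up sample points).
points = [(10.0, 0.0, 0.0), (0.0, 5.0, -1.0), (-3.2, -4.1, 0.3)]
views = [(p, voxel_index(p), range_pixel(p)) for p in points]
for p, vox, (row, col, depth) in views:
    print("point", p, "voxel", vox, "pixel", (row, col), "depth", round(depth, 2))
```

Because every point keeps its own voxel index and pixel coordinates, features computed in the voxel or image branch can be gathered back to the point, which is the kind of correspondence a multi-perspective fusion step relies on.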

References

Pages: 16

36 entries in total

[21] Park C., Efficient Point Transformer for Large-Scale 3D Scene Understanding
[22] Qingyong Hu, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Proceedings, P11105, DOI 10.1109/CVPR42600.2020.01112
[23] Thomas, Hugues; Qi, Charles R.; Deschaud, Jean-Emmanuel; Marcotegui, Beatriz; Goulette, Francois; Guibas, Leonidas J. KPConv: Flexible and Deformable Convolution for Point Clouds. 2019 IEEE/CVF International Conference on Computer Vision (ICCV 2019), 2019: 6420-6429
[24] Wang, Haiyang; Shi, Chen; Shi, Shaoshuai; Lei, Meng; Wang, Sen; He, Di; Schiele, Bernt; Wang, Liwei. DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 13520-13529
[25] Wei, Yi; Zhao, Linqing; Zheng, Wenzhao; Zhu, Zheng; Zhou, Jie; Lu, Jiwen. SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving. 2023 IEEE/CVF International Conference on Computer Vision (ICCV 2023), 2023: 21672-21683
[26] Wu BC, 2018, IEEE INT CONF ROBOT, P1887
[27] Xia Y., 2023, P IEEECVF INT C COMP, P8461
[28] Xu, Jianyun; Zhang, Ruixiang; Dou, Jian; Zhu, Yushi; Sun, Jie; Pu, Shiliang. RPVNet: A Deep and Efficient Range-Point-Voxel Fusion Network for LiDAR Point Cloud Segmentation. 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), 2021: 16004-16013
[29] Yan X., 2022, ADV NEUR IN
[30] Yan, Xu; Gao, Jiantao; Zheng, Chaoda; Zheng, Chao; Zhang, Ruimao; Cui, Shuguang; Li, Zhen. 2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds. Computer Vision - ECCV 2022, Pt XXVIII, 2022, 13688: 677-695
