MUAN: Multiscale Upsampling Aggregation Network for 3-D Point Cloud Segmentation

Cited by: 2
Authors
Dai, Jiaxi [1 ]
Zhang, Youbing [1 ]
Bi, Dong [1 ]
Lan, Jianping [1 ]
Affiliations
[1] Hubei Univ Automot Technol, Sch Automot Engineers, Shiyan 442002, Hubei, Peoples R China
Keywords
Semantics; Point cloud compression; Three-dimensional displays; Feature extraction; Decoding; Kernel; Fuses; 3-D point cloud segmentation; Attentive ScoreNet (ASN); bilateral feature fusion (BFF); multiscale upsampling aggregation (MUA); semantic gap;
DOI
10.1109/LGRS.2022.3185299
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
Three-dimensional point cloud semantic segmentation is of great significance for self-driving and virtual reality (VR), and it is an important research topic in 3-D vision. In this letter, we present a multiscale upsampling aggregation network (MUAN) to improve semantic segmentation performance in complex 3-D environments. We take position adaptive convolution (PAConv) as the backbone of MUAN. First, to overcome the weak capture ability of the multilayer perceptron (MLP) in low-dimensional space, we introduce an Attentive ScoreNet (ASN) module that combines a point feature enrichment (PFE) module and an attention mechanism to provide effective distribution scores for the weight matrices. Second, to address the semantic gap between the encoders and decoders, we present a novel multiscale upsampling aggregation (MUA) module that expands the receptive field before semantic prediction and fuses multiscale features containing information from multiscale encoder blocks and decoder blocks. Third, to help the MUA module achieve better results, we design a bilateral feature fusion (BFF) module, which aims to boost the global awareness of the network in each decoder block. The combination of BFF and MUA attends to multiscale features of the encoder and decoder, thereby improving both the semantic representation and the likelihood of correct semantic prediction for the point cloud. Experimental results show that MUAN improves on PAConv by at least 2.36% mean intersection over union (mIoU) and 0.5% class mIoU on the Stanford 3D indoor scene dataset (S3DIS) Area-5 and ShapeNet datasets.
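For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection over union (mIoU) can be computed from per-point labels. It is not taken from the paper: the function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration, and the letter's own evaluation protocol (for example, how class mIoU is averaged on ShapeNet) may differ.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average per-class IoU over classes that appear in pred or target.

    Illustrative helper (hypothetical name), not the paper's evaluation code.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # points where pred == target == c
    pred_count = [0] * num_classes     # points predicted as class c
    target_count = [0] * num_classes   # points labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    predicted = [0, 0, 1, 1, 2, 2]
    labelled = [0, 1, 1, 1, 2, 0]
    print(mean_iou(predicted, labelled, num_classes=3))
```

In the toy example, the per-class IoU values are 1/3, 2/3, and 1/2, so their mean is 0.5. An improvement such as the one reported in the abstract refers to a difference in this averaged quantity between two models evaluated on the same data.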
Pages: 5