GAF-Net: Geometric Contextual Feature Aggregation and Adaptive Fusion for Large-Scale Point Cloud Semantic Segmentation

Cited by: 10
Authors
Zhou, Ce [1, 2]
Ling, Qiang [1, 2]
Affiliations
[1] Univ Sci & Technol China, Dept Automat, Hefei 230027, Anhui, Peoples R China
[2] Inst Artificial Intelligence, Hefei Comprehens Natl Sci Ctr, Hefei 230031, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2023 / Vol. 61
Keywords
Adaptive fusion; geometric representation; point cloud; semantic segmentation; NETWORK;
DOI
10.1109/TGRS.2023.3336053
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification codes
0708 ; 070902 ;
Abstract
Large-scale point cloud semantic segmentation is a challenging task due to the complexity and diversity of real-world 3-D scenes. Most existing methods primarily rely on spatial coordinates to learn geometric representations without fully exploring local structural relationships. Additionally, the semantic gap between the encoder and decoder in segmentation networks is an important factor that constrains model performance. To address these challenges, we propose a novel network architecture called GAF-Net, which comprises a geometric contextual feature aggregation (GCFA) module and a multiscale feature adaptive fusion (MFAF) module. The GCFA module consists of three primary blocks: 1) a geometric edge representation (GER) block, designed to leverage spatial relative position and orientation information between the center point and its neighbors to capture detailed local geometric structural relations; 2) a point geometry prior (PGP) block, a lightweight, parameter-free block that extracts explicit geometric priors for each point from raw point clouds; and 3) a geometry-aware attentive pooling (GAAP) block, which combines semantic features with learned geometric representations, enabling the learning and aggregation of informative local contextual features. Our proposed MFAF module integrates multiscale features by introducing an adaptive fusion approach. It effectively bridges the semantic gap between the encoder and decoder and mitigates the information loss caused by random sampling. Extensive experimental results on three large-scale benchmark datasets.
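The abstract describes encoding relative position and orientation between a center point and its neighbors, then aggregating neighbor features with attention. The NumPy sketch below illustrates that general idea only; it is not the authors' GER or GAAP implementation. The function names (`knn_indices`, `edge_features`, `attentive_pool`), the 7-channel feature layout, and the brute-force neighbor search are all assumptions made for illustration.

```python
import numpy as np


def knn_indices(points, k):
    """Brute-force k nearest neighbors; assumes distinct points (hypothetical helper)."""
    diff = points[:, None, :] - points[None, :, :]   # (N, N, 3) pairwise offsets
    d2 = np.sum(diff ** 2, axis=-1)                  # (N, N) squared distances
    order = np.argsort(d2, axis=1, kind="stable")
    return order[:, 1:k + 1]                         # drop column 0 (the point itself)


def edge_features(points, idx):
    """Per-edge geometry: relative position, distance, unit direction (assumed layout)."""
    neighbors = points[idx]                          # (N, k, 3)
    rel = neighbors - points[:, None, :]             # relative position
    dist = np.linalg.norm(rel, axis=-1, keepdims=True)
    direction = rel / np.maximum(dist, 1e-12)        # orientation as a unit vector
    return np.concatenate([rel, dist, direction], axis=-1)  # (N, k, 7)


def attentive_pool(features, scores):
    """Softmax over the neighbor axis, then a weighted sum of neighbor features."""
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights * features).sum(axis=1)          # (N, C)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(128, 3))
    idx = knn_indices(pts, k=16)
    edges = edge_features(pts, idx)
    # Untrained random projection standing in for a learned scoring layer.
    w = rng.normal(size=(edges.shape[-1], edges.shape[-1]))
    pooled = attentive_pool(edges, edges @ w)
    print(pooled.shape)
```

In a trained network the scores would come from learned layers applied to both semantic and geometric features; the random projection here only exercises the pooling arithmetic.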
Pages: 15