Full Point Encoding for Local Feature Aggregation in 3-D Point Clouds

Cited by: 1
Authors
He, Yong [1]
Yu, Hongshan [1]
Yang, Zhengeng [2]
Liu, Xiaoyan [1]
Sun, Wei [1]
Mian, Ajmal [3]
Affiliations
[1] Hunan Univ, Quanzhou Inst Ind Design & Machine Intelligence In, Coll Elect & Informat Engn, Sch Robot, Changsha 410082, Peoples R China
[2] Hunan Normal Univ, Coll Engn & Design, Changsha 410082, Peoples R China
[3] Univ Western Australia, Dept Comp Sci, Perth, WA 6009, Australia
Funding
National Natural Science Foundation of China;
Keywords
3D point clouds; convolution; deep learning; global context; local features; transformer; SEGMENTATION; NETWORK;
DOI
10.1109/TNNLS.2024.3409891
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Point cloud processing methods exploit local point features and global context through aggregation, which does not explicitly model the internal correlations between local and global features. To address this problem, we propose full point encoding, which is applicable to both convolution and transformer architectures. Specifically, we propose full point convolution (FuPConv) and full point transformer (FPTransformer) architectures. The key idea is to adaptively learn the weights from local and global geometric connections, where the connections are established through local and global correlation functions, respectively. FuPConv and FPTransformer simultaneously model the local and global geometric relationships as well as their internal correlations, demonstrating strong generalization ability and high performance. FuPConv is incorporated in classical hierarchical network architectures to achieve local and global shape-aware learning. In FPTransformer, we introduce full point position encoding in self-attention, which hierarchically encodes each point position in the global and local receptive fields. We also propose a shape-aware downsampling block that takes into account the local shape and the global context. Experimental comparisons with existing methods on benchmark datasets show the efficacy of FuPConv and FPTransformer for semantic segmentation, object detection, classification, and normal estimation tasks. In particular, we achieve state-of-the-art semantic segmentation results of 76.8% mIoU on S3DIS sixfold and 73.1% on S3DIS Area 5. Our code is available at https://github.com/hnuhyuwa/FullPointTransformer.
Pages: 8867-8881
Number of pages: 15