3DGTN: 3-D Dual-Attention GLocal Transformer Network for Point Cloud Classification and Segmentation

Cited by: 6
Authors
Lu, Dening [1]
Gao, Kyle [1]
Xie, Qian [2]
Xu, Linlin [1]
Li, Jonathan [3,4]
Affiliations
[1] Univ Waterloo, Dept Syst Design Engn, Waterloo, ON N2L 3G1, Canada
[2] Univ Oxford, Dept Comp Sci, Oxford OX1 3QD, England
[3] Univ Waterloo, Dept Syst Design Engn, Waterloo, ON N2L 3G1, Canada
[4] Univ Waterloo, Dept Geog & Environm Management, Waterloo, ON N2L 3G1, Canada
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024, Vol. 62
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Point cloud compression; Feature extraction; Three-dimensional displays; Decoding; Task analysis; Representation learning; Graph convolution; LiDAR data processing; point cloud classification; point cloud segmentation; self-attention mechanism; transformer;
DOI
10.1109/TGRS.2024.3393845
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry];
Discipline Codes
0708; 070902;
Abstract
Although the application of Transformers to 3-D point cloud processing has achieved significant progress, it remains challenging for existing 3-D Transformer methods to efficiently and accurately learn both valuable global and local features. This article presents a novel point cloud representation learning network, called the 3-D Dual Self-Attention Global Local (GLocal) Transformer Network (3DGTN), for improved feature learning in both classification and segmentation tasks, with the following key contributions. First, a GLocal feature learning (GFL) block with a dual self-attention mechanism [i.e., a novel point-patch self-attention (PPSA) and a channel-wise self-attention (CSA)] is designed to efficiently learn global and local context information. Second, the GFL block is integrated with a multiscale graph-convolution-based local feature aggregation (LFA) block, yielding a GLocal information extraction module that captures critical information efficiently. Third, a series of GLocal modules is used to construct a new hierarchical encoder-decoder structure, enabling the learning of information at different scales. The proposed framework is evaluated on both classification and segmentation datasets, demonstrating that it outperforms many state-of-the-art methods on both synthetic and LiDAR data. Our code has been released at https://github.com/d62lu/3DGTN.
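To make the dual self-attention idea concrete, below is a minimal PyTorch sketch of the channel-wise branch (CSA), which attends over the C feature channels rather than the N points. The module name, tensor shapes, scaling, and residual wiring are illustrative assumptions, not the authors' implementation; the actual model is in the repository linked above.

```python
# Minimal sketch of channel-wise self-attention (CSA) for point features.
# Assumed input layout: (B, N, C) = batch, points, channels. Illustrative
# only; see https://github.com/d62lu/3DGTN for the authors' real code.
import torch
import torch.nn as nn


class ChannelWiseSelfAttention(nn.Module):
    """Self-attention computed across feature channels instead of points."""

    def __init__(self, channels: int):
        super().__init__()
        self.q = nn.Linear(channels, channels, bias=False)
        self.k = nn.Linear(channels, channels, bias=False)
        self.v = nn.Linear(channels, channels, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.q(x), self.k(x), self.v(x)
        # (B, C, N) @ (B, N, C) -> (B, C, C): channel-to-channel affinity,
        # scaled by sqrt(N) since the contraction runs over the N points.
        attn = torch.softmax(q.transpose(1, 2) @ k / x.shape[1] ** 0.5, dim=-1)
        # Re-mix channels: (B, N, C) @ (B, C, C) -> (B, N, C), plus residual.
        return x + v @ attn.transpose(1, 2)


if __name__ == "__main__":
    points = torch.randn(2, 1024, 64)  # toy batch of per-point features
    csa = ChannelWiseSelfAttention(64)
    print(csa(points).shape)           # torch.Size([2, 1024, 64])
```

In the paper's design, this channel-wise attention complements the point-patch self-attention (PPSA), which instead attends over the point/patch tokens; together they form the dual self-attention of the GFL block.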
Pages: 1-13
Page count: 13