MCTNet: Multiscale Cross-Attention-Based Transformer Network for Semantic Segmentation of Large-Scale Point Cloud

Cited by: 9

Authors:

Guo, Bo [1]

Deng, Liwei [2]

Wang, Ruisheng [3,4]

Guo, Wenchao [1]

Ng, Alex Hay-Man [1]

Bai, Wenfeng [5]

Affiliations:

[1] Guangdong Univ Technol, Sch Civil & Transportat Engn, Guangzhou 510006, Peoples R China

[2] Guilin Univ Technol, Coll Geomatics & Geoinformat, Guilin 541004, Peoples R China

[3] Shenzhen Univ, Sch Architecture & Urban Planning, Shenzhen 518060, Peoples R China

[4] Univ Calgary, Fac Schulich, Sch Engn, Calgary, AB T2N 1N4, Canada

[5] Res Inst Co Ltd, Geotech Branch Guangzhou Metro Design, Guangzhou 510010, Peoples R China

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2023 / Vol. 61

Funding:

National Natural Science Foundation of China;

Keywords:

Cross-attention; long-range dependency; point cloud; segmentation; CLASSIFICATION;

DOI:

10.1109/TGRS.2023.3322579

Chinese Library Classification (CLC) number:

P3 [Geophysics]; P59 [Geochemistry];

Discipline classification code:

0708 ; 070902 ;

Abstract:

In this work, we implement a hybrid method for point cloud semantic segmentation that aggregates both fine-grained and globally contextual features in a hierarchical network. To overcome the limitation that convolution operations mainly extract low-level features, we combine them with higher-level cross-attention-based transformers and investigate the importance of long-range relations, together with position embedding, for multiscale feature representation. Specifically, a learnable token is added to the feature sequence of a layer, and a transformer encoder with limited scope is first applied to embed these features. Furthermore, instead of performing all-to-all attention, we fuse only the tokens spanning various scales. To improve efficiency, we propose a simple yet efficient token-fusing architecture based on cross-attention, in which the computation of attention maps is restricted to linear time by using only a token to calculate the query. The cross-attention module can be efficiently integrated into a multiscale network to further enlarge the receptive field of attention. Experiments show that our multiscale cross-attention-based transformer network (MCTNet) achieves promising results on the three largest point cloud datasets: DALES, DublinCity, and S3DIS. On the DALES benchmark dataset, MCTNet improves the mean intersection-over-union (mIoU) to 83.3% and the overall accuracy (OA) to 98.3%, outperforming existing baselines. We also perform extensive ablation studies on various attention and normalization modules and discuss the effect of parameters, both to validate the descriptive power of the cross-attention modules and to provide an understanding of how long-range dependency can be used to learn fair and unbiased features.
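
The linear-time claim in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the residual connection, the projection matrices, and the sizes below are all assumptions chosen for illustration. The only point it shows is that when a single token supplies the query, the attention map has one row of n scores, so the cost grows linearly with the number of features n instead of quadratically as in all-to-all attention.

```python
import math
import random


def matvec(mat, vec):
    """Multiply a (d x d) matrix, stored as a list of rows, by a length-d vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def token_cross_attention(token, features, w_q, w_k, w_v):
    """Fuse one token with n feature vectors using a single-query cross-attention.

    token:    length-d list, the (hypothetical) learnable token of one scale
    features: list of n length-d lists, features of another scale
    Only the token is projected to a query, so exactly n scores are computed.
    """
    q = matvec(w_q, token)
    keys = [matvec(w_k, f) for f in features]
    values = [matvec(w_v, f) for f in features]

    # Scaled dot-product scores: one per feature, i.e. O(n) work.
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]

    # Numerically stable softmax over the n scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of values, added back to the token (residual is an assumption).
    context = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(q))]
    fused = [t + c for t, c in zip(token, context)]
    return fused, weights


# Illustrative sizes only; random values stand in for learned parameters.
random.seed(0)
d, n = 8, 256
token = [random.gauss(0, 1) for _ in range(d)]
features = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
w_q, w_k, w_v = (
    [[random.gauss(0, 1) / math.sqrt(d) for _ in range(d)] for _ in range(d)]
    for _ in range(3)
)

fused, weights = token_cross_attention(token, features, w_q, w_k, w_v)
print(len(fused), len(weights))  # 8 256
```

In all-to-all self-attention every one of the n features would issue its own query, giving an n-by-n attention map. Here the map is 1-by-n, which is the sense in which a single-token query keeps the attention computation linear in n.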

Pages: 20
