MCTNet: Multiscale Cross-Attention-Based Transformer Network for Semantic Segmentation of Large-Scale Point Cloud

Cited by: 9
Authors
Guo, Bo [1]
Deng, Liwei [2]
Wang, Ruisheng [3, 4]
Guo, Wenchao [1]
Ng, Alex Hay-Man [1]
Bai, Wenfeng [5]
Affiliations
[1] Guangdong Univ Technol, Sch Civil & Transportat Engn, Guangzhou 510006, Peoples R China
[2] Guilin Univ Technol, Coll Geomatics & Geoinformat, Guilin 541004, Peoples R China
[3] Shenzhen Univ, Sch Architecture & Urban Planning, Shenzhen 518060, Peoples R China
[4] Univ Calgary, Fac Schulich, Sch Engn, Calgary, AB T2N 1N4, Canada
[5] Res Inst Co Ltd, Geotech Branch Guangzhou Metro Design, Guangzhou 510010, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2023 / Vol. 61
Funding
National Natural Science Foundation of China
Keywords
Cross-attention; long-range dependency; point cloud; segmentation; CLASSIFICATION;
DOI
10.1109/TGRS.2023.3322579
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification code
0708; 070902
Abstract
In this work, we implement a hybrid method to utilize sufficient information by aggregating both fine-grained and globally contextual features for point cloud semantic segmentation with a hierarchical network. By surpassing the defects of convolution operation mainly for extracting low-level features, we combine higher level cross-attention-based transformers to investigate the importance of long-range relations together with position embedding for multiscale feature representation. Specifically, by adding a learnable token to the feature sequence of a layer, a transformer encoder is first implemented with limited scope to embed these features. Furthermore, instead of performing all-to-all attention, we merely fuse tokens spanning various scales. To improve efficiency, we propose a simple yet efficient token-fusing architecture based on cross-attention, in which the computation of attention maps can be restricted within linear time by only using a token to calculate the query. The cross-attention module can be efficiently aggregated in a multiscale network to further enlarge the scope of the receptive field for attention. Experiments show that our multiscale cross-attention-based transformer network (MCTNet) achieves promising results on the three largest point cloud datasets, DALES, DublinCity, and S3DIS datasets. For the DALES benchmark dataset, MCTNet improves the mean intersection-over-union (mIoU) to 83.3% and the overall accuracy (OA) to 98.3%, which outperforms other existing baselines. We also perform abundant ablation studies on various attention and normalization modules and discuss the effect of parameters to validate the descriptive power of cross-attention modules and provide an understanding of how long-range dependency can be used to learn fair and unbiased features.
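The linear-time claim in the abstract can be illustrated with a small sketch. When a single token forms the query, attention against the N tokens of another scale needs only N dot products, instead of the N × N of all-to-all attention. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the function name `token_cross_attention` is hypothetical, identity projections stand in for the learned query/key/value weights, and the residual connection is an assumed design choice that the abstract does not confirm.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def token_cross_attention(query_token, other_scale_tokens):
    """Cross-attention from ONE query token to the N tokens of another scale.

    Assumptions (not from the paper): identity projections replace the
    learned W_q, W_k, W_v, and the fused result is added back residually.
    Cost is O(N * d): one dot product per key, so linear in N.
    """
    d = len(query_token)
    scale = 1.0 / math.sqrt(d)
    # One scaled dot-product score per token of the other scale.
    scores = [
        scale * sum(q * k for q, k in zip(query_token, key))
        for key in other_scale_tokens
    ]
    weights = softmax(scores)
    # Weighted sum of the other scale's tokens (the values).
    fused = [
        sum(w * tok[j] for w, tok in zip(weights, other_scale_tokens))
        for j in range(d)
    ]
    # Assumed residual connection: update the query token with the fused context.
    updated = [q + f for q, f in zip(query_token, fused)]
    return updated, weights


# Hypothetical toy input: one coarse-scale token, three fine-scale tokens.
coarse_token = [1.0, 0.0]
fine_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
updated, weights = token_cross_attention(coarse_token, fine_tokens)
```

Because only one row of the attention map is computed, the work grows linearly with the number of tokens at the other scale, which is the property the abstract attributes to using a token to calculate the query.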
Pages: 20
Related papers
50 in total
  • [41] PointNAC: Copula-Based Point Cloud Semantic Segmentation Network
    Deng, Chunyuan
    Chen, Ruixing
    Tang, Wuyang
    Chu, Hexuan
    Xu, Gang
    Cui, Yue
    Peng, Zhenyun
    SYMMETRY-BASEL, 2023, 15 (11)
  • [42] 3D Large-Scale Point Cloud Semantic Segmentation Using Optimal Feature Description Vector Network: OFDV-Net
    Jian Li
    Quan Sun
    Chen, Keru
    Hao Cui
    Kuan Huangfu
    Chen, Xiaolong
    IEEE ACCESS, 2020, 8 (08): 226285-226296
  • [43] Semantic segmentation of large-scale point cloud scenes via dual neighborhood feature and global spatial-aware
    Liu, Tao
    Ma, Tianen
    Du, Ping
    Li, Dehui
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2024, 129
  • [44] Classification and Segmentation of Mining Area Objects in Large-Scale Spares Lidar Point Cloud Using a Novel Rotated Density Network
    Yan, Yueguan
    Yan, Haixu
    Guo, Junting
    Dai, Huayang
    ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2020, 9 (03)
  • [45] GAF-Net: Geometric Contextual Feature Aggregation and Adaptive Fusion for Large-Scale Point Cloud Semantic Segmentation
    Zhou, Ce
    Ling, Qiang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [46] Recurrent Residual Dual Attention Network for Airborne Laser Scanning Point Cloud Semantic Segmentation
    Zeng, Tao
    Luo, Fulin
    Guo, Tan
    Gong, Xiuwen
    Xue, Jingyun
    Li, Hanshan
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [47] TFNet: point cloud Semantic Segmentation Network based on Triple feature extraction
    Li, Yong
    Chen, Falin
    Lin, Qi
    Li, Zhen
    Gao, Dongxu
    Yang, Jingchao
    GEOCARTO INTERNATIONAL, 2025, 40 (01)
  • [48] UnrollingNet: An attention-based deep learning approach for the segmentation of large-scale point clouds of tunnels
    Zhang, Zhaoxiang
    Ji, Ankang
    Wang, Kunyu
    Zhang, Limao
    AUTOMATION IN CONSTRUCTION, 2022, 142
  • [49] A multi-point focus transformer approach for large-scale ALS point cloud ground filtering
    Liu, Tongyang
    Wei, Bo
    Hao, Jiaojiao
    Li, Zexia
    Ye, Fuqiang
    Wang, Lili
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2025: 979-999
  • [50] MLFNet- Point Cloud Semantic Segmentation Convolution Network Based on Multi-Scale Feature Fusion
    Yang, Jingfang
    Zou, Bochang
    Qiu, Huadong
    Li, Zhi
    IEEE ACCESS, 2021, 9: 44950-44962