MCTNet: Multiscale Cross-Attention-Based Transformer Network for Semantic Segmentation of Large-Scale Point Cloud

Cited by: 9
Authors
Guo, Bo [1]
Deng, Liwei [2]
Wang, Ruisheng [3, 4]
Guo, Wenchao [1]
Ng, Alex Hay-Man [1]
Bai, Wenfeng [5]
Affiliations
[1] Guangdong Univ Technol, Sch Civil & Transportat Engn, Guangzhou 510006, Peoples R China
[2] Guilin Univ Technol, Coll Geomatics & Geoinformat, Guilin 541004, Peoples R China
[3] Shenzhen Univ, Sch Architecture & Urban Planning, Shenzhen 518060, Peoples R China
[4] Univ Calgary, Fac Schulich, Sch Engn, Calgary, AB T2N 1N4, Canada
[5] Res Inst Co Ltd, Geotech Branch Guangzhou Metro Design, Guangzhou 510010, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cross-attention; long-range dependency; point cloud; segmentation; CLASSIFICATION;
DOI
10.1109/TGRS.2023.3322579
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708; 070902;
Abstract
In this work, we implement a hybrid method that aggregates both fine-grained and globally contextual features for point cloud semantic segmentation with a hierarchical network. To overcome the limitation that convolution operations mainly extract low-level features, we combine them with higher level cross-attention-based transformers to investigate the importance of long-range relations together with position embedding for multiscale feature representation. Specifically, by adding a learnable token to the feature sequence of a layer, a transformer encoder is first implemented with limited scope to embed these features. Furthermore, instead of performing all-to-all attention, we fuse only tokens spanning various scales. To improve efficiency, we propose a simple yet efficient token-fusing architecture based on cross-attention, in which the computation of attention maps is restricted to linear time by using only a token to calculate the query. The cross-attention module can be efficiently integrated into a multiscale network to further enlarge the receptive field of attention. Experiments show that our multiscale cross-attention-based transformer network (MCTNet) achieves promising results on the three largest point cloud datasets: DALES, DublinCity, and S3DIS. For the DALES benchmark dataset, MCTNet improves the mean intersection-over-union (mIoU) to 83.3% and the overall accuracy (OA) to 98.3%, outperforming existing baselines. We also perform extensive ablation studies on various attention and normalization modules and discuss the effect of parameters to validate the descriptive power of cross-attention modules and provide an understanding of how long-range dependency can be used to learn fair and unbiased features.
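The linear-time claim in the abstract rests on using a single token as the only query. The sketch below illustrates that general idea and is not the authors' implementation: the function name `token_cross_attention`, the projection matrices `w_q`, `w_k`, `w_v`, the residual update, and the toy sizes are all assumptions made for illustration. With one query and n features, only n attention scores are computed, rather than the n × n scores of all-to-all attention.

```python
import math
import random


def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def token_cross_attention(token, features, w_q, w_k, w_v):
    """Single-query cross-attention (hypothetical helper).

    token:    one d-dimensional vector, e.g. the learnable token of one scale
    features: n d-dimensional vectors, e.g. features from another scale
    Only the token forms a query, so n scores are computed, not n * n.
    """
    q = matvec(w_q, token)
    keys = [matvec(w_k, f) for f in features]
    values = [matvec(w_v, f) for f in features]
    scale = math.sqrt(len(q))
    # One dot product per feature: cost grows linearly with n.
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
    weights = softmax(scores)
    fused = [sum(w * v[i] for w, v in zip(weights, values))
             for i in range(len(q))]
    # Assumed residual connection: the fused context updates the token.
    updated = [t + f for t, f in zip(token, fused)]
    return updated, weights


def random_matrix(rng, rows, cols):
    """Small random projection matrix standing in for learned weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d, n = 4, 6  # toy feature width and number of features
token = [rng.uniform(-1, 1) for _ in range(d)]
features = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
w_q, w_k, w_v = (random_matrix(rng, d, d) for _ in range(3))

updated, weights = token_cross_attention(token, features, w_q, w_k, w_v)
print(len(weights), round(sum(weights), 6))
```

The attention map here is a single row of length n, which is what keeps the cost linear in the number of features; a full self-attention layer over the same features would need an n × n map.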
Pages: 20
Related papers
50 in total
  • [41] LPFE-Net: a local parallel feature extraction network for large-scale point cloud semantic segmentation
    Ai, Da
    Zhang, Xiaoyang
    Xu, Ce
    Liu, Xinlong
    Yuan, Hui
    Liu, Ying
    JOURNAL OF ELECTRONIC IMAGING, 2023, 32 (05)
  • [42] HFA-Net: hybrid feature-aware network for large-scale point cloud semantic segmentation
    Wen, Changji
    Zhang, Long
    Ren, Junfeng
    Hong, Rundong
    Li, Chenshuang
    Yang, Ce
    Lv, Yanfeng
    Chen, Hongbing
    Yang, Ning
    ARTIFICIAL INTELLIGENCE REVIEW, 2025, 58 (04)
  • [43] DCNet: Large-Scale Point Cloud Semantic Segmentation With Discriminative and Efficient Feature Aggregation
    Yin, Fukun
    Huang, Zilong
    Chen, Tao
    Luo, Guozhong
    Yu, Gang
    Fu, Bin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (08) : 4083 - 4095
  • [44] Multi-view Network with Transformer for Point Cloud Semantic Segmentation
    Hua, Zhongwei
    Du, Daming
    6TH INTERNATIONAL CONFERENCE ON INNOVATION IN ARTIFICIAL INTELLIGENCE, ICIAI2022, 2022, : 161 - 165
  • [45] Point cloud semantic segmentation based on local feature fusion and multilayer attention network
    Wen, Junjie
    Ma, Jie
    Zhao, Yuehua
    Nie, Tong
    Sun, Mengxuan
    Fan, Ziming
    IET COMPUTER VISION, 2024, 18 (03) : 381 - 392
  • [46] PointBiMssc: Bidirectional Multiscale Attention-Based Point Cloud Semantic Segmentation for Water Conservancy Environment
    Zhou, Wei
    Jiao, Jianbin
    Xu, Haixia
    Wei, Mingan
    Zhao, Xueqiang
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2024, 21
  • [47] Deep Graph Attention Convolution Network for Point Cloud Semantic Segmentation
    Chai Yujing
    Ma Jie
    Liu Hong
    LASER & OPTOELECTRONICS PROGRESS, 2021, 58 (12)
  • [48] Large-scale point cloud semantic segmentation via local perception and global descriptor vector
    Zeng, Ziyin
    Xu, Yongyang
    Xie, Zhong
    Tang, Wei
    Wan, Jie
    Wu, Weichao
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 246
  • [49] A 3D Semantic Segmentation Method for Large-Scale Point Cloud on Deep Learning
    Liu, Sihan
    Zhang, Wenyu
    Zhang, Yujun
    Wang, Zhijian
    Gao, Dongxiang
    ENGINEERING LETTERS, 2023, 31 (04) : 1667 - 1674
  • [50] Perturbed Self-Distillation: Weakly Supervised Large-Scale Point Cloud Semantic Segmentation
    Zhang, Yachao
    Qu, Yanyun
    Xie, Yuan
    Li, Zonghao
    Zheng, Shanshan
    Li, Cuihua
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 15500 - 15508