MCTNet: Multiscale Cross-Attention-Based Transformer Network for Semantic Segmentation of Large-Scale Point Cloud

Cited by: 9
Authors
Guo, Bo [1 ]
Deng, Liwei [2 ]
Wang, Ruisheng [3 ,4 ]
Guo, Wenchao [1 ]
Ng, Alex Hay-Man [1 ]
Bai, Wenfeng [5 ]
Affiliations
[1] Guangdong Univ Technol, Sch Civil & Transportat Engn, Guangzhou 510006, Peoples R China
[2] Guilin Univ Technol, Coll Geomatics & Geoinformat, Guilin 541004, Peoples R China
[3] Shenzhen Univ, Sch Architecture & Urban Planning, Shenzhen 518060, Peoples R China
[4] Univ Calgary, Fac Schulich, Sch Engn, Calgary, AB T2N 1N4, Canada
[5] Res Inst Co Ltd, Geotech Branch Guangzhou Metro Design, Guangzhou 510010, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cross-attention; long-range dependency; point cloud; segmentation; CLASSIFICATION;
DOI
10.1109/TGRS.2023.3322579
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry];
Subject classification codes
0708 ; 070902 ;
Abstract
In this work, we implement a hybrid method that aggregates both fine-grained and globally contextual features for point cloud semantic segmentation with a hierarchical network. To overcome the limitation of convolution operations, which mainly extract low-level features, we combine them with higher-level cross-attention-based transformers to investigate the importance of long-range relations, together with position embedding, for multiscale feature representation. Specifically, by adding a learnable token to the feature sequence of a layer, a transformer encoder is first implemented with limited scope to embed these features. Furthermore, instead of performing all-to-all attention, we fuse only tokens spanning various scales. To improve efficiency, we propose a simple yet efficient token-fusing architecture based on cross-attention, in which the computation of attention maps is restricted to linear time by using only a token to calculate the query. The cross-attention module can be efficiently incorporated into a multiscale network to further enlarge the receptive field of attention. Experiments show that our multiscale cross-attention-based transformer network (MCTNet) achieves promising results on the three largest point cloud datasets: DALES, DublinCity, and S3DIS. On the DALES benchmark dataset, MCTNet improves the mean intersection-over-union (mIoU) to 83.3% and the overall accuracy (OA) to 98.3%, outperforming existing baselines. We also perform extensive ablation studies on various attention and normalization modules and discuss the effect of parameters to validate the descriptive power of cross-attention modules and to provide an understanding of how long-range dependency can be used to learn fair and unbiased features.
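The linear-time claim in the abstract can be illustrated with a minimal sketch: when a single token supplies the query, only n scores are computed against n key/value tokens, rather than the n × n map of all-to-all attention. This is an illustration under stated assumptions, not the authors' implementation: the function name and the toy inputs are hypothetical, and the learned query/key/value projections, multiple heads, and position embedding that a real transformer would use are omitted.

```python
import math

def single_query_cross_attention(query_token, tokens):
    """Attend from one query token to n tokens in O(n * d) time.

    Hypothetical helper: the tokens serve as both keys and values,
    and no learned projections are applied.
    """
    d = len(query_token)
    # One scaled dot-product score per key: n scores, not n * n.
    scores = [sum(q * k for q, k in zip(query_token, tok)) / math.sqrt(d)
              for tok in tokens]
    # Numerically stable softmax over the n scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the value vectors gives the fused token.
    fused = [sum(w * tok[j] for w, tok in zip(weights, tokens))
             for j in range(d)]
    return fused, weights

# Toy example: a token from one scale queries three tokens from another scale.
coarse_token = [1.0, 0.0]
fine_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = single_query_cross_attention(coarse_token, fine_tokens)
```

Because the cost grows with the number of tokens rather than its square, the same operation can be repeated across several scales at modest expense, which is the property the abstract relies on.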
Pages: 20
Related papers
50 in total
  • [21] RailPC: A large-scale railway point cloud semantic segmentation dataset
    Jiang, Tengping
    Li, Shiwei
    Zhang, Qinyu
    Wang, Guangshuai
    Zhang, Zequn
    Zeng, Fankun
    An, Peng
    Jin, Xin
    Liu, Shan
    Wang, Yongjun
    CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2024, 9 (06) : 1548 - 1560
  • [22] Retrieval-and-alignment based large-scale indoor point cloud semantic segmentation
    Zongyi XU
    Xiaoshui HUANG
    Bo YUAN
    Yangfu WANG
    Qianni ZHANG
    Weisheng LI
    Xinbo GAO
    Science China(Information Sciences), 2024, 67 (04) : 164 - 180
  • [23] Retrieval-and-alignment based large-scale indoor point cloud semantic segmentation
    Xu, Zongyi
    Huang, Xiaoshui
    Yuan, Bo
    Wang, Yangfu
    Zhang, Qianni
    Li, Weisheng
    Gao, Xinbo
    SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (04)
  • [24] Cascaded Contextual Reasoning for Large-Scale Point Cloud Semantic Segmentation
    Zhang, Fengyi
    Xia, Xiuyu
    IEEE ACCESS, 2023, 11 : 20755 - 20768
  • [25] TempNet: Online Semantic Segmentation on Large-scale Point Cloud Series
    Zhou, Yunsong
    Zhu, Hongzi
    Li, Chunqin
    Cui, Tiankai
    Chang, Shan
    Guo, Minyi
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 7098 - 7107
  • [26] CSFNet: Cross-Modal Semantic Focus Network for Semantic Segmentation of Large-Scale Point Clouds
    Luo, Yang
    Han, Ting
    Liu, Yujun
    Su, Jinhe
    Chen, Yiping
    Li, Jinyuan
    Wu, Yundong
    Cai, Guorong
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [27] General Local Graph Attention in Large-scale Point Cloud Segmentation
    Tran, Anh-Thuan
    Le, Hoanh-Su
    Kwon, Oh-Joon
    Lee, Suk-Hwan
    Kwon, Ki-Ryong
    2023 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS, ICCE, 2023,
  • [28] CaSaFormer: A cross- and self-attention based lightweight network for large-scale building semantic segmentation
    Li, Jiayi
    Hu, Yuping
    Huang, Xin
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2024, 130
  • [29] A large-scale point cloud semantic segmentation neural network based on long-range contextual dependencies enhancement
    Zhang, Hua
    Wan, Ziyang
    Xu, Ruizheng
    Zheng, Nanshan
    Hao, Ming
    REMOTE SENSING LETTERS, 2024, 15 (05) : 501 - 513
  • [30] Efficient Semantic Segmentation for Large-Scale Agricultural Nursery Managements via Point Cloud-Based Neural Network
    Liu, Hui
    Xu, Jie
    Chen, Wen-Hua
    Shen, Yue
    Kai, Jinru
    REMOTE SENSING, 2024, 16 (21)