Rethinking Transformers for Semantic Segmentation of Remote Sensing Images

Cited by: 55
Authors
Liu, Yuheng [1]
Zhang, Yifan [1]
Wang, Ye [1]
Mei, Shaohui [1]
Affiliations
[1] Northwestern Polytechnical University, School of Electronics and Information, Xi'an 710129, People's Republic of China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2023 / Vol. 61
Funding
National Natural Science Foundation of China;
Keywords
Encoder-decoder structure; global-local transformer; remote sensing (RS); semantic segmentation; CONVOLUTIONAL NETWORKS;
DOI
10.1109/TGRS.2023.3302024
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
Transformers have been widely applied in image processing tasks as a substitute for convolutional neural networks (CNNs) for feature extraction, owing to their superiority in global context modeling and flexibility in model generalization. However, existing transformer-based methods for semantic segmentation of remote sensing (RS) images still have several limitations, which fall into two main groups: 1) the transformer encoder is generally combined with a CNN-based decoder, leading to inconsistency in feature representations; and 2) the strategies for using global and local context information are not sufficiently effective. Therefore, in this article, a global-local transformer segmentor (GLOTS) framework is proposed for the semantic segmentation of RS images. It acquires consistent feature representations by adopting transformers for both encoding and decoding: a masked image modeling (MIM) pretrained transformer encoder learns semantic-rich representations of input images, and a multiscale global-local transformer decoder is designed to fully exploit the global and local features. Specifically, the transformer decoder uses a feature separation-aggregation module (FSAM) to make adequate use of features at different scales, and adopts a global-local attention module (GLAM), containing a global attention block (GAB) and a local attention block (LAB), to capture the global and local context information, respectively. Furthermore, a learnable progressive upsampling strategy (LPUS) is proposed to restore the resolution progressively, which can flexibly recover fine-grained details in the upsampling process. Experimental results on three benchmark RS datasets demonstrate that the proposed GLOTS achieves better performance than some state-of-the-art methods, and the superiority of the proposed framework is also verified by ablation studies. The code will be available at https://github.com/lyhnsn/GLOTS.
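The distinction the abstract draws between global and local context can be illustrated with a minimal NumPy sketch. This is not the authors' GLAM, GAB, or LAB: the identity query/key/value projections, the non-overlapping square windows, the fusion by summation, and every function name below are assumptions made only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(tokens):
    """Scaled dot-product self-attention over the second-to-last axis.

    tokens has shape (..., N, C). Q, K and V are the tokens themselves
    (identity projections, an assumption to keep the sketch short).
    """
    scale = tokens.shape[-1] ** -0.5
    weights = softmax(tokens @ np.swapaxes(tokens, -1, -2) * scale, axis=-1)
    return weights @ tokens


def global_attention(feat):
    # Every pixel attends to every other pixel of the (H, W, C) feature map.
    h, w, c = feat.shape
    return attention(feat.reshape(h * w, c)).reshape(h, w, c)


def local_attention(feat, window):
    # Each pixel attends only to pixels inside its own window x window tile.
    h, w, c = feat.shape
    assert h % window == 0 and w % window == 0, "H and W must be multiples of window"
    nh, nw = h // window, w // window
    tiles = feat.reshape(nh, window, nw, window, c).transpose(0, 2, 1, 3, 4)
    tiles = attention(tiles.reshape(nh, nw, window * window, c))
    tiles = tiles.reshape(nh, nw, window, window, c).transpose(0, 2, 1, 3, 4)
    return tiles.reshape(h, w, c)


def global_local(feat, window=4):
    # Hypothetical fusion: sum the two branches (the paper's fusion may differ).
    return global_attention(feat) + local_attention(feat, window)


rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8, 16))
out = global_local(feat, window=4)
```

In this sketch the global branch costs O((HW)^2) in the number of pixels, while the local branch restricts each pixel to its own tile, which is the usual motivation for combining the two.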
Pages: 15
Related papers
50 in total
  • [1] Distilling Segmenters From CNNs and Transformers for Remote Sensing Images' Semantic Segmentation
    Dong, Zhe
    Gao, Guoming
    Liu, Tianzhu
    Gu, Yanfeng
    Zhang, Xiangrong
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [2] Novel Convolutions for Semantic Segmentation of Remote Sensing Images
    Xiao, Ruijie
    Zhong, Chuan
    Zeng, Wankang
    Cheng, Ming
    Wang, Cheng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [3] Semantic Segmentation of Images Obtained by Remote Sensing of the Earth
    Igonin, Dmitry M.
    Tiumentsev, Yury V.
    ADVANCES IN NEURAL COMPUTATION, MACHINE LEARNING, AND COGNITIVE RESEARCH III, 2020, 856 : 309 - 318
  • [4] A NEW SEMANTIC SEGMENTATION MODEL FOR REMOTE SENSING IMAGES
    Wei, Xin
    Guo, Yajing
    Gao, Xin
    Yan, Menglong
    Sun, Xian
    2017 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2017, : 1776 - 1779
  • [5] Semantic Segmentation With Attention Mechanism for Remote Sensing Images
    Zhao, Qi
    Liu, Jiahui
    Li, Yuewen
    Zhang, Hong
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [6] Semantic Segmentation of Remote Sensing Images With Sparse Annotations
    Hua, Yuansheng
    Marcos, Diego
    Mou, Lichao
    Zhu, Xiao Xiang
    Tuia, Devis
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [7] Threshold Attention Network for Semantic Segmentation of Remote Sensing Images
    Long, Wei
    Zhang, Yongjun
    Cui, Zhongwei
    Xu, Yujie
    Zhang, Xuexue
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [8] A Frequency Decoupling Network for Semantic Segmentation of Remote Sensing Images
    Li, Xin
    Xu, Feng
    Yu, Anzhu
    Lyu, Xin
    Gao, Hongmin
    Zhou, Jun
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [9] Convolutional Neural Network for the Semantic Segmentation of Remote Sensing Images
    Muhammad Alam
    Jian-Feng Wang
    Cong Guangpei
    LV Yunrong
    Yuanfang Chen
    Mobile Networks and Applications, 2021, 26 : 200 - 215
  • [10] A Synergistical Attention Model for Semantic Segmentation of Remote Sensing Images
    Li, Xin
    Xu, Feng
    Liu, Fan
    Lyu, Xin
    Tong, Yao
    Xu, Zhennan
    Zhou, Jun
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61