STransFuse: Fusing Swin Transformer and Convolutional Neural Network for Remote Sensing Image Semantic Segmentation

Cited by: 143

Authors:

Gao, Liang [1,2,3]

Liu, Hui [2,3,4]

Yang, Minhang [1,2,3]

Chen, Long [1,2,3]

Wan, Yaling [1,2,3]

Xiao, Zhengqing [5]

Qian, Yurong [1,2,3]

Affiliations:

[1] Xinjiang Univ, Coll Software, Urumqi 830008, Peoples R China

[2] Key Lab Signal Detect & Proc Xinjiang Uygur Auton, Urumqi 830014, Peoples R China

[3] Key Lab Software Engn, Urumqi 830008, Peoples R China

[4] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi 830014, Peoples R China

[5] Xinjiang Univ, Coll Math & Syst Sci, Urumqi 830014, Peoples R China

Source:

IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING | 2021 / Vol. 14 / Issue 14

Funding:

National Science Foundation (USA); National Natural Science Foundation of China;

Keywords:

Remote sensing; Transformers; Semantics; Image segmentation; Computational modeling; Feature extraction; Context modeling; self-attention; semantic segmentation; Transformer;

DOI:

10.1109/JSTARS.2021.3119654

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Convolutional neural networks (CNNs) have driven applied research on remote sensing images. Because its receptive field has a fixed size, a CNN cannot model global semantic relevance. Transformer-based models with self-attention can model global semantic information. However, the patch-based computation that a Transformer uses for self-attention ignores the spatial information inside each patch. To address these issues, we propose the STransFuse model as a new semantic segmentation method for remote sensing images. It combines the benefits of the Transformer and the CNN to improve the segmentation quality of various remote sensing images. Unlike earlier techniques based on Transformer model fusion, we employ a staged model to extract coarse-grained and fine-grained feature representations at various semantic scales. To take full advantage of the features acquired at different stages, we design an adaptive fusion module, which adaptively fuses the semantic information between features at different scales using a self-attention mechanism. The overall accuracy (OA) of the proposed model is 1.36% higher than the baseline on the Vaihingen dataset and 1.27% higher on the Potsdam dataset. The STransFuse model also compares favorably with other advanced models.
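The abstract reports its results as overall accuracy (OA) but does not define the metric. A minimal sketch of the usual definition (correctly labeled pixels divided by all labeled pixels) might look like the following. The function name `overall_accuracy` and the tiny label lists are hypothetical and for illustration only; they are not data from the paper.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of pixels whose predicted class equals the reference class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if not y_true:
        raise ValueError("at least one pixel is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical flattened label maps (one class id per pixel), not real results.
reference = [0, 0, 1, 1, 2, 2, 2, 1]
baseline = [0, 1, 1, 1, 2, 0, 2, 2]
improved = [0, 0, 1, 1, 2, 0, 2, 1]

oa_baseline = overall_accuracy(reference, baseline)
oa_improved = overall_accuracy(reference, improved)

# Difference between the two models, expressed in percentage points.
gain_points = (oa_improved - oa_baseline) * 100
print(f"baseline OA={oa_baseline:.3f}, improved OA={oa_improved:.3f}, gain={gain_points:.1f} points")
```

Whether the reported 1.36% and 1.27% gains are absolute percentage points, as computed here, or relative improvements is not stated in the abstract.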


Pages: 10990-11003

Page count: 14
