FEST: Feature Enhancement Swin Transformer for Remote Sensing Image Semantic Segmentation

Cited by: 0

Authors:

Zhang, Ronghuan ^{[1,2,3]}

Zhao, Jing ^{[1,2,3]}

Li, Ming ^{[4]}

Zou, Qingzhi ^{[1,2,3]}

Affiliations:

[1] Qilu Univ Technol, Shandong Acad Sci, Shandong Comp Sci Ctr, Key Lab Comp Power Network & Informat Secur, Minis, Jinan, Peoples R China

[2] Qilu Univ Technol, Shandong Acad Sci, Fac Comp Sci & Technol, Shandong Engn Res Ctr Big Data Appl Technol, Jinan, Peoples R China

[3] Shandong Fundamental Res Ctr Comp Sci, Shandong Prov Key Lab Comp Networks, Jinan, Peoples R China

[4] Shandong Univ Tradit Chinese Med, Sch Intelligence & Informat Engn, Jinan, Peoples R China

Source:

PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024 | 2024

Keywords:

global information; semantic segmentation; Swin Transformer

DOI:

10.1109/CSCWD61410.2024.10580494

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

Global context is crucial for the precise segmentation of remote sensing images. However, the large volume and high spatial resolution of remote sensing images make efficient analysis of the entire scene challenging for most convolutional neural network (CNN)-based methods. To address this issue, we propose Feature Enhancement Swin Transformer (FEST), a framework for semantic segmentation of remote sensing images. First, we use the Swin Transformer as the encoder and incorporate a Global Information Enhancement Model (GIEM) within each Swin Transformer block to reduce information loss and encode more accurate spatial information. Second, we introduce an enhanced decoding structure, the Enhanced Feature Fusion Module (EFFM), which adds enhanced channel and spatial attention modules to retain local information while capturing extensive contextual information. Finally, we supervise the model jointly with Dice loss and cross-entropy loss, aiming for competitive performance. We comprehensively evaluated FEST on the ISPRS Vaihingen and Potsdam datasets. The results indicate that our approach achieves significant improvements in semantic segmentation over existing methods.
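The joint Dice and cross-entropy supervision mentioned in the abstract can be illustrated with a minimal sketch. The record does not give the paper's exact formulation, so everything below is an assumption for illustration: the function names, the equal weighting of the two terms, the smoothing constant, and the use of a soft (probability-based) Dice coefficient averaged over classes. It uses plain Python lists rather than a deep learning framework so that it stays self-contained.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true class over all pixels."""
    eps = 1e-12  # assumed guard against log(0)
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)


def dice_loss(probs, labels, num_classes, smooth=1.0):
    """One minus the mean soft Dice coefficient across classes.

    `smooth` is an assumed smoothing constant, not a value from the paper.
    """
    dice_sum = 0.0
    for c in range(num_classes):
        # Predicted probability mass that falls on pixels truly of class c.
        intersection = sum(p[c] for p, y in zip(probs, labels) if y == c)
        pred_mass = sum(p[c] for p in probs)
        true_mass = sum(1 for y in labels if y == c)
        dice_sum += (2.0 * intersection + smooth) / (pred_mass + true_mass + smooth)
    return 1.0 - dice_sum / num_classes


def joint_loss(logits, labels, num_classes, ce_weight=1.0, dice_weight=1.0):
    """Weighted sum of cross-entropy and Dice loss (equal weights assumed)."""
    probs = [softmax(row) for row in logits]
    return (ce_weight * cross_entropy(probs, labels)
            + dice_weight * dice_loss(probs, labels, num_classes))


# Two pixels, two classes: confident logits scored against matching
# and against swapped labels.
good = joint_loss([[20.0, -20.0], [-20.0, 20.0]], [0, 1], num_classes=2)
bad = joint_loss([[20.0, -20.0], [-20.0, 20.0]], [1, 0], num_classes=2)
```

In this sketch the cross-entropy term penalizes each pixel independently, while the Dice term measures region overlap per class, which is a common reason for combining the two when class frequencies are imbalanced.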


Pages: 1177-1182

Page count: 6
