Swin-CFNet: An Attempt at Fine-Grained Urban Green Space Classification Using Swin Transformer and Convolutional Neural Network

Cited by: 2
Authors
Wu, Yehong [1 ]
Zhang, Meng [1 ]
Affiliations
[1] Cent South Univ Forestry & Technol, Coll Forestry, Changsha 410004, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transformers; Green products; Feature extraction; Convolution; Convolutional neural networks; Biological system modeling; Remote sensing; Classification; remote sensing imagery; swin transformer; swin transformer-CNN-fusion-network (Swin-CFNet); urban green space;
DOI
10.1109/LGRS.2024.3404393
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry];
Subject Classification Code
0708 ; 070902 ;
Abstract
Urban green space plays a critical role in contemporary urban planning and ecology: it provides recreational space for residents, promotes ecological balance, and enhances the quality of the urban environment. However, rapid urbanization poses increasingly complex challenges to the monitoring and management of these spaces. Previous studies have shown that semantic segmentation models based on convolutional neural networks (CNNs) perform well in classifying urban green space from high-resolution remote sensing images. However, CNN models still struggle to capture the global context of green space and to handle the complex spatial relationships that arise from the particular nature of urban environments, such as the fragmentation of green space. Hence, a swin transformer-CNN-fusion-network (Swin-CFNet) was proposed for urban green space classification; it overcomes the limitations of traditional methods in handling global green space information and complex spatial relationships by constructing a residual-swin-fusion (RSF) module that fuses multisource features. Experimental results demonstrated that Swin-CFNet outperformed UNet in urban green space classification, achieving an overall accuracy (OA) of 98.3% and improving the mean intersection over union (mIoU) over UNet and SwinUnet by 3.7% and 1%, respectively.
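The abstract does not spell out the internals of the RSF module, but the general idea of fusing local CNN features with global Swin Transformer features can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the function name, the concatenate-project-residual structure, and all shapes are assumptions, written with plain NumPy so the sketch stays self-contained.

```python
import numpy as np

def residual_swin_fusion(cnn_feat, swin_feat, w_proj):
    """Hypothetical RSF-style fusion step (the paper's exact design is not
    given here): concatenate the CNN (local) and Swin (global) feature maps
    along the channel axis, project back to the CNN channel width (analogue
    of a 1x1 convolution), and add a residual connection to the CNN branch."""
    fused = np.concatenate([cnn_feat, swin_feat], axis=-1)  # (H, W, C1 + C2)
    projected = fused @ w_proj                              # (H, W, C1)
    return projected + cnn_feat                             # residual connection

# Toy shapes: an 8x8 spatial grid with 16-channel CNN and Swin feature maps.
rng = np.random.default_rng(0)
cnn = rng.standard_normal((8, 8, 16))
swin = rng.standard_normal((8, 8, 16))
w = rng.standard_normal((32, 16)) * 0.1
out = residual_swin_fusion(cnn, swin, w)
print(out.shape)  # (8, 8, 16)
```

The residual connection mirrors the "residual" in the module's name: the fused global context is added on top of the CNN features rather than replacing them, so local detail is preserved.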
Pages: 5
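The two metrics reported for Swin-CFNet, overall accuracy (OA) and mean intersection over union (mIoU), can both be computed from a class confusion matrix. The sketch below uses an invented 3-class toy matrix, not the paper's data:

```python
import numpy as np

def oa_and_miou(conf):
    """OA = correctly classified pixels / all pixels (trace over total);
    mIoU = mean over classes of TP / (TP + FP + FN)."""
    conf = np.asarray(conf, dtype=float)
    oa = np.trace(conf) / conf.sum()
    tp = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp  # TP + FP + FN per class
    return oa, (tp / union).mean()

# Toy 3-class confusion matrix (rows: reference labels, cols: predictions).
cm = [[50, 2, 1],
      [3, 40, 2],
      [1, 1, 45]]
oa, miou = oa_and_miou(cm)
print(round(oa, 3), round(miou, 3))  # 0.931 0.87
```

Note that OA can be high even when mIoU is mediocre on imbalanced classes, which is why the abstract reports both.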