UNeXt: An Efficient Network for the Semantic Segmentation of High-Resolution Remote Sensing Images

Cited by: 1
Authors
Chang, Zhanyuan [1]
Xu, Mingyu [1]
Wei, Yuwen [1]
Lian, Jie [1]
Zhang, Chongming [1]
Li, Chuanjiang [1]
Affiliations
[1] Shanghai Normal Univ, Coll Informat Mech & Elect Engn, Shanghai 200234, Peoples R China
Funding
Natural Science Foundation of Shanghai
Keywords
high-resolution remote sensing images; real-time semantic segmentation; convolutional attention; global-local context; transformer
DOI
10.3390/s24206655
Chinese Library Classification
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
The application of deep neural networks to the semantic segmentation of remote sensing images is a significant research area within the intelligent interpretation of remote sensing data. Semantic segmentation of remote sensing images has great practical value in urban planning, disaster assessment, the estimation of carbon sinks, and related fields. As remote sensing technology advances, the spatial resolution of remote sensing images keeps increasing. Higher resolution brings challenges such as large variations in the scale of ground objects, redundant information, and irregular object shapes. Current methods leverage Transformers to capture global long-range dependencies. However, Transformers introduce higher computational complexity and are prone to losing local details. In this paper, we propose UNeXt (UNet+ConvNeXt+Transformer), a real-time semantic segmentation model tailored for high-resolution remote sensing images. To achieve efficient segmentation, UNeXt uses the lightweight ConvNeXt-T as the encoder and a lightweight decoder, Transnext, which combines a Transformer and a convolutional neural network (CNN) to capture global information while avoiding the loss of local details. Furthermore, to make more effective use of spatial and channel information, we propose an SCFB (SC Feature Fuse Block) that reduces computational complexity while enhancing the model's recognition of complex scenes. A series of ablation experiments and comprehensive comparative experiments demonstrate that our method not only runs faster than state-of-the-art (SOTA) lightweight models but also achieves higher accuracy. Specifically, UNeXt achieves mIoU scores of 85.2% and 82.9% on the Vaihingen and Gaofen5 (GID5) datasets, respectively, while maintaining 97 fps for 512 × 512 inputs on a single NVIDIA RTX 4090 GPU, outperforming other SOTA methods.
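The abstract reports accuracy as mIoU (mean intersection over union) but this record does not say how the authors computed it. As a rough illustration only, the standard definition can be sketched as follows: per-class IoU is TP / (TP + FP + FN), averaged over the classes that occur. The function names `confusion_matrix` and `mean_iou` are hypothetical helpers for this sketch, not part of the UNeXt code.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN).

    Assumption of this sketch: classes that appear in neither the labels
    nor the predictions are skipped rather than counted as zero.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with flattened label maps of four pixels and two classes.
labels = [0, 0, 1, 1]
preds = [0, 1, 1, 1]
score = mean_iou(confusion_matrix(labels, preds, num_classes=2))
print(f"mIoU = {score:.4f}")
```

In practice the confusion matrix is accumulated over every pixel of the test set before the IoU is taken; benchmark protocols differ on which classes are included in the average, so reported figures should be compared only under the same protocol.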
Pages: 18
Related papers
50 in total
  • [1] A Frequency Attention-Enhanced Network for Semantic Segmentation of High-Resolution Remote Sensing Images
    Zhong, Jianyi
    Zeng, Tao
    Xu, Zhennan
    Wu, Caifeng
    Qian, Shangtuo
    Xu, Nan
    Chen, Ziqi
    Lyu, Xin
    Li, Xin
    REMOTE SENSING, 2025, 17 (03)
  • [2] Multiscale Global Context Network for Semantic Segmentation of High-Resolution Remote Sensing Images
    Zeng, Qiaolin
    Zhou, Jingxiang
    Tao, Jinhua
    Chen, Liangfu
    Niu, Xuerui
    Zhang, Yumeng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 13
  • [3] SEMANTIC SEGMENTATION OF HIGH-RESOLUTION REMOTE SENSING IMAGES USING AN IMPROVED TRANSFORMER
    Liu, Yuheng
    Mei, Shaohui
    Zhang, Shun
    Wang, Ye
    He, Mingyi
    Du, Qian
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022 : 3496 - 3499
  • [4] Class-Guidance Network Based on the Pyramid Vision Transformer for Efficient Semantic Segmentation of High-Resolution Remote Sensing Images
    Du, Shuang
    Liu, Maohua
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2023, 16 : 5578 - 5589
  • [5] HBRNet: Boundary Enhancement Segmentation Network for Cropland Extraction in High-Resolution Remote Sensing Images
    Sheng, Jiajia
    Sun, Youqiang
    Huang, He
    Xu, Wenyu
    Pei, Haotian
    Zhang, Wei
    Wu, Xiaowei
    AGRICULTURE-BASEL, 2022, 12 (08)
  • [6] Enhanced Lightweight End-to-End Semantic Segmentation for High-Resolution Remote Sensing Images
    Dong, He
    Yu, Baoguo
    Wu, Wanqing
    He, Chenglong
    IEEE ACCESS, 2022, 10 : 70947 - 70954
  • [7] CNN-transformer dual branch collaborative model for semantic segmentation of high-resolution remote sensing images
    Zhu, Xiaotong
    Peng, Taile
    Guo, Jia
    Wang, Hao
    Cao, Taotao
    PHOTOGRAMMETRIC RECORD, 2025, 40 (189)
  • [8] ASPP+-LANet: A Multi-Scale Context Extraction Network for Semantic Segmentation of High-Resolution Remote Sensing Images
    Hu, Lei
    Zhou, Xun
    Ruan, Jiachen
    Li, Supeng
    REMOTE SENSING, 2024, 16 (06)
  • [9] PGNet: Positioning Guidance Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Images
    Liu, Bo
    Hu, Jinwu
    Bi, Xiuli
    Li, Weisheng
    Gao, Xinbo
    REMOTE SENSING, 2022, 14 (17)
  • [10] MGFNet: An MLP-dominated gated fusion network for semantic segmentation of high-resolution multi-modal remote sensing images
    Wei, Kan
    Dai, Jinkun
    Hong, Danfeng
    Ye, Yuanxin
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2024, 135