UM2Former: U-Shaped Multimixed Transformer Network for Large-Scale Hyperspectral Image Semantic Segmentation

Cited by: 1
Authors
Xu, Aijun [1]
Xue, Zhaohui [1]
Li, Ziyu [1]
Cheng, Shun [1]
Su, Hongjun [1]
Xia, Junshi [2]
Affiliations
[1] Hohai Univ, Coll Geog & Remote Sensing, Nanjing 211100, Peoples R China
[2] RIKEN, Ctr Adv Intelligence Project AIP, Tokyo 1030027, Japan
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2025 / Vol. 63
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Transformers; Semantic segmentation; Semantics; Decoding; Convolutional neural networks; Data mining; Accuracy; Attention mechanisms; Convolutional codes; Hierarchical structure; large-scale hyperspectral image (HSI); positional encoding (PE); semantic segmentation; Transformer; CLASSIFICATION;
DOI
10.1109/TGRS.2025.3543821
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification codes
0708; 070902;
Abstract
Transformer-based deep learning (DL) methods have been increasingly advocated for remote sensing (RS) image semantic segmentation owing to their strong global modeling capability. Nevertheless, they have not yet been sufficiently explored for large-scale hyperspectral image (HSI) semantic segmentation. Current algorithms do not fully consider the impact of positional encoding (PE) interpolation when constructing Transformer-based decoders. Moreover, existing segmentation heads usually concatenate multiscale features directly to achieve segmentation, which ignores the inherent semantic differences between different features. To address these issues, a U-shaped multimixed Transformer network (UM2Former) is proposed for large-scale HSI semantic segmentation. First, a weight encoder consisting of two modules, the overlap-down and the channel-weight, is built to extract hierarchical discriminative spectral-spatial features and reduce spectral redundancy. Second, the proposed multimixed Transformer block (MMTB) introduces a PE-free module, the spatial-feature-retention attention (SFRA) mechanism, in which "multimixed" refers to modeling the global dependency between each pixel and the retained average spatial characteristics of different locations in the input feature maps. Finally, a linear fuse segmentation head (LFSH) is designed to align semantic information among multiscale feature maps and achieve accurate segmentation. Experiments were conducted on single cities and on the entire large-scale WHU-OHS HSI dataset. The segmentation results indicate that the proposed method achieves higher accuracy than existing semantic segmentation methods, with performance improvements of 17.80% and 4.16% in mean intersection over union (mIoU) and overall accuracy (OA), respectively. The source code will be available at https://github.com/ZhaohuiXue/UM2Former.
Pages: 21