UM2Former: U-Shaped Multimixed Transformer Network for Large-Scale Hyperspectral Image Semantic Segmentation

Cited by: 1

Authors:

Xu, Aijun [1]

Xue, Zhaohui [1]

Li, Ziyu [1]

Cheng, Shun [1]

Su, Hongjun [1]

Xia, Junshi [2]

Affiliations:

[1] Hohai Univ, Coll Geog & Remote Sensing, Nanjing 211100, Peoples R China

[2] RIKEN, Ctr Adv Intelligence Project AIP, Tokyo 1030027, Japan

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2025 / Vol. 63

Funding:

National Natural Science Foundation of China;

Keywords:

Feature extraction; Transformers; Semantic segmentation; Semantics; Decoding; Convolutional neural networks; Data mining; Accuracy; Attention mechanisms; Convolutional codes; Hierarchical structure; large-scale hyperspectral image (HSI); positional encoding (PE); semantic segmentation; Transformer; CLASSIFICATION;

DOI:

10.1109/TGRS.2025.3543821

Chinese Library Classification:

P3 [Geophysics]; P59 [Geochemistry];

Discipline classification codes:

0708 ; 070902 ;

Abstract:

Transformer-based deep learning (DL) methods have increasingly been adopted for remote sensing (RS) image semantic segmentation owing to their strong global modeling capability. Nevertheless, they have not yet been sufficiently explored for large-scale hyperspectral image (HSI) semantic segmentation. Current algorithms do not comprehensively consider the impact of positional encoding (PE) interpolation when constructing Transformer-based decoders. Moreover, existing segmentation heads usually concatenate multiscale features directly to produce the segmentation, which ignores the inherent semantic differences between those features. To address these issues, a U-shaped multimixed Transformer network (UM2Former) is proposed for large-scale HSI semantic segmentation. First, a weight encoder consisting of two modules, overlap-down and channel-weight, is built to extract hierarchical discriminative spectral-spatial features and reduce spectral redundancy. Second, the proposed multimixed Transformer block (MMTB) introduces a PE-free module, the spatial-feature-retention attention (SFRA) mechanism, in which "multimixed" refers to modeling the global dependency of each pixel on the retained average spatial characteristics of different locations in the input feature maps. Finally, a linear fuse segmentation head (LFSH) is designed to align semantic information among multiscale feature maps and achieve accurate segmentation. Experiments were conducted on single cities and on the entire large-scale WHU-OHS HSI dataset. The segmentation results indicate that the proposed method achieves higher accuracy than existing semantic segmentation methods, with improvements of 17.80% and 4.16% in mean intersection over union (mIoU) and overall accuracy (OA), respectively. The source code will be available at https://github.com/ZhaohuiXue/UM2Former.
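
The abstract reports results as mIoU and OA. For readers unfamiliar with these metrics, the following is a minimal sketch of their standard definitions computed from a confusion matrix. It is not the authors' evaluation code; the function names, the three-class setup, and the toy labels are assumptions made purely for illustration.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are reference classes, columns are predicted classes."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Encode each (reference, prediction) pair as a single index, then count.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def overall_accuracy(cm):
    """OA: correctly labeled pixels divided by all pixels."""
    return np.trace(cm) / cm.sum()


def mean_iou(cm):
    """mIoU: per-class intersection over union, averaged over classes that occur."""
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    valid = union > 0  # skip classes absent from both reference and prediction
    return (tp[valid] / union[valid]).mean()


# Toy example (hypothetical labels, 3 classes).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
oa = overall_accuracy(cm)
miou = mean_iou(cm)
```

In this toy case four of six pixels are correct, and the per-class IoU values are 1/3, 2/3, and 1/2. Because mIoU averages over classes rather than pixels, it penalizes errors on rare classes more heavily than OA does, which is one reason the two metrics can improve by very different margins.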


Pages: 21

