Swin-MSP: A Shifted Windows Masked Spectral Pretraining Model for Hyperspectral Image Classification

Cited by: 1
Authors
Tian, Rui [1]
Liu, Danqing [2]
Bai, Yu [3]
Jin, Yu [1]
Wan, Guanliang [1]
Guo, Yanhui [4]
Affiliations
[1] Qinghai Normal Univ, Coll Comp, Xining 810008, Peoples R China
[2] Chengdu Univ Technol, Coll Comp Sci & Cyber Secur, Chengdu 610059, Peoples R China
[3] Calif State Univ Fullerton, Sch Engn & Comp Sci, Fullerton, CA 90831 USA
[4] Shandong Womens Univ, Sch Data & Comp Sci, Jinan 250300, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Keywords
Hyperspectral imaging; Task analysis; Feature extraction; Image classification; Computer architecture; Computational modeling; Long short term memory; Hyperspectral image (HSI) classification; pretraining model; Swin-MAE; transformer; ATTENTION TRANSFORMER; CNN; NETWORKS;
DOI
10.1109/TGRS.2024.3431517
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification codes
0708; 070902
Abstract
Deep learning has found widespread application in hyperspectral image (HSI) classification, where transformer architectures based on self-attention have emerged as the state of the art (SOTA). The Swin-MAE framework uses a masked autoencoder approach with a shifted windows transformer as its backbone, demonstrating strong representational power and performance. This study proposes a shifted windows masked spectral pretraining (Swin-MSP) model, which achieves hierarchical modeling of hyperspectral data from local to global scales by introducing spectral masking pretraining techniques and a hierarchical architecture. To fit this pretraining scheme, we introduce the uniaxial continuous cross correlation layer (UC3L), a straightforward yet effective solution tailored for hyperspectral imagery masking. We also design the shift frequency band transformer (SFBT) to characterize spectral features hierarchically. Experiments on publicly available datasets show that our pretrained network significantly improves classification efficiency compared with SOTA networks. Furthermore, we systematically investigate the sensitivity of various datasets to pretraining hyperparameters. The results indicate that the universal spectral representation acquired during the pretraining phase serves as a robust initialization for subsequent task-specific fine-tuning. This work departs from traditional vision transformer (ViT) approaches, offering a new perspective on hyperspectral dataset pretraining. The code is available at https://github.com/teaRRe/Swin-MSP.
Pages: 14