Light Self-Gaussian-Attention Vision Transformer for Hyperspectral Image Classification

Cited by: 31
Authors
Ma, Chao [1 ,2 ]
Wan, Minjie [1 ,2 ]
Wu, Jian [3 ]
Kong, Xiaofang [4 ]
Shao, Ajun [1 ,2 ]
Wang, Fan [1 ,2 ]
Chen, Qian [1 ,2 ]
Gu, Guohua [1 ,2 ]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Elect & Opt Engn, Nanjing 210094, Peoples R China
[2] Nanjing Univ Sci & Technol, Jiangsu Key Lab Spectral Imaging & Intelligent Sen, Nanjing 210094, Peoples R China
[3] Southeast Univ, Sch Comp Sci & Engn, Nanjing 211189, Peoples R China
[4] Nanjing Univ Sci & Technol, Natl Key Lab Transient Phys, Nanjing 210094, Peoples R China
Keywords
Feature extraction; Transformers; Principal component analysis; Computational modeling; Task analysis; Data mining; Correlation; Gaussian position module; hybrid spatial-spectral tokenizer; hyperspectral image (HSI) classification; light self-Gaussian attention (LSGA); location-aware long-distance modeling; NETWORK;
DOI
10.1109/TIM.2023.3279922
CLC classification
TM [Electrical technology]; TN [Electronics and communication technology];
Subject classification codes
0808 ; 0809 ;
Abstract
In recent years, convolutional neural networks (CNNs) have been widely used in hyperspectral image (HSI) classification because of their exceptional performance in local feature extraction. However, due to the local connectivity and weight-sharing properties of the convolution kernel, CNNs have limitations in long-distance modeling, and deeper networks tend to increase computational costs. To address these issues, this article proposes a vision Transformer (VIT) based on the light self-Gaussian-attention (LSGA) mechanism, which extracts global deep semantic features. First, the hybrid spatial-spectral tokenizer module extracts shallow spatial-spectral features and expands image patches to generate tokens. Next, the light self-attention uses Q (query), X (original input), and X instead of Q, K (key), and V (value) to reduce the computation and parameters. Furthermore, to avoid the lack of location information causing the aliasing of central and neighborhood features, we devise a Gaussian absolute position bias to simulate the HSI data distribution and make the attention weights closer to the central query block. Several experiments verify the effectiveness of the proposed method, which outperforms state-of-the-art methods on four datasets. Specifically, we observed a 0.62% accuracy improvement over A2S2K and a 0.11% improvement over SSFTT. In conclusion, the proposed LSGA-VIT method demonstrates promising results in HSI classification and shows potential in addressing the issues of location-aware long-distance modeling and computational cost. Our codes are available at https://github.com/machao132/LSGA-VIT.
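The two core ideas in the abstract — replacing the key and value projections with the raw input, and adding a Gaussian position bias to the attention logits — can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names, the single-head layout, and the 1-D distance-based parametrization of the Gaussian bias are all hypothetical, and the paper's exact formulation (see the linked repository) may differ.

```python
import numpy as np

def gaussian_position_bias(n, sigma=1.0):
    # Hypothetical bias: penalize attention logits by the squared
    # distance between token positions, so weights concentrate near
    # the (central) query token, as the abstract describes.
    idx = np.arange(n)
    sq_dist = (idx[:, None] - idx[None, :]) ** 2
    return -sq_dist / (2.0 * sigma ** 2)

def light_self_gaussian_attention(X, Wq, sigma=1.0):
    # Light self-attention: only the query projection Q = X @ Wq is
    # learned; the raw input X stands in for both K and V, removing
    # two projection matrices' worth of parameters and computation.
    n, d = X.shape
    Q = X @ Wq
    logits = (Q @ X.T) / np.sqrt(d) + gaussian_position_bias(n, sigma)
    # Numerically stable softmax over each row of attention logits.
    w = np.exp(logits - logits.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ X  # aggregated output, same shape as X
```

Compared with standard attention(Q, K, V), this attention(Q, X, X) form trades some expressiveness for a lighter parameter budget, which matches the paper's motivation of cutting computational cost.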
Pages: 12