Light Self-Gaussian-Attention Vision Transformer for Hyperspectral Image Classification

Cited by: 31
Authors
Ma, Chao [1 ,2 ]
Wan, Minjie [1 ,2 ]
Wu, Jian [3 ]
Kong, Xiaofang [4 ]
Shao, Ajun [1 ,2 ]
Wang, Fan [1 ,2 ]
Chen, Qian [1 ,2 ]
Gu, Guohua [1 ,2 ]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Elect & Opt Engn, Nanjing 210094, Peoples R China
[2] Nanjing Univ Sci & Technol, Jiangsu Key Lab Spectral Imaging & Intelligent Sen, Nanjing 210094, Peoples R China
[3] Southeast Univ, Sch Comp Sci & Engn, Nanjing 211189, Peoples R China
[4] Nanjing Univ Sci & Technol, Natl Key Lab Transient Phys, Nanjing 210094, Peoples R China
Keywords
Feature extraction; Transformers; Principal component analysis; Computational modeling; Task analysis; Data mining; Correlation; Gaussian position module; hybrid spatial-spectral tokenizer; hyperspectral image (HSI) classification; light self-Gaussian attention (LSGA); location-aware long-distance modeling; NETWORK;
DOI
10.1109/TIM.2023.3279922
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Subject Classification Codes
0808; 0809
Abstract
In recent years, convolutional neural networks (CNNs) have been widely used in hyperspectral image (HSI) classification because of their exceptional performance in local feature extraction. However, owing to the local connectivity and weight-sharing properties of the convolution kernel, CNNs are limited in long-distance modeling, and deeper networks tend to increase the computational cost. To address these issues, this article proposes a vision Transformer (VIT) based on the light self-Gaussian-attention (LSGA) mechanism, which extracts global deep semantic features. First, the hybrid spatial-spectral tokenizer module extracts shallow spatial-spectral features and expands image patches to generate tokens. Next, the light self-attention uses Q (query), X (original input), and X in place of Q, K (key), and V (value) to reduce computation and parameters. Furthermore, to prevent the lack of location information from causing aliasing between central and neighborhood features, we devise a Gaussian absolute position bias that simulates the HSI data distribution and makes the attention weights concentrate on the central query block. Several experiments verify the effectiveness of the proposed method, which outperforms state-of-the-art methods on four datasets; specifically, we observed a 0.62% accuracy improvement over A2S2K and a 0.11% improvement over SSFTT. In conclusion, the proposed LSGA-VIT method demonstrates promising results in HSI classification and shows potential for addressing location-aware long-distance modeling and computational cost. Our codes are available at https://github.com/machao132/LSGA-VIT.
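The abstract describes two key ingredients: a light self-attention that projects only the query and reuses the raw input X in place of K and V, and a Gaussian absolute position bias that pulls attention toward the central query block. The sketch below is a minimal PyTorch illustration based only on that description; the module name LightSelfGaussianAttention, the single-head formulation, the sigma parameter, and the exact form of the Gaussian bias are illustrative assumptions and are not taken from the authors' released code (see the GitHub link above for the actual implementation).

```python
# Minimal sketch (not the authors' implementation): light self-attention that
# projects only the query Q and reuses the raw input X in place of K and V,
# plus a Gaussian absolute position bias favoring tokens near the patch center.
import math
import torch
import torch.nn as nn


class LightSelfGaussianAttention(nn.Module):  # hypothetical name
    def __init__(self, dim, num_tokens, sigma=1.0):
        super().__init__()
        self.scale = dim ** -0.5
        self.to_q = nn.Linear(dim, dim, bias=False)  # only Q is projected
        self.proj = nn.Linear(dim, dim)

        # Assumed Gaussian absolute position bias: squared distance of each
        # token from the patch center, with tokens on a sqrt(N) x sqrt(N) grid.
        side = int(math.isqrt(num_tokens))
        ys, xs = torch.meshgrid(
            torch.arange(side), torch.arange(side), indexing="ij"
        )
        center = (side - 1) / 2.0
        dist2 = (ys - center) ** 2 + (xs - center) ** 2
        bias = torch.exp(-dist2.flatten() / (2 * sigma ** 2))  # shape (N,)
        self.register_buffer("pos_bias", bias)

    def forward(self, x):                    # x: (B, N, dim)
        q = self.to_q(x)                     # single projection instead of Q, K, V
        attn = (q @ x.transpose(-2, -1)) * self.scale  # K replaced by X
        attn = attn + self.pos_bias          # bias broadcast over the key axis
        attn = attn.softmax(dim=-1)
        out = attn @ x                       # V replaced by X
        return self.proj(out)


# Usage: 81 tokens (a 9 x 9 spatial patch) with 64-dim spectral-spatial features.
tokens = torch.randn(2, 81, 64)
lsga = LightSelfGaussianAttention(dim=64, num_tokens=81)
print(lsga(tokens).shape)  # torch.Size([2, 81, 64])
```

Dropping the K and V projections removes two of the three linear layers of standard self-attention, which is the source of the parameter and computation savings the abstract claims.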
Pages: 12
Related Papers
50 records in total
  • [31] Densely Connected Multiscale Attention Network for Hyperspectral Image Classification. Gao, Hongmin; Miao, Yawen; Cao, Xueying; Li, Chenming. IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2021, 14: 2563-2576
  • [32] Hyperspectral Image Classification Based on Graph Transformer Network and Graph Attention Mechanism. Zhao, Xiaofeng; Niu, Jiahui; Liu, Chuntong; Ding, Yao; Hong, Danfeng. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [33] Compact Band Weighting Module Based on Attention-Driven for Hyperspectral Image Classification. Zhao, Lin; Yi, Jiawen; Li, Xi; Hu, Wenjing; Wu, Jianhui; Zhang, Guoyun. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2021, 59 (11): 9540-9552
  • [34] PSFormer: Pyramid Superpixel Transformer for Hyperspectral Image Classification. Zou, Jiaqi; He, Wei; Zhang, Hongyan. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [35] Deep Hierarchical Vision Transformer for Hyperspectral and LiDAR Data Classification. Xue, Zhixiang; Tan, Xiong; Yu, Xuchu; Liu, Bing; Yu, Anzhu; Zhang, Pengqiang. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31: 3095-3110
  • [36] Large Kernel Spectral and Spatial Attention Networks for Hyperspectral Image Classification. Sun, Genyun; Pan, Zhaojie; Zhang, Aizhu; Jia, Xiuping; Ren, Jinchang; Fu, Hang; Yan, Kai. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [37] Feedback Attention-Based Dense CNN for Hyperspectral Image Classification. Yu, Chunyan; Han, Rui; Song, Meiping; Liu, Caiyu; Chang, Chein-I. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [38] Vision Transformer-Based Ensemble Learning for Hyperspectral Image Classification. Liu, Jun; Guo, Haoran; He, Yile; Li, Huali. REMOTE SENSING, 2023, 15 (21)
  • [39] Bridging CNN and Transformer With Cross-Attention Fusion Network for Hyperspectral Image Classification. Xu, Fulin; Mei, Shaohui; Zhang, Ge; Wang, Nan; Du, Qian. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [40] MSTNet: A Multilevel Spectral-Spatial Transformer Network for Hyperspectral Image Classification. Yu, Haoyang; Xu, Zhen; Zheng, Ke; Hong, Danfeng; Yang, Hao; Song, Meiping. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60