A Spatial-Spectral Transformer for Hyperspectral Image Classification Based on Global Dependencies of Multi-Scale Features

Cited by: 21
Authors
Ma, Yunxuan [1]
Lan, Yan [1]
Xie, Yakun [2]
Yu, Lanxin [3]
Chen, Chen [1]
Wu, Yusong [1]
Dai, Xiaoai [1]
Affiliations
[1] Chengdu Univ Technol, Coll Earth Sci, Chengdu 610059, Peoples R China
[2] Southwest Jiaotong Univ, Fac Geosci & Environm Engn, Chengdu 610097, Peoples R China
[3] East China Normal Univ, Sch Stat, Shanghai 200062, Peoples R China
Keywords
attention mechanism; convolutional neural networks; hyperspectral image classification; hybrid network; transformer; QUALITY; MACHINE; NETWORK
DOI
10.3390/rs16020404
CLC number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Vision transformers (ViTs) are increasingly used for hyperspectral image (HSI) classification due to their outstanding performance. However, ViTs struggle to capture global dependencies among objects of varying sizes and fail to effectively exploit the spatial-spectral information inherent in HSI. To address these limitations, we propose the multi-scale spatial-spectral transformer (MSST). Within the MSST framework, we introduce a spatial-spectral token generator (SSTG) and a token fusion self-attention (TFSA) module. Serving as the feature extractor for the MSST, the SSTG incorporates a dual-branch multi-dimensional convolutional structure that extracts semantic features carrying spatial-spectral information from HSI and then tokenizes them. TFSA is a multi-head attention module that can encode attention to features across various scales. We integrate TFSA with cross-covariance attention (CCA) to construct the transformer encoder (TE) for the MSST. By using this TE to perform attention modeling on the tokens produced by the SSTG, the network models global dependencies among multi-scale features in the data while making effective use of the spatial-spectral information in HSI. Finally, the output of the TE is fed into a linear mapping layer to obtain the classification results. Experiments conducted on three popular public datasets demonstrate that the MSST achieves higher classification accuracy than state-of-the-art (SOTA) methods.
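The abstract names cross-covariance attention (CCA) as one component of the transformer encoder but does not spell out its form. As a rough illustration only, the following is a minimal single-head NumPy sketch of the commonly described cross-covariance formulation, in which attention is computed between feature channels (a d x d map) rather than between tokens. The function name, the projection matrices `w_q`, `w_k`, `w_v`, and the temperature `tau` are assumptions made for this sketch; it is not the MSST authors' implementation, which may differ in heads, normalization, and how it is combined with TFSA.

```python
import numpy as np


def cross_covariance_attention(x, w_q, w_k, w_v, tau=1.0):
    """Single-head cross-covariance attention (illustrative sketch).

    x:   (n_tokens, d) token matrix
    w_*: (d, d) projection matrices (hypothetical, randomly initialized here)
    tau: temperature scaling the channel-to-channel logits (assumed parameter)
    Returns the attended tokens (n_tokens, d) and the (d, d) attention map.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v

    # L2-normalize every channel (column) across the token axis.
    eps = 1e-8
    q = q / (np.linalg.norm(q, axis=0, keepdims=True) + eps)
    k = k / (np.linalg.norm(k, axis=0, keepdims=True) + eps)

    # Channel-by-channel logits: the map is (d, d), independent of n_tokens.
    logits = (k.T @ q) / tau

    # Numerically stable softmax over axis 0, so each column sums to 1.
    logits = logits - logits.max(axis=0, keepdims=True)
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=0, keepdims=True)

    # Each output channel is a weighted mix of the value channels.
    return v @ attn, attn


rng = np.random.default_rng(0)
n_tokens, d = 49, 16  # e.g. a 7x7 patch flattened into 49 tokens (assumed sizes)
x = rng.standard_normal((n_tokens, d))
w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

out, attn = cross_covariance_attention(x, w_q, w_k, w_v, tau=0.5)
print(out.shape, attn.shape)
```

Because the attention map is d x d, its size does not grow with the number of tokens, which is one reason this style of attention is often paired with token-wise attention in hybrid encoders.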
Pages: 20