A Spatial-Spectral Transformer for Hyperspectral Image Classification Based on Global Dependencies of Multi-Scale Features

Cited by: 21
Authors
Ma, Yunxuan [1]
Lan, Yan [1]
Xie, Yakun [2]
Yu, Lanxin [3]
Chen, Chen [1]
Wu, Yusong [1]
Dai, Xiaoai [1]
Affiliations
[1] Chengdu Univ Technol, Coll Earth Sci, Chengdu 610059, Peoples R China
[2] Southwest Jiaotong Univ, Fac Geosci & Environm Engn, Chengdu 610097, Peoples R China
[3] East China Normal Univ, Sch Stat, Shanghai 200062, Peoples R China
Keywords
attention mechanism; convolutional neural networks; hyperspectral image classification; hybrid network; transformer; QUALITY; MACHINE; NETWORK
DOI
10.3390/rs16020404
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Vision transformers (ViTs) are increasingly used for hyperspectral image (HSI) classification because of their strong performance. However, ViTs struggle to capture global dependencies among objects of varying sizes and do not fully exploit the spatial-spectral information inherent in HSI. To address these limitations, we propose the multi-scale spatial-spectral transformer (MSST). Within the MSST framework, we introduce a spatial-spectral token generator (SSTG) and a token fusion self-attention (TFSA) module. The SSTG serves as the feature extractor of the MSST: its dual-branch multi-dimensional convolutional structure extracts semantic features that carry spatial-spectral information from the HSI and then tokenizes them. TFSA is a multi-head attention module that can encode attention over features at multiple scales. We integrate TFSA with cross-covariance attention (CCA) to construct the transformer encoder (TE) of the MSST. By applying this TE to the tokens produced by the SSTG, the network models global dependencies among multi-scale features in the data while making effective use of the spatial-spectral information in HSI. Finally, the output of the TE is fed into a linear mapping layer to obtain the classification results. Experiments on three widely used public datasets show that MSST achieves higher classification accuracy than state-of-the-art (SOTA) methods.
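The abstract names cross-covariance attention (CCA) as one building block of the transformer encoder but does not spell out how it works. As a rough illustration only, the sketch below shows the general idea of cross-covariance attention in a single head: attention weights are computed between feature channels rather than between tokens, so the attention map has size d x d regardless of the number of tokens. Everything here is an assumption for illustration and not the authors' implementation: the function names are hypothetical, the learned query/key/value projections are omitted (the tokens are used directly as Q, K and V), and `tau` is a fixed temperature rather than a learned one.

```python
import math


def l2_normalize_rows(matrix):
    """Scale each row to unit L2 norm (all-zero rows are left unchanged)."""
    normalized = []
    for row in matrix:
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        normalized.append([x / norm for x in row])
    return normalized


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_covariance_attention(tokens, tau=1.0):
    """Single-head channel attention over `tokens` (N tokens x d channels).

    Assumption: Q = K = V = tokens, i.e. no learned projections.
    Returns (output with shape N x d, attention map with shape d x d).
    """
    n_tokens, d = len(tokens), len(tokens[0])
    # Work channel-major: one row per channel, one column per token.
    channels = [[tokens[n][c] for n in range(n_tokens)] for c in range(d)]
    q = l2_normalize_rows(channels)
    k = l2_normalize_rows(channels)
    v = channels
    # Channel-to-channel similarity (d x d), softmax over the last axis.
    attn = [
        softmax([sum(qi[n] * kj[n] for n in range(n_tokens)) / tau for kj in k])
        for qi in q
    ]
    # Each output channel is a weighted mix of the value channels.
    mixed = [
        [sum(attn[i][j] * v[j][n] for j in range(d)) for n in range(n_tokens)]
        for i in range(d)
    ]
    # Back to token-major layout (N x d).
    output = [[mixed[c][n] for c in range(d)] for n in range(n_tokens)]
    return output, attn
```

Because the attention map is d x d, its cost grows linearly with the number of tokens, which is one reason channel attention of this kind is often paired with token-level attention in hybrid encoders.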
Pages: 20