Spatial-Spectral Transformer With Cross-Attention for Hyperspectral Image Classification

Cited by: 50
Authors
Peng, Yishu [1]
Zhang, Yuwen [1]
Tu, Bing [1,2]
Li, Qianming [1]
Li, Wujing [1]
Affiliations
[1] Hunan Inst Sci & Technol, Sch Informat Sci & Engn, Yueyang 414000, Peoples R China
[2] Guilin Univ Elect Technol, Guangxi Key Lab Cryptog & Informat Secur, Guilin 541000, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2022 / Vol. 60
Funding
National Natural Science Foundation of China
Keywords
Convolutional neural network (CNN); cross-attention; hyperspectral image (HSI) classification; local spatial features; long sequence data; transformer; GRAPH CONVOLUTIONAL NETWORKS;
DOI
10.1109/TGRS.2022.3203476
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification code
0708; 070902
Abstract
Convolutional neural networks (CNNs) have been widely used in hyperspectral image (HSI) classification because of their excellent local spatial feature extraction capabilities. However, because CNNs have difficulty establishing dependencies across long data sequences, they are limited in processing hyperspectral spectral sequence features. To overcome these limitations, a spatial-spectral transformer with cross-attention (CASST) method, inspired by the Transformer model, is proposed. The method consists of a dual-branch structure, i.e., spatial and spectral sequence branches. The former captures fine-grained spatial information of the HSI, and the latter extracts the spectral features and establishes interdependencies between spectral sequences. Specifically, to enhance the consistency among features and relieve the computational burden, we design a spatial-spectral cross-attention module with weighted sharing to extract the interactive spatial-spectral fusion feature within each Transformer block, and we develop a spatial-spectral weighted sharing mechanism to capture the robust semantic feature across Transformer blocks. Performance evaluation experiments on three hyperspectral classification datasets demonstrate that CASST achieves better accuracy than state-of-the-art Transformer classification models and mainstream classification networks.
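The cross-attention idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' CASST implementation: the token counts, the embedding size, the single attention head, and the choice to reuse one set of projection matrices in both directions (a simple stand-in for the sharing mechanism the abstract mentions) are all assumptions made for illustration. The function name `cross_attention` is hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """Scaled dot-product attention where queries come from one branch
    and keys/values come from the other branch."""
    q = query_tokens @ w_q
    k = context_tokens @ w_k
    v = context_tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each query row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
dim = 8  # assumed embedding size

# Assumed toy inputs: 9 spatial tokens (e.g. a 3x3 patch) and 20 spectral tokens.
spatial_tokens = rng.normal(size=(9, dim))
spectral_tokens = rng.normal(size=(20, dim))

# One set of projections reused in both directions (illustrative sharing only).
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))

# Spatial tokens attend to spectral tokens, and vice versa.
spatial_out, spatial_attn = cross_attention(spatial_tokens, spectral_tokens, w_q, w_k, w_v)
spectral_out, spectral_attn = cross_attention(spectral_tokens, spatial_tokens, w_q, w_k, w_v)

print(spatial_out.shape, spectral_out.shape)
```

Each branch keeps its own token count while drawing its values from the other branch, which is the sense in which the fused features are "interactive". Reusing the projections roughly halves the parameter count of this toy module; whether the paper shares weights in this way is not stated in the abstract.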
Pages: 15