Remote Sensing Image Classification Method Based on Fusion of CNN and Transformer

Cited by: 4
Authors
Jin Chuan [1]
Tong Changqing [1]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Sci, Hangzhou 310018, Zhejiang, Peoples R China
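Keywords
image classification; convolutional neural network; Transformer; spatial location information; attention mechanism
DOI
10.3788/LOP223154
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract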
High-resolution remote sensing images are difficult to classify because they have large intraclass differences and small interclass differences. To address this problem, a hybrid structure that combines the advantages of convolutional neural networks and the Transformer in deep learning is proposed herein. For the features extracted from the convolutional layer, feature clustering is carried out for each channel along the horizontal and vertical directions using two attention mechanisms with spatial location information. This reduces the redundant mapping of remote sensing scene features and enables the network to extract more information relevant to the task object. The captured feature maps are then encoded using the Transformer encoder structure, which allocates greater weights to the regions of interest in the feature maps. The experimental results show that, compared with existing deep learning-based remote sensing image classification methods, the proposed method reduces the number of model parameters and increases the classification accuracy, achieving the highest average classification accuracies of 98.95%, 96.00%, and 95.01% on the AID, NWPU-RESISC45, and VGoogle remote sensing image classification datasets, respectively.
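The direction-wise attention step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give layer details, so the sketch assumes simple average pooling along each direction followed by a sigmoid gate, and it omits any learned convolutions or the Transformer encoder. The function name `directional_attention` and the nested-list tensor layout are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def directional_attention(feature_map):
    """Reweight each channel using row-wise and column-wise pooled statistics.

    feature_map: list of C channels, each an H x W nested list of floats.
    Assumption: gates come directly from pooled means; a real network
    would pass them through learned layers first.
    """
    out = []
    for channel in feature_map:
        h = len(channel)
        w = len(channel[0])
        # Pool along the horizontal direction: one mean per row (length H).
        row_means = [sum(row) / w for row in channel]
        # Pool along the vertical direction: one mean per column (length W).
        col_means = [sum(channel[i][j] for i in range(h)) / h for j in range(w)]
        row_gate = [sigmoid(m) for m in row_means]
        col_gate = [sigmoid(m) for m in col_means]
        # Each position is scaled by the gate of its row and of its column,
        # so the weight depends on the spatial location of the feature.
        out.append(
            [[channel[i][j] * row_gate[i] * col_gate[j] for j in range(w)]
             for i in range(h)]
        )
    return out


# One channel, 2 x 2 feature map.
features = [[[1.0, 2.0], [3.0, 4.0]]]
weighted = directional_attention(features)
```

The output has the same shape as the input, and because both gates lie in (0, 1), every value is scaled down by an amount that depends on its row and column.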
Pages: 10