PSCLI-TF: Position-Sensitive Cross-Layer Interactive Transformer Model for Remote Sensing Image Scene Classification

Citations: 4
Authors
Li, Daxiang [1]
Liu, Runyuan [1]
Tang, Yao [1]
Liu, Ying [1]
Affiliations
[1] Xian Univ Posts & Telecommun, Sch Telecommun & Informat Engn, Xian 710121, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Scene classification; Feature extraction; Cross layer design; Transformers; Semantics; Prototypes; Remote sensing; Position-sensitive transformer; remote sensing image (RSI) classification; self-supervised learning; FEATURES;
DOI
10.1109/LGRS.2024.3359415
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry]
Subject classification codes
0708; 070902
Abstract
In the scene classification task of remote sensing image (RSI), to fully perceive multiscale local objects in the image and explore their interdependencies to mine the scene semantics of RSI, this letter designs a novel position-sensitive cross-layer interactive transformer (PSCLI-TF) model to improve the accuracy of RSI scene classification. First, ResNet50 is used as the backbone to extract the multilayer feature maps of RSI. Then, to enhance the model's position sensitivity to local objects in RSI, a new position-sensitive cross-layer interactive attention (PSCLIA) mechanism is designed, and based on it a novel PSCLI-TF encoder is constructed to perform layer-by-layer interactive fusion on the multilayer feature maps to obtain the multigranularity cross-layer fusion (CLF) feature of RSI. Finally, a prototype-based self-supervised loss function (SELF) is constructed to alleviate the semantic gap problem of "large intraclass variance and small interclass variance" in RSI scene classification. Comparative experimental results based on three datasets (i.e., AID, NWPU, and UCM) indicate that the classification performance of the designed PSCLI-TF model is highly competitive compared with other state-of-the-art methods.
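The cross-layer interaction idea in the abstract can be illustrated with a deliberately simplified sketch. This is not the PSCLIA mechanism from the letter: it omits the position-sensitive term, learned projections, multiple heads, and the ResNet50 backbone. It only assumes that tokens from a deeper feature map act as queries and attend over tokens from a shallower feature map, with a residual connection. The function name `cross_layer_attention` and the toy token values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_layer_attention(deep_tokens, shallow_tokens):
    """Fuse two feature layers with plain scaled dot-product cross-attention.

    Assumption (not from the paper): deep-layer tokens are the queries and
    shallow-layer tokens serve as both keys and values. There are no learned
    weights and no positional terms in this sketch.
    """
    dim = len(deep_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in deep_tokens:
        # Attention weights of this deep token over every shallow token.
        weights = softmax([dot(query, key) * scale for key in shallow_tokens])
        # Weighted sum of shallow tokens (the cross-layer context vector).
        context = [
            sum(w * value[j] for w, value in zip(weights, shallow_tokens))
            for j in range(dim)
        ]
        # Residual connection keeps the original deep-layer information.
        fused.append([q + c for q, c in zip(query, context)])
    return fused


# Toy example: one deep token, two shallow tokens, feature dimension 2.
deep = [[1.0, 0.0]]
shallow = [[1.0, 0.0], [0.0, 1.0]]
fused = cross_layer_attention(deep, shallow)
```

In this toy case the deep token is more similar to the first shallow token, so the fused vector draws more of its context from that token than from the second one.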
Pages: 1 / 5
Page count: 5
Related papers
23 items in total
  • [1] Vision Transformer With Contrastive Learning for Remote Sensing Image Scene Classification
    Bi, Meiqiao
    Wang, Minghua
    Li, Zhi
    Hong, Danfeng
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2023, 16 : 738 - 749
  • [2] Remote Sensing Scene Classification via Multi-Branch Local Attention Network
    Chen, Si-Bao
    Wei, Qing-Song
    Wang, Wen-Zhong
    Tang, Jin
    Luo, Bin
    Wang, Zu-Yuan
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 99 - 109
  • [3] Dosovitskiy A., 2021, P INT C LEAR REPR, P1
  • [4] Gupta S. C., 2010, Int. J. Comput. Sci. Netw. Secur., V10, P96
  • [5] Deep Residual Learning for Image Recognition
    He, Kaiming
    Zhang, Xiangyu
    Ren, Shaoqing
    Sun, Jian
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 770 - 778
  • [6] Contextual Spatial-Channel Attention Network for Remote Sensing Scene Classification
    Hou, Yan-e
    Yang, Kang
    Dang, Lanxue
    Liu, Yang
    [J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
  • [7] Distinctive image features from scale-invariant keypoints
    Lowe, DG
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2004, 60 (02) : 91 - 110
  • [8] SCViT: A Spatial-Channel Feature Preserving Vision Transformer for Remote Sensing Image Scene Classification
    Lv, Pengyuan
    Wu, Wenjun
    Zhong, Yanfei
    Du, Fang
    Zhang, Liangpei
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [9] Remote Sensing Scene Classification Based on Attention-Enabled Progressively Searching
    Shen, Junge
    Cao, Bin
    Zhang, Chi
    Wang, Ruxin
    Wang, Qi
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [10] Remote Sensing Scene Classification Based on Multibranch Fusion Attention Network
    Shi, Jiacheng
    Liu, Wei
    Shan, Haoyu
    Li, Erzhu
    Li, Xing
    Zhang, Lianpeng
    [J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20