PSCLI-TF: Position-Sensitive Cross-Layer Interactive Transformer Model for Remote Sensing Image Scene Classification

Cited by: 4

Authors:

Li, Daxiang [1]

Liu, Runyuan [1]

Tang, Yao [1]

Liu, Ying [1]

Affiliations:

[1] Xian Univ Posts & Telecommun, Sch Telecommun & Informat Engn, Xian 710121, Peoples R China

Source:

IEEE GEOSCIENCE AND REMOTE SENSING LETTERS | 2024 / Vol. 21

Funding:

National Natural Science Foundation of China;

Keywords:

Scene classification; Feature extraction; Cross layer design; Transformers; Semantics; Prototypes; Remote sensing; Position-sensitive transformer; remote sensing image (RSI) classification; self-supervised learning; FEATURES;

DOI:

10.1109/LGRS.2024.3359415

Chinese Library Classification (CLC) number:

P3 [Geophysics]; P59 [Geochemistry];

Discipline classification code:

0708 ; 070902 ;

Abstract:

In the scene classification task of remote sensing image (RSI), to fully perceive multiscale local objects in the image and explore their interdependencies to mine the scene semantics of RSI, this letter designs a novel position-sensitive cross-layer interactive transformer (PSCLI-TF) model to improve the accuracy of RSI scene classification. First, ResNet50 is used as the backbone to extract the multilayer feature maps of RSI. Then, to enhance the model's position sensitivity to local objects in RSI, a new position-sensitive cross-layer interactive attention (PSCLIA) mechanism is designed, and based on it a novel PSCLI-TF encoder is constructed to perform layer-by-layer interactive fusion on the multilayer feature maps to obtain the multigranularity cross-layer fusion (CLF) feature of RSI. Finally, a prototype-based self-supervised loss function (SELF) is constructed to alleviate the semantic gap problem of "large intraclass variance and small interclass variance" in RSI scene classification. Comparative experimental results based on three datasets (i.e., AID, NWPU, and UCM) indicate that the classification performance of the designed PSCLI-TF model is highly competitive compared with other state-of-the-art methods.
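
The abstract does not give the equations of the PSCLIA mechanism, so the following is only a generic illustration of the cross-layer idea it describes: tokens from a deeper feature map act as queries and attend over tokens from a shallower feature map. It is a minimal sketch of plain scaled dot-product cross-attention, not the authors' method. The function names `softmax` and `cross_layer_attention` are hypothetical, and the sketch assumes both layers have already been projected to the same channel width. It omits learned projections, multiple heads, and the position-sensitive component.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_layer_attention(deep_tokens, shallow_tokens):
    """Fuse two feature layers with scaled dot-product cross-attention.

    deep_tokens:    list of n_d vectors of length d (used as queries)
    shallow_tokens: list of n_s vectors of length d (used as keys and values)
    Returns n_d fused vectors of length d (attended values plus a residual).

    Assumption: no learned weights and no position encoding; this is a
    generic illustration, not the PSCLIA mechanism from the letter.
    """
    d = len(deep_tokens[0])
    scale = 1.0 / math.sqrt(d)
    fused = []
    for q in deep_tokens:
        # Similarity of this deep token to every shallow token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k))
                  for k in shallow_tokens]
        weights = softmax(scores)
        # Weighted sum of shallow tokens (values).
        attended = [sum(w * v[j] for w, v in zip(weights, shallow_tokens))
                    for j in range(d)]
        # Residual connection keeps the deep-layer information.
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused


if __name__ == "__main__":
    deep = [[1.0, 0.0], [0.0, 1.0]]                  # 2 coarse tokens
    shallow = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 fine tokens
    for vec in cross_layer_attention(deep, shallow):
        print([round(x, 3) for x in vec])
```

In a layer-by-layer fusion scheme of the kind the abstract mentions, a step like this would be applied repeatedly across adjacent backbone stages; how the letter actually orders and parameterizes those steps is not stated here.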
