Remote Sensing Scene Classification via Second-Order Differentiable Token Transformer Network

Cited by: 0
Authors
Ni, Kang [1, 2, 3]
Wu, Qianqian [1]
Li, Sichan [4]
Zheng, Zhizhong [1, 2]
Wang, Peng [3]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Sch Comp Sci, Nanjing 210023, Peoples R China
[2] Jiangsu Prov Engn Res Ctr Airborne Detecting & Int, Nanjing 210049, Peoples R China
[3] Nanjing Univ Aeronaut & Astronaut, Key Lab Radar Imaging & Microwave Photon, Minist Educ, Nanjing 211106, Peoples R China
[4] Nanjing Univ Posts & Telecommun, Coll Internet Things, Nanjing 210023, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Funding
National Natural Science Foundation of China;
Keywords
Transformers; Remote sensing; Image coding; Merging; Representation learning; Visualization; Vectors; Classification token; learnable token; remote sensing; scene classification; vision transformer;
DOI
10.1109/TGRS.2024.3407879
Chinese Library Classification number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
The vision transformer has been widely applied to remote sensing image scene classification because of its excellent ability to capture global features. However, remote sensing scene images involve challenges such as scene complexity and small interclass differences, and directly using all global tokens of the transformer for feature learning may increase computational complexity. Constructing a distinguishable transformer network that adaptively selects tokens can therefore improve the classification performance on remote sensing scene images while keeping computational complexity in check. Based on this, a second-order differentiable token transformer network (SDT2Net) is proposed to account for the efficacy of distinguishable statistical features and nonredundant learnable tokens of remote sensing scene images. A novel transformer block, comprising an efficient attention block (EAB) and a differentiable token compression (DTC) mechanism, is inserted into SDT2Net to acquire selectable token features for each scene image, guided by sparse shift local features and a token compression rate learning scheme. Furthermore, a fast token fusion (FTF) module is developed to acquire more distinguishable token feature representations. This module uses the fast global covariance pooling algorithm to acquire high-order visual tokens and validates the effectiveness of classification tokens and high-order visual tokens for scene classification. Compared with other recent methods, SDT2Net achieves state-of-the-art performance with a comparable number of floating-point operations (FLOPs). The code will be available at https://github.com/RSIP-NJUPT/SDT2Net.
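The abstract does not spell out how the FTF module computes its high-order visual tokens. As a rough illustration of the general idea of global covariance pooling over tokens, the following NumPy sketch computes a token covariance matrix and approximates its matrix square root with a coupled Newton-Schulz iteration. The choice of Newton-Schulz, the function names, the iteration count, and the upper-triangle flattening are all assumptions made for this example and are not taken from the SDT2Net code.

```python
import numpy as np


def covariance_pooling(tokens):
    """Second-order statistics of a token set.

    tokens: array of shape (n_tokens, dim).
    Returns the (dim, dim) sample covariance across tokens.
    """
    n_tokens = tokens.shape[0]
    centered = tokens - tokens.mean(axis=0, keepdims=True)
    return centered.T @ centered / n_tokens


def newton_schulz_sqrt(cov, num_iters=5, eps=1e-12):
    """Approximate the square root of a symmetric positive-definite matrix.

    Uses the coupled Newton-Schulz iteration (an assumption for this
    sketch). The matrix is first divided by its trace so the iteration
    converges, and the scale is restored afterwards.
    """
    dim = cov.shape[0]
    identity = np.eye(dim)
    trace = np.trace(cov) + eps
    y = cov / trace          # converges towards sqrt(cov / trace)
    z = identity.copy()      # converges towards the inverse square root
    for _ in range(num_iters):
        t = 0.5 * (3.0 * identity - z @ y)
        y = y @ t
        z = t @ z
    return y * np.sqrt(trace)


def second_order_token(tokens, num_iters=5):
    """Flatten the upper triangle of the normalized covariance into a vector."""
    root = newton_schulz_sqrt(covariance_pooling(tokens), num_iters)
    rows, cols = np.triu_indices(root.shape[0])
    return root[rows, cols]
```

For tokens of dimension d, the resulting vector has d(d+1)/2 entries; how such a vector would be fused with the classification token is left open here, since the abstract gives no detail on that step.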
Pages: 1-15
Number of pages: 15