Remote Sensing Scene Classification via Second-Order Differentiable Token Transformer Network

Cited by: 0
Authors
Ni, Kang [1, 2, 3]
Wu, Qianqian [1]
Li, Sichan [4]
Zheng, Zhizhong [1, 2]
Wang, Peng [3]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Sch Comp Sci, Nanjing 210023, Peoples R China
[2] Jiangsu Prov Engn Res Ctr Airborne Detecting & Int, Nanjing 210049, Peoples R China
[3] Nanjing Univ Aeronaut & Astronaut, Key Lab Radar Imaging & Microwave Photon, Minist Educ, Nanjing 211106, Peoples R China
[4] Nanjing Univ Posts & Telecommun, Coll Internet Things, Nanjing 210023, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Funding
National Natural Science Foundation of China;
Keywords
Transformers; Remote sensing; Image coding; Merging; Representation learning; Visualization; Vectors; Classification token; learnable token; remote sensing; scene classification; vision transformer;
DOI
10.1109/TGRS.2024.3407879
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708; 070902;
Abstract
The vision transformer has been widely applied in remote sensing image scene classification due to its excellent ability to capture global features. However, remote sensing scene images involve challenges such as scene complexity and small interclass differences. Directly utilizing the global tokens of the transformer for feature learning may increase computational complexity. Therefore, constructing a distinguishable transformer network that adaptively selects tokens can effectively improve the classification performance of remote sensing scene images while considering computational complexity. Based on this, a second-order differentiable token transformer network (SDT2Net) is proposed to exploit the distinguishable statistical features and nonredundant learnable tokens of remote sensing scene images. A novel transformer block, including an efficient attention block (EAB) and a differentiable token compression (DTC) mechanism, is inserted into SDT2Net to acquire selectable token features of each scene image, guided by sparse shift local features and a learned token compression rate. Furthermore, a fast token fusion (FTF) module is developed to acquire more distinguishable token feature representations. This module utilizes the fast global covariance pooling algorithm to acquire high-order visual tokens and validates the effectiveness of classification tokens and high-order visual tokens for scene classification. Compared with other recent methods, SDT2Net achieves state-of-the-art performance with comparable floating-point operations (FLOPs). The code will be available at https://github.com/RSIP-NJUPT/SDT2Net.
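This record does not specify how the fast global covariance pooling step is computed. The sketch below is an illustration only, not the SDT2Net implementation: it assumes the step resembles the common approach of taking the covariance of the token features and normalizing it with a matrix square root approximated by coupled Newton-Schulz iterations. The function names, the token shape (64 tokens of dimension 8), and the iteration count are all hypothetical.

```python
import numpy as np


def covariance_pool(tokens: np.ndarray) -> np.ndarray:
    """Second-order pooling: covariance of token features.

    tokens has shape (n_tokens, dim); the result has shape (dim, dim).
    """
    n_tokens = tokens.shape[0]
    centered = tokens - tokens.mean(axis=0, keepdims=True)
    return centered.T @ centered / n_tokens


def newton_schulz_sqrt(cov: np.ndarray, num_iters: int = 20,
                       eps: float = 1e-5) -> np.ndarray:
    """Approximate the square root of a symmetric positive definite matrix.

    Uses the coupled Newton-Schulz iteration, which needs only matrix
    products (no eigendecomposition). The iteration count is an assumption.
    """
    dim = cov.shape[0]
    identity = np.eye(dim)
    cov = cov + eps * identity      # keep the matrix strictly positive definite
    trace = np.trace(cov)
    y = cov / trace                 # pre-normalize so the iteration converges
    z = identity.copy()
    for _ in range(num_iters):
        t = 0.5 * (3.0 * identity - z @ y)
        y = y @ t                   # y converges to sqrt(cov / trace)
        z = t @ z                   # z converges to the inverse of y
    return y * np.sqrt(trace)       # undo the pre-normalization


# Hypothetical usage: 64 visual tokens with 8 channels each.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((64, 8))
second_order = newton_schulz_sqrt(covariance_pool(tokens))
# The matrix is symmetric, so its upper triangle carries all the information
# and can serve as a single flattened high-order token.
high_order_token = second_order[np.triu_indices(8)]
```

In a network, such a flattened vector would typically be fused with the classification token before the classifier; how the FTF module actually combines the two is not described in this record.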
Pages: 1-15
Page count: 15