MPT-SFANet: Multiorder Pooling Transformer-Based Semantic Feature Aggregation Network for SAR Image Classification

Cited by: 0
Authors
Ni, Kang [1, 2, 3]
Yuan, Chunyang [4]
Zheng, Zhizhong [3, 5]
Zhang, Bingbing [6]
Wang, Peng [7]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Sch Comp Sci, Nanjing, Peoples R China
[2] Minist Educ, Nanjing, Peoples R China
[3] Jiangsu Prov Engn Res Ctr Airborne Detecting & In, Nanjing, Peoples R China
[4] Nanjing Univ Posts & Telecommun, Comp Sci & Technol, Nanjing, Peoples R China
[5] Nanjing Univ Posts & Telecommun, Nanjing, Peoples R China
[6] Dalian Minzu Univ, Sch Comp & Engn, Dalian, Peoples R China
[7] Nanjing Univ Aeronaut & Astronaut, Minist Educ, Nanjing, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Transformers; Radar polarimetry; Synthetic aperture radar; Semantics; Feature extraction; Telecommunications; Land surface; Feature learning; image classification; semantic feature; synthetic aperture radar (SAR); transformer-based method; COVARIANCE;
DOI
10.1109/TAES.2024.3382622
Chinese Library Classification number
V [Aeronautics and Astronautics]
Discipline classification code
08; 0825
Abstract
Transformer-based methods have made remarkable advances in synthetic aperture radar (SAR) image classification. Nevertheless, many of these methods ignore the global statistical information and semantic feature interaction needed to characterize different SAR land covers with complex structures. Second-order statistics offer an effective way to characterize the statistical features of SAR patches. Motivated by this, we integrate pyramid pooling and global covariance pooling into each multihead self-attention block, forming a multiorder pooling transformer module that extracts powerful contextual features together with the global statistical nature of SAR patches. In parallel, a semantic feature aggregation module captures local deep features and models the interaction of feature information across feature levels. Both modules are embedded into a U-shaped architecture, which we refer to as the multiorder pooling transformer-based semantic feature aggregation network (MPT-SFANet). Extensive experimental results on TerraSAR, Sentinel-1B, and GF-3 SAR image classification datasets indicate that MPT-SFANet outperforms several relevant methods.
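As a rough illustration of the second-order statistics mentioned in the abstract, the sketch below computes a plain global covariance descriptor from a set of token features. It is not the authors' module: the function name `global_covariance_pooling`, the `(N, C)` input layout, and the normalization by `N` are assumptions made for this example, and the pyramid pooling and attention parts are omitted.

```python
import numpy as np


def global_covariance_pooling(features: np.ndarray) -> np.ndarray:
    """Return the (C, C) covariance of N feature vectors with C channels.

    Hypothetical helper for illustration only; `features` has shape (N, C).
    """
    n_tokens = features.shape[0]
    # Center each channel so the result captures second-order statistics only.
    centered = features - features.mean(axis=0, keepdims=True)
    # Channel-by-channel covariance, normalized by the number of tokens.
    return centered.T @ centered / n_tokens


# Usage: 64 tokens (e.g. an 8 x 8 patch grid) with 8 channels each.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((64, 8))
descriptor = global_covariance_pooling(tokens)
print(descriptor.shape)
```

The resulting matrix is symmetric and summarizes how channels co-vary over the whole patch, which is the kind of global statistic that first-order (average or max) pooling discards.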
Pages: 4923-4938
Number of pages: 16