Deep Symmetric Fusion Transformer for Multimodal Remote Sensing Data Classification

Cited by: 3
Authors
Chang, Honghao [1]
Bi, Haixia [1]
Li, Fan [1]
Xu, Chen [2, 3]
Chanussot, Jocelyn [4]
Hong, Danfeng [5, 6]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Informat & Commun Engn, Xian 710049, Peoples R China
[2] Peng Cheng Lab, Dept Math & Fundamental Res, Shenzhen 518055, Peoples R China
[3] Xi An Jiao Tong Univ, Sch Math & Stat, Xian 710049, Peoples R China
[4] Univ Grenoble Alpes, CNRS, INRIA, Grenoble INP LJK, F-38000 Grenoble, France
[5] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100049, Peoples R China
[6] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 100049, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Keywords
Land-cover classification; local-global mixture (LGM); multimodal feature fusion; remote sensing; symmetric fusion transformer (SFT); LAND-COVER CLASSIFICATION; LIDAR DATA;
DOI
10.1109/TGRS.2024.3476975
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
In recent years, multimodal remote sensing data classification (MMRSC) has evoked growing attention due to its more comprehensive and accurate delineation of Earth's surface compared to its single-modal counterpart. However, it remains challenging to capture and integrate local and global features from single-modal data. Moreover, how to fully excavate and exploit the interactions between different modalities is still an intricate issue. To this end, we propose a novel dual-branch transformer-based framework named deep symmetric fusion transformer (DSymFuser). Within the framework, each branch contains a stack of local-global mixture (LGM) blocks, to extract hierarchical and discriminative single-modal features. In each LGM block, a local-global feature mixer with learnable weights is specifically devised to adaptively aggregate the local and global features extracted with a convolutional neural network (CNN)-transformer network. Furthermore, we innovatively design a symmetric fusion transformer (SFT) that trails behind each LGM block. The elaborately designed SFT symmetrically facilitates cross-modal correlation excavation, comprehensively exploiting the complementary cues underlying heterogeneous modalities. The hierarchical construction of the LGM and SFT blocks enables feature extraction and fusion in a multilevel manner, further promoting the completeness and descriptiveness of the learned features. We conducted extensive ablation studies and comparative experiments on three benchmark datasets, and the experimental results validated the effectiveness and superiority of the proposed method. The source code of the proposed method will be available publicly at https://github.com/HaixiaBi1982/DSymFuser.
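The two mechanisms named in the abstract, a learnable-weight local-global mixer and a symmetric cross-modal fusion step, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the softmax-normalized mixing weights, the projection-free attention, and the random toy inputs are all assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mix_local_global(local_feat, global_feat, logits):
    # Hypothetical mixer: `logits` stands in for two learnable scalars;
    # softmax turns them into weights that sum to 1.
    w = softmax(logits)
    return w[0] * local_feat + w[1] * global_feat

def cross_attention(query_tokens, context_tokens):
    # Scaled dot-product attention without learned projections (a simplification).
    d = query_tokens.shape[-1]
    scores = query_tokens @ context_tokens.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ context_tokens

def symmetric_fusion(tokens_a, tokens_b):
    # Each modality queries the other with the same operation, plus a residual.
    fused_a = tokens_a + cross_attention(tokens_a, tokens_b)
    fused_b = tokens_b + cross_attention(tokens_b, tokens_a)
    return fused_a, fused_b

# Toy inputs: 16 tokens of dimension 8 per modality (random placeholders, not real data).
rng = np.random.default_rng(0)
hsi_local, hsi_global = rng.normal(size=(2, 16, 8))
lidar_local, lidar_global = rng.normal(size=(2, 16, 8))

# Zero logits give equal weights; training would adjust them.
mixed_hsi = mix_local_global(hsi_local, hsi_global, np.zeros(2))
mixed_lidar = mix_local_global(lidar_local, lidar_global, np.zeros(2))
fused_hsi, fused_lidar = symmetric_fusion(mixed_hsi, mixed_lidar)
```

In this sketch, swapping the two modalities simply swaps the two outputs, which is the sense in which the fusion step is symmetric here; the published SFT block may differ in its details.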
Pages: 15