MCKTNet: Multiscale Cross-Modal Knowledge Transfer Network for Semantic Segmentation of Remote Sensing Images

Cited by: 0
Authors
Cui, Jian [1]
Liu, Jiahang [1]
Ni, Yue [1]
Sun, Yuan [2]
Guo, Mao [3]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Astronaut, Nanjing 210016, Jiangsu, Peoples R China
[2] Beijing Univ Posts & Telecommun, Coll Elect Engn, Beijing 100876, Peoples R China
[3] Southern Marine Sci & Engn Guangdong Lab Guangzhou, Guangzhou 511458, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2025 / Vol. 63
Keywords
Remote sensing; Feature extraction; Semantic segmentation; Semantics; Image edge detection; Data mining; Knowledge transfer; Accuracy; Vegetation mapping; Soft sensors; Multimodal; remote sensing; semantic segmentation; transfer learning;
DOI
10.1109/TGRS.2025.3547442
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
Multimodal data fusion can provide valuable and diverse information for remote sensing image segmentation. However, data from different modalities follow different feature distributions, which causes conflicts and redundancies in cross-modal feature fusion. In addition, existing multimodal fusion networks usually adopt a two-branch structure with a large number of parameters and high computational cost. To address these problems, we propose a multiscale cross-modal knowledge transfer network (MCKTNet) for remote sensing image segmentation. First, we use a cross-modal migration learning method that combines channel discrete loss and spatial discrete loss to facilitate the cross-modal migration of geometric and semantic features and to reduce redundancy by minimizing the differences in feature distribution. Then, we use a polarized cross-self-attention mechanism to establish long-range correlations between features of different modalities across the spatial and channel dimensions; it achieves complementary cross-modal feature fusion while adding only a small number of parameters. Finally, to accurately capture object edges, we propose a multiscale edge perception module that optimizes edge details in the prediction results at the pixel level. Extensive experiments demonstrate that the proposed method is effective, robust, and generalizable, and that it achieves state-of-the-art performance in multiple remote sensing image semantic segmentation tasks with only 13.51 million parameters. The code will be available at https://github.com/NUAALISILab/MCKTNet.
Pages: 15