CMSE: Cross-Modal Semantic Enhancement Network for Classification of Hyperspectral and LiDAR Data

Cited by: 11
Authors
Han, Wenqi [1]
Miao, Wang [1]
Geng, Jie [1]
Jiang, Wen [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Elect & Informat, Xian 710129, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Keywords
Semantics; Laser radar; Feature extraction; Hyperspectral imaging; Land surface; Data models; Data mining; Classification; land cover; multimodal; remote sensing (RS); semantic features; IMAGE CLASSIFICATION; NEURAL-NETWORK; FUSION;
DOI
10.1109/TGRS.2024.3368509
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
The fusion of hyperspectral image (HSI) and light detection and ranging (LiDAR) data is widely used for land cover classification. However, due to different imaging mechanisms, HSI and LiDAR data always present significant image differences, and the dimensions and feature distributions of HSI and LiDAR are highly dissimilar. This makes it challenging to represent and correlate semantic information from multimodal data. Current methods for classifying pixel-by-pixel features, which rely on cascaded or attention-based fusion, cannot effectively use multimodal features. To achieve accurate classification results, extracting and fusing similar high-order semantic information and complementary discriminative information contained in multimodal data is vital. In this article, we propose a cross-modal semantic enhancement network (CMSE) for multimodal semantic information mining and fusion. Our proposed CMSE framework extracts features from the image on multiple scales, capturing more representative local sparse features with different sizes of convolution kernels. To represent high-level semantic features related to land cover, we establish a Gaussian-weighted matrix and semantically transform the spatial and spectral features of distinct branches. Finally, we build a multilevel residual fusion module to incrementally fuse spectral features from HSI and elevation features from LiDAR. Additionally, we introduce a cross-modal semantically constrained loss to guide multimodal semantic feature alignment. We evaluate our approach on three multimodal remote sensing (RS) datasets, namely the Houston2013, Trento, and MUUFL datasets. The experimental results demonstrate that our proposed CMSE model achieves superior performance in terms of accuracy and robustness compared to other related deep networks.
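The abstract names a Gaussian-weighted matrix and a cross-modal semantically constrained loss but does not give their definitions. The following is only a minimal illustrative sketch of those two ideas, not the authors' implementation. It assumes that the Gaussian weights are centred on the patch centre, that HSI and LiDAR features have already been projected to a shared embedding dimension, and that alignment is measured with cosine distance. The names `gaussian_weight_matrix`, `weighted_pool`, and `cosine_alignment_loss` are hypothetical.

```python
import math


def gaussian_weight_matrix(size, sigma=1.0):
    """Build a size x size matrix of Gaussian weights centred on the patch
    centre and normalised to sum to 1 (assumed form, not from the paper)."""
    centre = (size - 1) / 2.0
    raw = [
        [
            math.exp(-((r - centre) ** 2 + (c - centre) ** 2) / (2.0 * sigma ** 2))
            for c in range(size)
        ]
        for r in range(size)
    ]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


def weighted_pool(patch, weights):
    """Collapse a patch of feature vectors (patch[r][c] is a list of floats)
    into one vector by a weighted sum over spatial positions."""
    dim = len(patch[0][0])
    pooled = [0.0] * dim
    for r, row in enumerate(patch):
        for c, vec in enumerate(row):
            w = weights[r][c]
            for k in range(dim):
                pooled[k] += w * vec[k]
    return pooled


def cosine_alignment_loss(a, b):
    """Return 1 - cosine similarity: near 0 when the two modality embeddings
    point the same way, 1 when they are orthogonal (assumed loss form)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b + 1e-12)


if __name__ == "__main__":
    weights = gaussian_weight_matrix(3, sigma=1.0)
    # Toy 3x3 patches with 2-dimensional features for each modality.
    hsi_patch = [[[1.0, 0.5] for _ in range(3)] for _ in range(3)]
    lidar_patch = [[[0.8, 0.6] for _ in range(3)] for _ in range(3)]
    hsi_vec = weighted_pool(hsi_patch, weights)
    lidar_vec = weighted_pool(lidar_patch, weights)
    print(cosine_alignment_loss(hsi_vec, lidar_vec))
```

In this sketch the centre pixel receives the largest weight, so the pooled vector favours the pixel being classified over its neighbours. The loss term only illustrates how two modality embeddings could be pulled toward agreement.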
Pages: 1 / 14
Number of pages: 14