CMSE: Cross-Modal Semantic Enhancement Network for Classification of Hyperspectral and LiDAR Data

Citations: 11

Authors:

Han, Wenqi [1]

Miao, Wang [1]

Geng, Jie [1]

Jiang, Wen [1]

Affiliation:

[1] Northwestern Polytech Univ, Sch Elect & Informat, Xian 710129, Peoples R China

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62

Keywords:

Semantics; Laser radar; Feature extraction; Hyperspectral imaging; Land surface; Data models; Data mining; Classification; land cover; multimodal; remote sensing (RS); semantic features; IMAGE CLASSIFICATION; NEURAL-NETWORK; FUSION;

DOI:

10.1109/TGRS.2024.3368509

Chinese Library Classification (CLC) codes:

P3 [Geophysics]; P59 [Geochemistry]

Discipline classification codes:

0708; 070902

Abstract:

The fusion of hyperspectral image (HSI) and light detection and ranging (LiDAR) data is widely used for land cover classification. However, due to different imaging mechanisms, HSI and LiDAR data always present significant image differences, and the dimensions and feature distributions of HSI and LiDAR are highly dissimilar. This makes it challenging to represent and correlate semantic information from multimodal data. Current methods for classifying pixel-by-pixel features, which rely on cascaded or attention-based fusion, cannot effectively use multimodal features. To achieve accurate classification results, extracting and fusing similar high-order semantic information and complementary discriminative information contained in multimodal data is vital. In this article, we propose a cross-modal semantic enhancement network (CMSE) for multimodal semantic information mining and fusion. Our proposed CMSE framework extracts features from the image on multiple scales, capturing more representative local sparse features with different sizes of convolution kernels. To represent high-level semantic features related to land cover, we establish a Gaussian-weighted matrix and semantically transform the spatial and spectral features of distinct branches. Finally, we build a multilevel residual fusion module to incrementally fuse spectral features from HSI and elevation features from LiDAR. Additionally, we introduce a cross-modal semantically constrained loss to guide multimodal semantic feature alignment. We evaluate our approach on three multimodal remote sensing (RS) datasets, namely the Houston2013, Trento, and MUUFL datasets. The experimental results demonstrate that our proposed CMSE model achieves superior performance in terms of accuracy and robustness compared to other related deep networks.
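As a rough illustration of the Gaussian weighting idea mentioned in the abstract, the sketch below builds a Gaussian weight matrix centred on the middle pixel of a patch and uses it to pool a co-registered HSI patch and LiDAR patch into feature vectors. This is not the CMSE implementation: the function names, the 7x7 patch size, the 144-band count, `sigma`, and the final concatenation are all assumptions chosen for illustration, and the abstract does not specify how the authors construct or apply their matrix.

```python
import numpy as np

def gaussian_weight_matrix(size, sigma=1.0):
    """Return a (size, size) matrix of Gaussian weights centred on the
    middle pixel and normalised to sum to 1 (hypothetical helper)."""
    coords = np.arange(size) - (size - 1) / 2.0
    yy, xx = np.meshgrid(coords, coords, indexing="ij")
    weights = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return weights / weights.sum()

def weighted_pool(patch, weights):
    """Collapse the two spatial axes of an (H, W, C) patch into a (C,)
    vector using the given (H, W) weights (hypothetical helper)."""
    return np.tensordot(weights, patch, axes=([0, 1], [0, 1]))

# Random stand-ins for co-registered patches; sizes are assumptions.
rng = np.random.default_rng(0)
hsi_patch = rng.random((7, 7, 144))   # spectral bands per pixel
lidar_patch = rng.random((7, 7, 1))   # one elevation value per pixel

w = gaussian_weight_matrix(7, sigma=2.0)
hsi_vec = weighted_pool(hsi_patch, w)      # shape (144,)
lidar_vec = weighted_pool(lidar_patch, w)  # shape (1,)

# Plain concatenation as a placeholder for fusion; the paper describes
# a multilevel residual fusion module instead.
fused = np.concatenate([hsi_vec, lidar_vec])
```

Pixels near the patch centre contribute most to the pooled vector, which is one common way to emphasise the pixel being classified while still using its spatial neighbourhood.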

Number of pages: 14
