Spatial-Gated Multilayer Perceptron for Land Use and Land Cover Mapping

Cited by: 11

Authors:

Jamali, Ali [1]

Roy, Swalpa Kumar [2]

Hong, Danfeng [3,4]

Atkinson, Peter M. [5]

Ghamisi, Pedram [6,7]

Affiliations:

[1] Simon Fraser Univ, Dept Geog, Burnaby, BC V5A 1S6, Canada

[2] Alipurduar Govt Engn & Management Coll, Dept Comp Sci & Engn, Chhipra 736206, W Bengal, India

[3] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China

[4] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 100049, Peoples R China

[5] Univ Lancaster, Fac Sci & Technol, Lancaster LA1 4YW, England

[6] Helmholtz Zentrum Dresden Rossendorf HZDR, Helmholtz Inst Freiberg Resource Technol, D-09599 Freiberg, Germany

[7] Inst Adv Res Artificial Intelligence IARAI, A-1030 Vienna, Austria

Source:

IEEE GEOSCIENCE AND REMOTE SENSING LETTERS | 2024 / Vol. 21

Keywords:

Feature extraction; Classification algorithms; Hyperspectral imaging; Data models; Transformers; Biological system modeling; Training data; Attention mechanism; image classification; spatial gating unit (SGU); vision transformers (ViTs); CLASSIFICATION;

DOI:

10.1109/LGRS.2024.3354175

Chinese Library Classification (CLC) number:

P3 [Geophysics]; P59 [Geochemistry];

Discipline classification code:

0708 ; 070902 ;

Abstract:

Owing to their capacity to capture fine spectral differences, hyperspectral (HS) data have been used extensively for precise land use and land cover (LULC) mapping. However, recent multimodal methods have shown superior classification performance over algorithms that use a single dataset. Convolutional neural networks (CNNs) are widely used for the hierarchical extraction of features. Vision transformers (ViTs), through a self-attention mechanism, have recently achieved superior modeling of global contextual information compared to CNNs. However, to harness their image classification strength, ViTs require substantial training datasets. Where the available training data are limited, current advanced multilayer perceptrons (MLPs) can provide viable alternatives to both deep CNNs and ViTs. In this letter, we developed the SGU-MLP, a deep-learning algorithm that combines MLPs and spatial gating units (SGUs) for precise LULC mapping using multimodal multispectral, LiDAR, and HS data. The SGU-MLP classification model consistently outperformed several benchmark CNN- and CNN-ViT-based models, including HybridSN, ResNet, iFormer, EfficientFormer, and CoAtNet. The code will be made publicly available at https://github.com/aj1365/SGUMLP.

References

Pages: 5

15 entries in total

[1] Dai Z, 2021, ADV NEUR IN, V34

[2] Hyperspectral and LiDAR Data Fusion: Outcome of the 2013 GRSS Data Fusion Contest [J].
Debes, Christian; Merentitis, Andreas; Heremans, Roel; Hahn, Juergen; Frangiadakis, Nikolaos; van Kasteren, Tim; Liao, Wenzhi; Bellens, Rik; Pizurica, Aleksandra; Gautama, Sidharta; Philips, Wilfried; Prasad, Saurabh; Du, Qian; Pacifici, Fabio.
IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2014, 7 (06): 2405-2418

[3] Deep Residual Learning for Image Recognition [J].
He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian.
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 770-778

[4] Multimodal remote sensing benchmark datasets for land cover classification with a shared and specific feature learning model [J].
Hong, Danfeng; Hu, Jingliang; Yao, Jing; Chanussot, Jocelyn; Zhu, Xiao Xiang.
ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2021, 178: 68-80

[5] Deep Learning for Hyperspectral Image Classification: An Overview [J].
Li, Shutao; Song, Weiwei; Fang, Leyuan; Chen, Yushi; Ghamisi, Pedram; Benediktsson, Jon Atli.
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2019, 57 (09): 6690-6709

[6] Li Yanyu, 2022, Efficientformer: Vision transformers at mobilenet speed

[7] Liu Hong, 2021, ADV NEURAL INFORM PR, V34

[8] Okujeni A., 2016, BERLIN URBAN GRADIEN, DOI 10.2312/enmap.2016.002

[9] Multimodal Fusion Transformer for Remote Sensing Image Classification [J].
Roy, Swalpa Kumar; Deria, Ankur; Hong, Danfeng; Rasti, Behnood; Plaza, Antonio; Chanussot, Jocelyn.
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61

[10] HybridSN: Exploring 3-D-2-D CNN Feature Hierarchy for Hyperspectral Image Classification [J].
Roy, Swalpa Kumar; Krishna, Gopal; Dubey, Shiv Ram; Chaudhuri, Bidyut B.
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2020, 17 (02): 277-281
