A Vision Transformer Model for Convolution-Free Multilabel Classification of Satellite Imagery in Deforestation Monitoring

Cited by: 65

Authors:

Kaselimi, Maria [1]

Voulodimos, Athanasios [2]

Daskalopoulos, Ioannis [3]

Doulamis, Nikolaos [1]

Doulamis, Anastasios [1]

Affiliations:

[1] Natl Tech Univ Athens, Sch Rural & Surveying Engn, Athens 15773, Greece

[2] Natl Tech Univ Athens, Sch Elect & Comp Engn, Athens 15773, Greece

[3] Univ West Attica, Dept Informat & Comp Engn, Athens 15773, Greece

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2023, Vol. 34, No. 7

Keywords:

Forestry; Transformers; Satellites; Remote sensing; Monitoring; Earth; Artificial satellites; Deforestation; multilabel image classification; self-attention; vision transformers;

DOI:

10.1109/TNNLS.2022.3144791

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Understanding the dynamics of deforestation and land uses of neighboring areas is of vital importance for the design and development of appropriate forest conservation and management policies. In this article, we approach deforestation as a multilabel classification (MLC) problem in an endeavor to capture the various relevant land uses from satellite images. To this end, we propose a multilabel vision transformer model, ForestViT, which leverages the benefits of the self-attention mechanism, obviating any convolution operations involved in commonly used deep learning models utilized for deforestation detection. Experimental evaluation on open satellite imagery datasets yields promising results in the case of MLC, particularly for imbalanced classes, and indicates ForestViT's superiority compared with well-established convolutional structures (ResNet, VGG, DenseNet, and MobileNet neural networks). This superiority is more evident for minority classes.
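
To make the multilabel framing in the abstract concrete, the sketch below shows how a multilabel output differs from a single-label one: each label gets its own independent sigmoid probability, so one image can carry several land-use tags at once. This is an illustrative sketch only, not the ForestViT implementation. The label names, the logits, and the 0.5 threshold are assumptions chosen for the example, and the per-label binary cross-entropy is a common multilabel loss rather than one the abstract confirms.

```python
import math

# Hypothetical land-use labels; the abstract does not list the actual classes.
LABELS = ["primary_forest", "agriculture", "road", "water"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Return every label whose independent probability meets the threshold.

    Unlike softmax, the probabilities are not forced to sum to 1,
    so zero, one, or several labels can be selected for one image.
    """
    probs = [sigmoid(z) for z in logits]
    return [name for name, p in zip(LABELS, probs) if p >= threshold]


def binary_cross_entropy(logits, targets):
    """Mean per-label binary cross-entropy (assumed loss, for illustration)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total += -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
    return total / len(logits)


# Made-up logits standing in for the output of a classifier head.
logits = [2.0, 0.3, -1.5, -0.2]
predicted = predict_labels(logits)
loss = binary_cross_entropy(logits, [1, 1, 0, 0])
```

With these made-up logits, two labels clear the threshold at the same time, which is the behavior a single-label softmax classifier cannot express.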

Pages: 3299-3307

Page count: 9

26 items in total

[21] IMPROVING HYPERSPECTRAL DATA CLASSIFICATION OF SATELLITE IMAGERY BY USING A SPARSE BASED NEW MODEL WITH LEARNING DICTIONARY
Zhang, Chunmei
Hao, Xiaoting
Bai, Jing
Dai, Mo
2014 6TH WORKSHOP ON HYPERSPECTRAL IMAGE AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING (WHISPERS), 2014,
[22] ECGConVT: A Hybrid CNN and Vision Transformer Model for Enhanced 12-Lead ECG Images Classification
Khalid, Mudassar
Pluempitiwiriyawej, Charnchai
Abdulkadhem, Abdulkadhem A.
Afzal, Imran
Truong, Tien
IEEE ACCESS, 2024, 12 : 193043 - 193056
[23] Road Traffic Classification from Nighttime Videos Using the Multihead Self-Attention Vision Transformer Model and the SVM
Khalladi, Sofiane Abdelkrim
Ouessaia, Asmaa
Keche, Mokhtar
AUTOMATIC CONTROL AND COMPUTER SCIENCES, 2024, 58 (05) : 544 - 554
[24] Efficient Road Traffic Video Congestion Classification Based on the Multi-Head Self-Attention Vision Transformer Model
Khalladi, Sofiane Abdelkrim
Ouessai, Asmaa
Benamara, Nadir Kamel
Keche, Mokhtar
TRANSPORT AND TELECOMMUNICATION JOURNAL, 2024, 25 (01) : 20 - 30
[25] Hybrid Deep Learning EfficientNetV2 and Vision Transformer (EffNetV2-ViT) Model for Breast Cancer Histopathological Image Classification
Hayat, Mansoor
Ahmad, Nouman
Nasir, Anam
Tariq, Zeeshan Ahmad
IEEE ACCESS, 2024, 12 : 184119 - 184131
[26] Free satellite imagery and digital elevation model analyses enabling natural resource management in the developing world: Case studies from Eastern Indonesia
Fisher, Rohan Peter
Hobgen, Sarah Elizabeth
Haleberek, Kristianus
Sula, Nelson
Mandaya, Iradaf
SINGAPORE JOURNAL OF TROPICAL GEOGRAPHY, 2018, 39 (01) : 45 - 61