A comprehensive evaluation of deep vision transformers for road extraction from very-high-resolution satellite data

Times cited: 0

Authors:

Bolcek, Jan [1,2]

Gibril, Mohamed Barakat A. [1]

Al-Ruzouq, Rami [1]

Shanableh, Abdallah [1,3]

Jena, Ratiranjan [1]

Hammouri, Nezar [1]

Sachit, Mourtadha Sarhan [4]

Ghorbanzadeh, Omid [5]

Affiliations:

[1] Univ Sharjah, Res Inst Sci & Engn, GIS & Remote Sensing Ctr, Sharjah 27272, U Arab Emirates

[2] Brno Univ Technol, Fac Elect Engn & Commun, Dept Radio Elect, Brno Kralovo Pole 61600, Czech Republic

[3] Australian Univ, Sci Res Ctr, Kuwait, Kuwait

[4] Univ Thi Qar, Coll Engn, Dept Civil Engn, Nasiriyah 64001, Thi Qar, Iraq

[5] Univ Nat Resources & Life Sci, Inst Geomat, Peter Jordan Str 82, A-1190 Vienna, Austria

Source:

SCIENCE OF REMOTE SENSING | 2025 / Vol. 11

Keywords:

Remote sensing; Road extraction; Satellite data; Semantic segmentation; Vision Transformers; REMOTE-SENSING IMAGERY; NETWORK;

DOI:

10.1016/j.srs.2024.100190

Chinese Library Classification (CLC) number:

X [Environmental science, safety science];

Discipline classification code:

08 ; 0830 ;

Abstract:

Transformer-based semantic segmentation architectures excel at extracting road networks from very-high-resolution (VHR) satellite images due to their ability to capture global contextual information. Nonetheless, there is a gap in research regarding their comparative effectiveness, efficiency, and performance in extracting road networks from multicity VHR data. This study evaluates 11 transformer-based models on three publicly available datasets (DeepGlobe Road Extraction Dataset, SpaceNet-3 Road Network Detection Dataset, and Massachusetts Road Dataset) to assess their performance, efficiency, and complexity in mapping road networks from multicity, multidate, and multisensory VHR optical satellite images. The evaluated models include Unified Perceptual Parsing for Scene Understanding (UperNet) based on the Swin transformer (UperNet-SwinT) and on the Multi-path Vision Transformer (UperNet-MpViT), the Twins transformer, Segmenter, SegFormer, K-Net based on SwinT, Mask2Former based on SwinT (Mask2Former-SwinT), TopFormer, UniFormer, and PoolFormer. Results showed that the models recorded mean F-scores (mF-score) ranging from 82.22% to 90.70% for the DeepGlobe dataset, from 58.98% to 86.95% for the Massachusetts dataset, and from 69.02% to 86.14% for the SpaceNet-3 dataset. Mask2Former-SwinT, UperNet-MpViT, and SegFormer were the top performers among the evaluated models. Mask2Former-SwinT demonstrated a strong balance of high performance across different satellite image datasets and moderate computational efficiency. This investigation aids in selecting the most suitable model for extracting road networks from remote sensing data.
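
The abstract reports results as mean F-scores but does not define the metric. A minimal sketch of one common reading follows, assuming the mF-score is the unweighted mean of the per-class F1 scores over the road and background classes; the function name `mean_f_score` and the toy labels are illustrative assumptions, not taken from the paper.

```python
def mean_f_score(pred, target, num_classes=2):
    """Unweighted mean of per-class F1 over flattened label sequences.

    Assumption: mF-score = average of the F1 of each class
    (here 0 = background, 1 = road). The paper may define it differently.
    """
    pairs = list(zip(pred, target))
    scores = []
    for c in range(num_classes):
        tp = sum(1 for p, t in pairs if p == c and t == c)
        fp = sum(1 for p, t in pairs if p == c and t != c)
        fn = sum(1 for p, t in pairs if p != c and t == c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); treated as 0 if the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Made-up flattened masks for illustration only
target = [0, 1, 1, 0, 0, 1, 0, 0]
pred = [0, 1, 0, 0, 0, 1, 1, 0]
score = mean_f_score(pred, target)
print(f"mF-score: {score:.4f}")
```

Averaging over both classes keeps the dominant background class from hiding poor road recall, which matters because road pixels are typically a small fraction of a VHR tile.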


Pages: 17

50 records in total

[21] Crisscross-Global Vision Transformers Model for Very High Resolution Aerial Image Semantic Segmentation
Deng, Guohui
Wu, Zhaocong
Xu, Miaozhong
Wang, Chengjun
Wang, Zhiye
Lu, Zhongyuan
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
[22] An Integrated Multistage Framework for Automatic Road Extraction from High Resolution Satellite Imagery
Mirnalinee, T. T.
Das, Sukhendu
Varghese, Koshy
JOURNAL OF THE INDIAN SOCIETY OF REMOTE SENSING, 2011, 39 (01) : 1 - 25
[23] A Deep Cross-Modal Fusion Network for Road Extraction With High-Resolution Imagery and LiDAR Data
Luo, Hui
Wang, Zijing
Du, Bo
Dong, Yanni
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 15
[24] Geoscene-based Vehicle Detection from Very-high-resolution Images
Shu, Mi
Du, Shihong
2016 4th International Workshop on Earth Observation and Remote Sensing Applications (EORSA), 2016
[25] IDANet: Iterative D-LinkNets with Attention for Road Extraction from High-Resolution Satellite Imagery
Xu, Benzhu
Bao, Shengshuai
Zheng, Liping
Zhang, Gaofeng
Wu, Wenming
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2021, PT II, 2021, 13020 : 140 - 152
[26] ConvNeXt-UperNet-Based Deep Learning Model for Road Extraction from High-Resolution Remote Sensing Images
Wang, Jing
Zhang, Chen
Lin, Tianwen
CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 80 (02) : 1907 - 1925
[27] Road Extraction Methods in High-Resolution Remote Sensing Images: A Comprehensive Review
Lian, Renbao
Wang, Weixing
Mustafa, Nadir
Huang, Liqin
IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2020, 13 (13) : 5489 - 5507
[28] Fully automated road network extraction from high-resolution satellite multispectral imagery
Shackelford, AK
Davis, CH
IGARSS 2003: IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, VOLS I - VII, PROCEEDINGS: LEARNING FROM EARTH'S SHAPES AND SIZES, 2003 : 461 - 463
[29] DSMSA-Net: Deep Spatial and Multi-scale Attention Network for Road Extraction in High Spatial Resolution Satellite Images
Khan, Sultan Daud
Alarabi, Louai
Basalamah, Saleh
ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2023, 48 (02) : 1907 - 1920
[30] A Local-Global Dual-Stream Network for Building Extraction From Very-High-Resolution Remote Sensing Images
Zhang, Hongyan
Liao, Yue
Yang, Honghai
Yang, Guangyi
Zhang, Liangpei
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (03) : 1269 - 1283
