A comprehensive evaluation of deep vision transformers for road extraction from very-high-resolution satellite data

Times Cited: 0
Authors
Bolcek, Jan [1 ,2 ]
Gibril, Mohamed Barakat A. [1 ]
Al-Ruzouq, Rami [1 ]
Shanableh, Abdallah [1 ,3 ]
Jena, Ratiranjan [1 ]
Hammouri, Nezar [1 ]
Sachit, Mourtadha Sarhan [4 ]
Ghorbanzadeh, Omid [5 ]
Affiliations
[1] Univ Sharjah, Res Inst Sci & Engn, GIS & Remote Sensing Ctr, Sharjah 27272, U Arab Emirates
[2] Brno Univ Technol, Fac Elect Engn & Commun, Dept Radio Elect, Brno Kralovo Pole 61600, Czech Republic
[3] Australian Univ, Sci Res Ctr, Kuwait, Kuwait
[4] Univ Thi Qar, Coll Engn, Dept Civil Engn, Nasiriyah 64001, Thi Qar, Iraq
[5] Univ Nat Resources & Life Sci, Inst Geomat, Peter Jordan Str 82, A-1190 Vienna, Austria
Source
SCIENCE OF REMOTE SENSING | 2025, Vol. 11
Keywords
Remote sensing; Road extraction; Satellite data; Semantic segmentation; Vision Transformers; REMOTE-SENSING IMAGERY; NETWORK;
DOI
10.1016/j.srs.2024.100190
CLC Number
X [Environmental Science, Safety Science]
Discipline Code
08; 0830
Abstract
Transformer-based semantic segmentation architectures excel at extracting road networks from very-high-resolution (VHR) satellite images owing to their ability to capture global contextual information. Nonetheless, there is a research gap regarding their comparative effectiveness, efficiency, and performance in extracting road networks from multicity VHR data. This study evaluates 11 transformer-based models on three publicly available datasets (the DeepGlobe Road Extraction Dataset, the SpaceNet-3 Road Network Detection Dataset, and the Massachusetts Roads Dataset) to assess their performance, efficiency, and complexity in mapping road networks from multicity, multidate, and multisensor VHR optical satellite images. The evaluated models include Unified Perceptual Parsing for Scene Understanding (UperNet) based on the Swin Transformer (UperNet-SwinT) and on the Multi-path Vision Transformer (UperNet-MpViT), the Twins transformer, Segmenter, SegFormer, K-Net based on SwinT, Mask2Former based on SwinT (Mask2Former-SwinT), TopFormer, UniFormer, and PoolFormer. The models recorded mean F-scores (mF-scores) ranging from 82.22% to 90.70% on the DeepGlobe dataset, from 58.98% to 86.95% on the Massachusetts dataset, and from 69.02% to 86.14% on the SpaceNet-3 dataset. Mask2Former-SwinT, UperNet-MpViT, and SegFormer were the top performers, with Mask2Former-SwinT demonstrating a strong balance between high performance across the different satellite image datasets and moderate computational cost. This investigation aids in selecting the most suitable model for extracting road networks from remote sensing data.
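As background for the mF-score figures reported in the abstract, the sketch below shows one way a mean F-score can be computed for a predicted road mask against a ground-truth mask. This is a minimal illustration, assuming an unweighted average over the background and road classes; the paper's exact evaluation protocol (class definitions, tiling, thresholds) is not reproduced here.

```python
import numpy as np

def f_score(pred: np.ndarray, gt: np.ndarray, cls: int, eps: float = 1e-7) -> float:
    """F1-score for one class, given integer-labelled prediction and ground-truth masks."""
    tp = np.sum((pred == cls) & (gt == cls))
    fp = np.sum((pred == cls) & (gt != cls))
    fn = np.sum((pred != cls) & (gt == cls))
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    return 2 * precision * recall / (precision + recall + eps)

def mean_f_score(pred: np.ndarray, gt: np.ndarray, classes=(0, 1)) -> float:
    """Unweighted mean of per-class F1-scores (assumed here: 0 = background, 1 = road)."""
    return float(np.mean([f_score(pred, gt, c) for c in classes]))

# Toy 4x4 example (1 = road, 0 = background).
gt = np.array([[0, 1, 1, 0],
               [0, 1, 1, 0],
               [0, 1, 1, 0],
               [0, 1, 1, 0]])
pred = np.array([[0, 1, 0, 0],
                 [0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 1, 1, 1]])
print(f"mF-score: {mean_f_score(pred, gt):.4f}")
```

In practice, per-image F-scores would be aggregated over an entire test set, but the per-class averaging shown above conveys how a single mF-score value summarizes both road and background segmentation quality.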
Pages: 17
Related Papers (50 in total)
  • [31] Khan, Sultan Daud; Alarabi, Louai; Basalamah, Saleh. DSMSA-Net: Deep Spatial and Multi-scale Attention Network for Road Extraction in High Spatial Resolution Satellite Images. Arabian Journal for Science and Engineering, 2023, 48: 1907-1920.
  • [32] Li, Yuxia; Peng, Bo; Fan, Kunlong; Yuan, Lang; Tong, Ling; He, Lei. New Neural Network and an Image Postprocessing Method for High Resolution Satellite Imagery Road Extraction. 2019 IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2019), 2019: 3935-3938.
  • [33] Cheng, Ke-Nan; Ni, Weiping; Zhang, Han; Wu, Junzheng; Xiao, Xiao; Yang, Zhigang. CE-RoadNet: A Cascaded Efficient Road Network for Road Extraction from High-Resolution Satellite Images. Remote Sensing, 2025, 17(05).
  • [34] Chen, Guanzhou; Zhang, Xiaodong; Wang, Qing; Dai, Fan; Gong, Yuanfu; Zhu, Kun. Symmetrical Dense-Shortcut Deep Fully Convolutional Networks for Semantic Segmentation of Very-High-Resolution Remote Sensing Images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2018, 11(05): 1633-1644.
  • [35] Yang, Bin; Zhao, Mengci; Xing, Ying; Zeng, Fuping; Sun, Zhaoyang. VEDAM: Urban Vegetation Extraction Based on Deep Attention Model from High-Resolution Satellite Images. Electronics, 2023, 12(05).
  • [36] Chen, Hao; Ma, Li; Liang, Tian. Research Status on Road Information Extraction from High Resolution Imagery. Manufacturing Process and Equipment, Pts 1-4, 2013, 694-697: 1970+.
  • [37] Zhang, Rui; Lin, Xiangguo. Automatic Road Extraction Based on Local Histogram and Support Vector Data Description Classifier from Very High Resolution Digital Aerial. 2010 IEEE International Geoscience and Remote Sensing Symposium, 2010: 441-444.
  • [38] Herumurti, Darlis; Uchimura, Keiichi; Koutaki, Gou; Uemura, Takumi. Urban Road Network Extraction Based on Zebra Crossing Detection From a Very High Resolution RGB Aerial Image and DSM Data. 2013 International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), 2013: 79-84.
  • [39] Cira, Calimanut-Ionut; Alcarria, Ramon; Manso-Callejo, Miguel-Angel; Serradilla, Francisco. A Deep Learning-Based Solution for Large-Scale Extraction of the Secondary Road Network from High-Resolution Aerial Orthoimagery. Applied Sciences-Basel, 2020, 10(20): 1-18.
  • [40] Liu Xu; Tao Jun; Yu Xiang; Cheng JianJie; Guo LiQian. The Rapid Method for Road Extraction from High-Resolution Satellite Images Based on USM Algorithm. Proceedings of 2012 International Conference on Image Analysis and Signal Processing, 2012: 96+.