DESAT: A Distance-Enhanced Strip Attention Transformer for Remote Sensing Image Super-Resolution

Cited by: 1

Authors:

Mao, Yujie [1,2]

He, Guojin [1,2,3,4]

Wang, Guizhou [1,2,3,4]

Yin, Ranyu [1,3,4]

Peng, Yan [1,3,4]

Guan, Bin [5]

Affiliations:

[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China

[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China

[3] Kashgar Aerosp Informat Res Inst, Kashgar 844000, Peoples R China

[4] Chinese Acad Sci, Aerosp Informat Res Inst, Key Lab Earth Observat Hainan Prov, Sanya 572029, Peoples R China

[5] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China

Source:

REMOTE SENSING | 2024 / Vol. 16 / Issue 22

Funding:

National Natural Science Foundation of China;

Keywords:

remote sensing; image super-resolution; deep learning; transformer; self-attention; Gaofen-6; satellite;

DOI:

10.3390/rs16224251

Chinese Library Classification (CLC) number:

X [Environmental Science, Safety Science];

Subject classification code:

08; 0830;

Abstract:

Transformer-based methods have demonstrated impressive performance in image super-resolution tasks. However, when applied to large-scale Earth observation images, the existing transformers encounter two significant challenges: (1) insufficient consideration of spatial correlation between adjacent ground objects; and (2) performance bottlenecks due to the underutilization of the upsample module. To address these issues, we propose a novel distance-enhanced strip attention transformer (DESAT). The DESAT integrates distance priors, easily obtainable from remote sensing images, into the strip window self-attention mechanism to capture spatial correlations more effectively. To further enhance the transfer of deep features into high-resolution outputs, we designed an attention-enhanced upsample block, which combines the pixel shuffle layer with an attention-based upsample branch implemented through the overlapping window self-attention mechanism. Additionally, to better simulate real-world scenarios, we constructed a new cross-sensor super-resolution dataset using Gaofen-6 satellite imagery. Extensive experiments on both simulated and real-world remote sensing datasets demonstrate that the DESAT outperforms state-of-the-art models by up to 1.17 dB along with superior qualitative results. Furthermore, the DESAT achieves more competitive performance in real-world tasks, effectively balancing spatial detail reconstruction and spectral transform, making it highly suitable for practical remote sensing super-resolution applications.

Pages: 27