DESAT: A Distance-Enhanced Strip Attention Transformer for Remote Sensing Image Super-Resolution

Cited by: 1
Authors
Mao, Yujie [1,2]
He, Guojin [1,2,3,4]
Wang, Guizhou [1,2,3,4]
Yin, Ranyu [1,3,4]
Peng, Yan [1,3,4]
Guan, Bin [5]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Kashgar Aerosp Informat Res Inst, Kashgar 844000, Peoples R China
[4] Chinese Acad Sci, Aerosp Informat Res Inst, Key Lab Earth Observat Hainan Prov, Sanya 572029, Peoples R China
[5] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
remote sensing; image super-resolution; deep learning; transformer; self-attention; Gaofen-6; satellite;
DOI
10.3390/rs16224251
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Transformer-based methods have demonstrated impressive performance in image super-resolution tasks. However, when applied to large-scale Earth observation images, the existing transformers encounter two significant challenges: (1) insufficient consideration of spatial correlation between adjacent ground objects; and (2) performance bottlenecks due to the underutilization of the upsample module. To address these issues, we propose a novel distance-enhanced strip attention transformer (DESAT). The DESAT integrates distance priors, easily obtainable from remote sensing images, into the strip window self-attention mechanism to capture spatial correlations more effectively. To further enhance the transfer of deep features into high-resolution outputs, we designed an attention-enhanced upsample block, which combines the pixel shuffle layer with an attention-based upsample branch implemented through the overlapping window self-attention mechanism. Additionally, to better simulate real-world scenarios, we constructed a new cross-sensor super-resolution dataset using Gaofen-6 satellite imagery. Extensive experiments on both simulated and real-world remote sensing datasets demonstrate that the DESAT outperforms state-of-the-art models by up to 1.17 dB along with superior qualitative results. Furthermore, the DESAT achieves more competitive performance in real-world tasks, effectively balancing spatial detail reconstruction and spectral transform, making it highly suitable for practical remote sensing super-resolution applications.
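The two mechanisms named in the abstract can be illustrated with a small NumPy sketch. This is not the DESAT implementation: the record gives no formulas, so the linear distance penalty, the `alpha` weight, the identity query/key/value projections, and the function names below are all assumptions chosen for illustration. The first function computes self-attention inside horizontal strips and lowers the attention logits as pixel distance grows; the second is the standard pixel shuffle rearrangement that the abstract's upsample block builds on.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def strip_attention_with_distance(feat, strip_h, alpha=0.1):
    """Self-attention inside horizontal strips with a distance penalty.

    feat:    float array of shape (H, W, D)
    strip_h: strip height in pixels; H must be divisible by it
    alpha:   hypothetical weight of the distance penalty (assumption)

    Queries, keys and values are the raw features (identity projections),
    which is a simplification made for this sketch.
    """
    h, w, d = feat.shape
    assert h % strip_h == 0, "H must be a multiple of strip_h"

    # Pairwise Euclidean distance between pixel positions in one strip.
    ys, xs = np.meshgrid(np.arange(strip_h), np.arange(w), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

    out = np.empty_like(feat)
    for top in range(0, h, strip_h):
        tokens = feat[top:top + strip_h].reshape(-1, d)   # (N, D)
        logits = tokens @ tokens.T / np.sqrt(d)            # content similarity
        logits = logits - alpha * dist                     # distant pixels count less
        out[top:top + strip_h] = (softmax(logits) @ tokens).reshape(strip_h, w, d)
    return out


def pixel_shuffle(x, r):
    """Rearrange (C*r*r, H, W) features into (C, H*r, W*r)."""
    c_rr, h, w = x.shape
    assert c_rr % (r * r) == 0, "channel count must be divisible by r*r"
    c = c_rr // (r * r)
    x = x.reshape(c, r, r, h, w)       # split channels into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)     # reorder to (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((4, 6, 8))
    attended = strip_attention_with_distance(feat, strip_h=2, alpha=0.1)
    upsampled = pixel_shuffle(rng.standard_normal((4, 4, 6)), r=2)
    print(attended.shape, upsampled.shape)
```

With a very large `alpha` each pixel attends almost only to itself, so the output approaches the input; with `alpha = 0` the function reduces to plain strip attention. How the paper actually encodes its distance prior may differ.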
Pages: 27