SWCGAN: Generative Adversarial Network Combining Swin Transformer and CNN for Remote Sensing Image Super-Resolution

Cited by: 54
Authors
Tu, Jingzhi [1 ]
Mei, Gang [1 ]
Ma, Zhengjing [1 ]
Piccialli, Francesco [2 ]
Affiliations
[1] China Univ Geosci, Sch Engn & Technol, Beijing 100083, Peoples R China
[2] Univ Naples Federico II, Dept Math & Applicat R Caccioppoli, I-80138 Naples, Italy
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Superresolution; Transformers; Remote sensing; Image reconstruction; Generative adversarial networks; Task analysis; Convolutional layers; generative adversarial network (GAN); remote sensing images; super-resolution reconstruction; Swin Transformer
DOI
10.1109/JSTARS.2022.3190322
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
Easy and efficient acquisition of high-resolution remote sensing images is important in geographic information systems. Deep neural networks composed of convolutional layers have previously achieved impressive progress in super-resolution reconstruction. However, inherent limitations of convolutional layers, including the difficulty of modeling long-range dependencies, constrain the performance of these networks on super-resolution reconstruction. To address these problems, we propose a generative adversarial network (GAN), called SWCGAN, that combines the advantages of the Swin Transformer and convolutional layers. It differs from previous super-resolution models, which are composed of pure convolutional blocks. The essential idea behind the proposed method is to generate high-resolution images with a generator network that hybridizes convolutional and Swin Transformer layers, and then to use a pure Swin Transformer discriminator network for adversarial training. In the proposed method, first, we employ a convolutional layer for shallow feature extraction that can adapt to flexible input sizes; second, we propose the residual dense Swin Transformer block to extract deep features, which are upsampled to generate high-resolution images; and third, we use a simplified Swin Transformer as the discriminator for adversarial training. To evaluate the performance of the proposed method, we compare it with other state-of-the-art methods on the UCMerced benchmark dataset and apply it to real-world remote sensing images. The results demonstrate that the reconstruction performance of the proposed method outperforms the other state-of-the-art methods on most metrics.
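The abstract describes a generator that extracts shallow features with a convolutional layer, refines them with Swin Transformer based blocks, and upsamples to the high-resolution output. The following is a minimal PyTorch sketch of that conv-plus-transformer generator layout, not the authors' code: all class and parameter names are hypothetical, and the shifted-window (Swin) attention and residual dense connections are simplified to a plain global multi-head self-attention block, so this only illustrates the overall structure.

```python
# Hypothetical sketch of a conv + transformer SR generator (not the SWCGAN implementation).
# Swin window attention is simplified here to global multi-head self-attention.
import torch
import torch.nn as nn


class SimpleTransformerBlock(nn.Module):
    """Stand-in for a Swin Transformer layer: LayerNorm -> MHSA -> MLP, with residuals."""

    def __init__(self, dim, heads=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 2 * dim), nn.GELU(), nn.Linear(2 * dim, dim))

    def forward(self, x):                                  # x: (B, N, C) token sequence
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # self-attention with residual
        return x + self.mlp(self.norm2(x))                 # feed-forward with residual


class HybridSRGenerator(nn.Module):
    """Conv shallow features -> transformer deep features -> pixel-shuffle upsampling."""

    def __init__(self, dim=64, blocks=4, scale=4):
        super().__init__()
        self.shallow = nn.Conv2d(3, dim, 3, padding=1)      # works for flexible input sizes
        self.deep = nn.ModuleList([SimpleTransformerBlock(dim) for _ in range(blocks)])
        self.fuse = nn.Conv2d(dim, dim, 3, padding=1)
        self.upsample = nn.Sequential(
            nn.Conv2d(dim, dim * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
            nn.Conv2d(dim, 3, 3, padding=1),
        )

    def forward(self, lr):                                  # lr: (B, 3, H, W) low-res image
        feat = self.shallow(lr)
        b, c, h, w = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)            # (B, H*W, C) tokens
        for blk in self.deep:
            tokens = blk(tokens)
        deep = tokens.transpose(1, 2).reshape(b, c, h, w)
        feat = feat + self.fuse(deep)                       # global residual connection
        return self.upsample(feat)                          # (B, 3, scale*H, scale*W)


if __name__ == "__main__":
    g = HybridSRGenerator()
    print(g(torch.randn(1, 3, 32, 32)).shape)               # torch.Size([1, 3, 128, 128])
```

In the paper's setting, the discriminator is a simplified Swin Transformer that scores generated versus ground-truth high-resolution images during adversarial training; it is omitted from this sketch.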
Pages: 5662-5673
Number of pages: 12