A Unified Generative Adversarial Network With Convolution and Transformer for Remote Sensing Image Fusion

Cited by: 0
Authors
Wu, Yuanyuan [1 ,2 ]
Huang, Mengxing [1 ,3 ]
Affiliations
[1] Hainan Univ, Sch Informat & Commun Engn, Haikou 570228, Peoples R China
[2] Guangdong Ocean Univ, Sch Elect & Informat Engn, Zhanjiang 524088, Peoples R China
[3] Hainan Univ, State Key Lab Marine Resource Utilizat South China, Haikou 570228, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Funding
National Natural Science Foundation of China;
Keywords
Spatial resolution; Image resolution; Transformers; Generative adversarial networks; Biological system modeling; Pansharpening; Data models; Bidirectional local-global feature encoder; convolution and Transformer; multihead cross-attention fusion; multiresolution convolutional Transformer discriminators; remote sensing image (RSI) unified fusion model; SATELLITE IMAGES; LANDSAT; QUALITY; REFLECTANCE; FRAMEWORK; MODEL; MS;
DOI
10.1109/TGRS.2024.3441719
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
Images derived from an individual sensor fail to simultaneously satisfy the demands of high spatial, spectral, and temporal resolutions. Multisource remote sensing image (RSI) fusion provides efficient access to high-spatial-resolution multispectral (HRMS) images [spatial-spectral fusion (SSF)] and high temporal- and spatial-resolution images [spatiotemporal fusion (STF)]. Most existing deep learning (DL)-based models implement either SSF or STF, so there is an urgent need for models that can implement both. A unified generative adversarial network with convolution and Transformer (CTUGAN) for SSF and STF is proposed. CTUGAN contains an adaptive convolutional Transformer generator (ACTG) and multiresolution convolutional Transformer discriminators (MCTDs), both of which combine convolution and Transformer. First, a bidirectional local-global feature encoder is devised in the ACTG to extract local-global features in both the high-to-low and the low-to-high resolution directions. Then, a multihead cross-attention fusion decoder (MCAFD) is devised to hierarchically aggregate and fuse complementary local-global features of various levels and resolutions to restore valuable information. Moreover, the MCTDs adversarially learn multiresolution local-global features to identify the relative reality of products, and a generalized loss function is built to accomplish full supervision. Finally, numerous experiments on the SSF data [Gaofen-2 (GF-2) and QuickBird] and STF data [Coleambally Irrigation Area (CIA) and lower Gwydir catchment (LGC)] demonstrate that the proposed CTUGAN model outperforms the compared methods in both subjective and objective evaluations.
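The abstract says the discriminators judge the "relative reality" of products but does not give the loss. One common formulation of that idea is the relativistic average GAN loss, in which each score is compared with the mean score of the opposite class. The sketch below illustrates that formulation only; that CTUGAN uses this exact form is an assumption, and the function names and the plain-list score inputs are hypothetical.

```python
import math


def _log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def _mean(values):
    return sum(values) / len(values)


def relativistic_average_d_loss(real_scores, fake_scores):
    """Discriminator loss: real samples should score above the average fake,
    and fake samples below the average real (assumed formulation)."""
    mean_real = _mean(real_scores)
    mean_fake = _mean(fake_scores)
    real_term = _mean([-_log_sigmoid(r - mean_fake) for r in real_scores])
    # log(1 - sigmoid(x)) equals log_sigmoid(-x).
    fake_term = _mean([-_log_sigmoid(-(f - mean_real)) for f in fake_scores])
    return real_term + fake_term


def relativistic_average_g_loss(real_scores, fake_scores):
    """Generator loss: the same comparison with the roles swapped."""
    mean_real = _mean(real_scores)
    mean_fake = _mean(fake_scores)
    fake_term = _mean([-_log_sigmoid(f - mean_real) for f in fake_scores])
    real_term = _mean([-_log_sigmoid(-(r - mean_fake)) for r in real_scores])
    return fake_term + real_term
```

With several discriminators operating at different resolutions, one plausible design is to evaluate such a loss per discriminator and sum the results; how the paper weights these terms is not stated in the abstract.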
Pages: 22