A Unified Generative Adversarial Network With Convolution and Transformer for Remote Sensing Image Fusion

Times cited: 0

Authors:

Wu, Yuanyuan [1,2]

Huang, Mengxing [1,3]

Affiliations:

[1] Hainan Univ, Sch Informat & Commun Engn, Haikou 570228, Peoples R China

[2] Guangdong Ocean Univ, Sch Elect & Informat Engn, Zhanjiang 524088, Peoples R China

[3] Hainan Univ, State Key Lab Marine Resource Utilizat South China, Haikou 570228, Peoples R China

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62

Funding:

National Natural Science Foundation of China;

Keywords:

Spatial resolution; Image resolution; Transformers; Generative adversarial networks; Biological system modeling; Pansharpening; Data models; Bidirectional local-global feature encoder; convolution and Transformer; multihead cross-attention fusion; multiresolution convolutional Transformer discriminators; remote sensing image (RSI) unified fusion model; SATELLITE IMAGES; LANDSAT; QUALITY; REFLECTANCE; FRAMEWORK; MODEL; MS;

DOI:

10.1109/TGRS.2024.3441719

Chinese Library Classification (CLC) number:

P3 [Geophysics]; P59 [Geochemistry];

Discipline classification code:

0708 ; 070902 ;

Abstract:

Images derived from an individual sensor fail to simultaneously satisfy the demands of high spatial, spectral, and temporal resolutions. Multisource remote sensing image (RSI) fusion provides efficient access to high-spatial-resolution multispectral (HRMS) images [spatial-spectral fusion (SSF)] and high temporal- and spatial-resolution images [spatiotemporal fusion (STF)]. While existing deep learning (DL)-based models mainly implement either SSF or STF, there is an urgent need for models that can implement both. A unified generative adversarial network with convolution and Transformer (CTUGAN) for SSF and STF is proposed. CTUGAN contains an adaptive convolutional Transformer generator (ACTG) and multiresolution convolutional Transformer discriminators (MCTDs), both of which combine convolution and Transformer. First, a bidirectional local-global feature encoder is devised in the ACTG to extract local-global features in both the high-to-low-resolution and low-to-high-resolution directions. Then, a multihead cross-attention fusion decoder (MCAFD) is devised to hierarchically aggregate and fuse complementary local-global features of various levels and resolutions to restore valuable information. Moreover, the MCTDs adversarially learn multiresolution local-global features to identify the relative realism of the fused products, and a generalized loss function is built to accomplish full supervision. Finally, numerous experiments on the SSF data [Gaofen-2 (GF-2) and QuickBird] and STF data [Coleambally Irrigation Area (CIA) and lower Gwydir catchment (LGC)] demonstrate that the proposed CTUGAN model outperforms competing methods in both subjective and objective evaluations.
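
The abstract names a multihead cross-attention fusion decoder but does not give its internals. As a rough illustration of the general idea only, the sketch below shows plain multi-head scaled dot-product cross-attention in which tokens from one feature source attend to tokens from another. It is not the authors' MCAFD: the function names (`cross_attention_head`, `split_heads`, `multihead_cross_attention`), the toy inputs, and the omission of learned query/key/value projections are all assumptions made for brevity.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_head(queries, keys, values):
    """Scaled dot-product attention for one head.

    Queries come from one source; keys and values come from the other.
    Each output token is a weighted average of the value tokens.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def split_heads(tokens, num_heads):
    """Split the channel dimension of every token into num_heads equal slices."""
    d = len(tokens[0])
    assert d % num_heads == 0, "channel dimension must be divisible by num_heads"
    h = d // num_heads
    return [[t[i * h:(i + 1) * h] for t in tokens] for i in range(num_heads)]


def multihead_cross_attention(feat_a, feat_b, num_heads):
    """Tokens of feat_a attend to tokens of feat_b, head by head.

    Assumption: identity projections (no learned weights), so this only
    illustrates the attention pattern, not a trained fusion layer.
    """
    heads_a = split_heads(feat_a, num_heads)
    heads_b = split_heads(feat_b, num_heads)
    per_head = [cross_attention_head(qa, kb, kb) for qa, kb in zip(heads_a, heads_b)]
    # Concatenate the per-head outputs back along the channel dimension.
    return [sum((head[t] for head in per_head), []) for t in range(len(feat_a))]


# Hypothetical toy features: 2 tokens from source A, 3 tokens from source B, 4 channels each.
feat_a = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
feat_b = [[0.2, 0.1, 0.9, 0.3], [0.8, 0.4, 0.1, 0.7], [0.5, 0.5, 0.5, 0.5]]
fused = multihead_cross_attention(feat_a, feat_b, num_heads=2)
```

In this sketch the fused output keeps the token count of `feat_a` and the channel count of `feat_b`. A trained layer would normally add learned projections, residual connections, and normalization, none of which are modeled here.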


Pages: 22
