Text-to-Remote-Sensing-Image Generation With Structured Generative Adversarial Networks

Cited by: 11
Authors
Zhao, Rui [1 ,2 ,3 ]
Shi, Zhenwei [1 ,2 ,3 ]
Affiliations
[1] Beihang Univ, Image Proc Ctr, Sch Astronaut, Beijing 100191, Peoples R China
[2] Beihang Univ, Beijing Key Lab Digital Media, Beijing 100191, Peoples R China
[3] Beihang Univ, Sch Astronaut, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Remote sensing; Generators; Task analysis; Bridges; Sensors; Semantics; Image segmentation; Generative adversarial networks (GANs); remote sensing image synthesis; structural rationality; text description;
DOI
10.1109/LGRS.2021.3068391
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry];
Subject Classification Codes
0708 ; 070902 ;
Abstract
Synthesizing high-resolution remote sensing images from given text descriptions has great potential for expanding image data sets and thereby unleashing the power of deep learning in the remote sensing image processing field. However, no efficient research has been carried out on this formidable task yet. Given a remote sensing image, the structural rationality of ground objects is critical for judging whether it is real or fake; e.g., real bridges are always straight, so a sinuous one can easily be judged as fake. Inspired by this, we propose a multistage structured generative adversarial network (StrucGAN) that synthesizes remote sensing images in a structured way given the text descriptions. StrucGAN utilizes structural information extracted by an unsupervised segmentation module to enable the discriminators to distinguish images in a structured way. The generators of StrucGAN are thus forced to synthesize structurally reasonable image contents, which enhances image authenticity. The multistage framework enables StrucGAN to generate remote sensing images with increasing resolution stage by stage. Quantitative and qualitative experimental results show that the proposed StrucGAN outperforms the baseline and can synthesize high-resolution, realistic, structurally reasonable remote sensing images that are semantically consistent with the given text descriptions.
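The data flow the abstract describes — generators that raise resolution stage by stage, and discriminators that judge each image together with an unsupervised segmentation of it — can be sketched in miniature. The following is purely illustrative: the upsampling "generator", intensity-quantisation "segmentation", and region-variance "score" are toy NumPy stand-ins of our own invention, not the networks or losses used in StrucGAN.

```python
import numpy as np

def generator_stage(feat, upscale=2):
    """Toy 'generator' stage: nearest-neighbour upsampling of a feature map."""
    return feat.repeat(upscale, axis=0).repeat(upscale, axis=1)

def segment(img, n_labels=4):
    """Toy stand-in for the unsupervised segmentation module:
    quantise pixel intensities into n_labels structural regions."""
    edges = np.linspace(img.min(), img.max(), n_labels + 1)
    return np.digitize(img, edges[1:-1])  # labels in {0, ..., n_labels - 1}

def structured_score(img, seg):
    """Toy 'structured discriminator': scores the image jointly with its
    segmentation rather than from raw pixels alone."""
    region_means = [img[seg == k].mean() for k in np.unique(seg)]
    return float(np.var(region_means))  # higher = more distinct regions

rng = np.random.default_rng(0)
img = rng.random((16, 16))   # stage-0 output (e.g. decoded from a text embedding)
for _ in range(3):           # three stages: 16 -> 32 -> 64 -> 128
    img = generator_stage(img)
    score = structured_score(img, segment(img))
```

In the actual method this structured score would feed an adversarial loss, pushing the generators toward structurally plausible content; here it only shows where the segmentation enters the discriminator's input.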
Pages: 5
Related References
18 items in total
  • [1] Improving Text Encoding for Retro-Remote Sensing
    Bejiga, Mesay Belete
    Hoxha, Genc
    Melgani, Farid
    [J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2021, 18 (04) : 622 - 626
  • [2] Bejiga MB, 2020, 2020 MEDITERRANEAN AND MIDDLE-EAST GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (M2GARSS), P89, DOI 10.1109/M2GARSS47143.2020.9105139
  • [3] Retro-Remote Sensing: Generating Images From Ancient Texts
    Bejiga, Mesay Belete
    Melgani, Farid
    Vascotto, Antonio
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2019, 12 (03) : 950 - 960
  • [4] Generative Adversarial Networks
    Goodfellow, Ian
    Pouget-Abadie, Jean
    Mirza, Mehdi
    Xu, Bing
    Warde-Farley, David
    Ozair, Sherjil
    Courville, Aaron
    Bengio, Yoshua
    [J]. COMMUNICATIONS OF THE ACM, 2020, 63 (11) : 139 - 144
  • [5] Guided Image Filtering
    He, Kaiming
    Sun, Jian
    Tang, Xiaoou
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2013, 35 (06) : 1397 - 1409
  • [6] Exploring Models and Data for Remote Sensing Image Caption Generation
    Lu, Xiaoqiang
    Wang, Binqiang
    Zheng, Xiangtao
    Li, Xuelong
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2018, 56 (04) : 2183 - 2195
  • [7] MirrorGAN: Learning Text-to-image Generation by Redescription
    Qiao, Tingting
    Zhang, Jing
    Xu, Duanqing
    Tao, Dacheng
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 1505 - 1514
  • [8] Le Q, 2014, PR MACH LEARN RES, V32, P1188
  • [9] Reed S, 2016, ADV NEUR IN, V29
  • [10] Reed S, 2016, PR MACH LEARN RES, V48