CRS-Diff: Controllable Remote Sensing Image Generation With Diffusion Model

Cited by: 3

Authors:

Tang, Datao [1,2]

Cao, Xiangyong [1,2]

Hou, Xingsong [3]

Jiang, Zhongyuan [4]

Liu, Junmin [5]

Meng, Deyu [2,5,6]

Affiliations:

[1] Xi An Jiao Tong Univ, Sch Comp Sci & Technol, Xian 710049, Peoples R China

[2] Xi An Jiao Tong Univ, Key Lab Intelligent Networks & Network Secur, Minist Educ, Xian 710049, Peoples R China

[3] Xi An Jiao Tong Univ, Sch Informat & Commun Engn, Xian 710049, Shaanxi, Peoples R China

[4] Xidian Univ, Sch Cyber Engn, Xian 710049, Shaanxi, Peoples R China

[5] Xi An Jiao Tong Univ, Sch Math & Stat, Xian 710049, Shaanxi, Peoples R China

[6] Macau Univ Science and Technol, Macao Inst Syst Engn, Taipa, Macao, Peoples R China

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62

Keywords:

Diffusion models; Image synthesis; Image resolution; Text to image; Remote sensing; Training; Task analysis; Controllable generation; deep learning; diffusion model; remote sensing (RS) image;

DOI:

10.1109/TGRS.2024.3453414

Chinese Library Classification (CLC) number:

P3 [Geophysics]; P59 [Geochemistry];

Discipline classification codes:

0708 ; 070902 ;

Abstract:

The emergence of generative models has revolutionized the field of remote sensing (RS) image generation. Despite generating high-quality images, existing methods are limited in relying mainly on text control conditions, and thus do not always generate images accurately and stably. In this article, we propose CRS-Diff, a new RS generative framework specifically tailored for RS image generation, leveraging the inherent advantages of diffusion models while integrating more advanced control mechanisms. Specifically, CRS-Diff can simultaneously support text-condition, metadata-condition, and image-condition control inputs, thus enabling more precise control to refine the generation process. To effectively integrate multiple condition control information, we introduce a new conditional control mechanism to achieve multiscale feature fusion (FF), thus enhancing the guiding effect of control conditions. To the best of our knowledge, CRS-Diff is the first multiple-condition controllable RS generative model. Experimental results in single-condition and multiple-condition cases have demonstrated the superior ability of our CRS-Diff to generate RS images both quantitatively and qualitatively compared with previous methods. Additionally, our CRS-Diff can serve as a data engine that generates high-quality training data for downstream tasks, e.g., road extraction. The code is available at https://github.com/Sonettoo/CRS-Diff.


Pages: 14
