CRS-Diff: Controllable Remote Sensing Image Generation With Diffusion Model

Cited by: 3
Authors
Tang, Datao [1, 2]
Cao, Xiangyong [1, 2]
Hou, Xingsong [3]
Jiang, Zhongyuan [4]
Liu, Junmin [5]
Meng, Deyu [2, 5, 6]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Comp Sci & Technol, Xian 710049, Peoples R China
[2] Xi An Jiao Tong Univ, Key Lab Intelligent Networks & Network Secur, Minist Educ, Xian 710049, Peoples R China
[3] Xi An Jiao Tong Univ, Sch Informat & Commun Engn, Xian 710049, Shaanxi, Peoples R China
[4] Xidian Univ, Sch Cyber Engn, Xian 710049, Shaanxi, Peoples R China
[5] Xi An Jiao Tong Univ, Sch Math & Stat, Xian 710049, Shaanxi, Peoples R China
[6] Macau Univ Science and Technol, Macao Inst Syst Engn, Taipa, Macao, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Keywords
Diffusion models; Image synthesis; Image resolution; Text to image; Remote sensing; Training; Task analysis; Controllable generation; deep learning; diffusion model; remote sensing (RS) image
DOI
10.1109/TGRS.2024.3453414
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification code
0708; 070902
Abstract
The emergence of generative models has revolutionized the field of remote sensing (RS) image generation. Despite generating high-quality images, existing methods are limited in relying mainly on text control conditions, and thus do not always generate images accurately and stably. In this article, we propose CRS-Diff, a new RS generative framework specifically tailored for RS image generation, leveraging the inherent advantages of diffusion models while integrating more advanced control mechanisms. Specifically, CRS-Diff can simultaneously support text-condition, metadata-condition, and image-condition control inputs, thus enabling more precise control to refine the generation process. To effectively integrate multiple condition control information, we introduce a new conditional control mechanism to achieve multiscale feature fusion (FF), thus enhancing the guiding effect of control conditions. To the best of our knowledge, CRS-Diff is the first multiple-condition controllable RS generative model. Experimental results in single-condition and multiple-condition cases have demonstrated the superior ability of our CRS-Diff to generate RS images both quantitatively and qualitatively compared with previous methods. Additionally, our CRS-Diff can serve as a data engine that generates high-quality training data for downstream tasks, e.g., road extraction. The code is available at https://github.com/Sonettoo/CRS-Diff.
Pages: 14