StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators

Cited by: 209
Authors
Gal, Rinon [1, 2]
Patashnik, Or [1]
Maron, Haggai [2 ]
Bermano, Amit H. [1 ]
Chechik, Gal [2 ]
Cohen-Or, Daniel [1 ]
Affiliations
[1] Tel Aviv Univ, Tel Aviv, Israel
[2] NVIDIA, Tel Aviv, Israel
Source
ACM TRANSACTIONS ON GRAPHICS | 2022 / Vol. 41 / Issue 04
Keywords
Generator Domain Adaptation; Text-Guided Content Generation; Zero-Shot Training;
DOI
10.1145/3528223.3530164
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Can a generative model be trained to produce images from a specific domain, guided only by a text prompt, without seeing any image? In other words: can an image generator be trained "blindly"? Leveraging the semantic power of large-scale Contrastive Language-Image Pre-training (CLIP) models, we present a text-driven method that allows shifting a generative model to new domains, without having to collect even a single image. We show that through natural language prompts and a few minutes of training, our method can adapt a generator across a multitude of domains characterized by diverse styles and shapes. Notably, many of these modifications would be difficult or infeasible to reach with existing methods. We conduct an extensive set of experiments across a wide range of domains. These demonstrate the effectiveness of our approach, and show that our models preserve the latent-space structure that makes generative models appealing for downstream tasks. Code and videos are available at: stylegan-nada.github.io/
Pages: 13