StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators

Cited by: 276
Authors
Gal, Rinon [1, 2]
Patashnik, Or [1]
Maron, Haggai [2]
Bermano, Amit H. [1]
Chechik, Gal [2]
Cohen-Or, Daniel [1]
Affiliations
[1] Tel Aviv Univ, Tel Aviv, Israel
[2] NVIDIA, Tel Aviv, Israel
Source
ACM TRANSACTIONS ON GRAPHICS | 2022 / Vol. 41 / Issue 04
Keywords
Generator Domain Adaptation; Text-Guided Content Generation; Zero-Shot Training;
DOI
10.1145/3528223.3530164
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Can a generative model be trained to produce images from a specific domain, guided only by a text prompt, without seeing any image? In other words: can an image generator be trained "blindly"? Leveraging the semantic power of large-scale Contrastive Language-Image Pre-training (CLIP) models, we present a text-driven method that allows shifting a generative model to new domains, without having to collect even a single image. We show that through natural language prompts and a few minutes of training, our method can adapt a generator across a multitude of domains characterized by diverse styles and shapes. Notably, many of these modifications would be difficult or infeasible to reach with existing methods. We conduct an extensive set of experiments across a wide range of domains. These demonstrate the effectiveness of our approach, and show that our models preserve the latent-space structure that makes generative models appealing for downstream tasks. Code and videos are available at: stylegan-nada.github.io/
Pages: 13