StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators

Cited by: 276

Authors:

Gal, Rinon [1,2]

Patashnik, Or [1]

Maron, Haggai [2]

Bermano, Amit H. [1]

Chechik, Gal [2]

Cohen-Or, Daniel [1]

Affiliations:

[1] Tel Aviv Univ, Tel Aviv, Israel

[2] NVIDIA, Tel Aviv, Israel

Source:

ACM TRANSACTIONS ON GRAPHICS | 2022 / Vol. 41 / Issue 04

Keywords:

Generator Domain Adaptation; Text-Guided Content Generation; Zero-Shot Training

DOI:

10.1145/3528223.3530164

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Subject classification codes:

081202; 0835

Abstract:

Can a generative model be trained to produce images from a specific domain, guided only by a text prompt, without seeing any image? In other words: can an image generator be trained "blindly"? Leveraging the semantic power of large scale Contrastive-Language-Image-Pre-training (CLIP) models, we present a text-driven method that allows shifting a generative model to new domains, without having to collect even a single image. We show that through natural language prompts and a few minutes of training, our method can adapt a generator across a multitude of domains characterized by diverse styles and shapes. Notably, many of these modifications would be difficult or infeasible to reach with existing methods. We conduct an extensive set of experiments across a wide range of domains. These demonstrate the effectiveness of our approach, and show that our models preserve the latent-space structure that makes generative models appealing for downstream tasks. Code and videos available at: stylegan-nada.github.io/

Pages: 13
