Anycost GANs for Interactive Image Synthesis and Editing

Cited by: 58

Authors:

Lin, Ji [1,2]

Zhang, Richard [2]

Ganz, Frieder [2]

Han, Song [1]

Zhu, Jun-Yan [2,3]

Affiliations:

[1] MIT, Cambridge, MA 02139 USA

[2] Adobe Res, San Jose, CA 95110 USA

[3] CMU, Pittsburgh, PA USA

Source:

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021

Funding:

U.S. National Science Foundation

Keywords:

DOI:

10.1109/CVPR46437.2021.01474

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Generative adversarial networks (GANs) have enabled photorealistic image synthesis and editing. However, due to the high computational cost of large-scale generators (e.g., StyleGAN2), it usually takes seconds to see the results of a single edit on edge devices, which prevents an interactive user experience. In this paper, inspired by quick preview features in modern rendering software, we propose Anycost GAN for interactive natural image editing. We train the Anycost GAN to support elastic resolutions and channels for faster image generation at versatile speeds. Running subsets of the full generator produces outputs that are perceptually similar to those of the full generator, making them a good proxy for quick preview. By using sampling-based multi-resolution training, adaptive-channel training, and a generator-conditioned discriminator, the anycost generator can be evaluated at various configurations while achieving better image quality compared to separately trained models. Furthermore, we develop new encoder training and latent code optimization techniques to encourage consistency between the different sub-generators during image projection. Anycost GAN can be executed at various cost budgets (up to 10x computation reduction) and adapts to a wide range of hardware and latency requirements. When deployed on desktop CPUs and edge devices, our model can provide perceptually similar previews at 6-12x speedup, enabling interactive image editing. The code and demo are publicly available.
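The idea of "running subsets of the full generator" can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a tiny fully connected network in place of a convolutional generator, and the names `make_layer`, `run_layer`, `run_generator`, and the `ratio` parameter are hypothetical, chosen only for illustration. It shows how one set of weights can be evaluated at a reduced channel width by using only the leading channels of each layer.

```python
import random


def make_layer(in_ch, out_ch, seed):
    """Create a random weight matrix with out_ch rows and in_ch columns (toy stand-in for a layer)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_ch)] for _ in range(out_ch)]


def run_layer(weight, x, ratio=1.0):
    """Apply a linear layer using only the leading `ratio` fraction of output channels.

    The input may already be narrower than the full layer expects,
    so each weight row is sliced to the input length.
    """
    out_ch = max(1, int(len(weight) * ratio))
    in_ch = len(x)
    return [sum(w * v for w, v in zip(row[:in_ch], x)) for row in weight[:out_ch]]


def run_generator(layers, z, ratio=1.0):
    """Run the toy generator; hidden layers are narrowed, the output layer keeps all channels."""
    x = z
    for i, weight in enumerate(layers):
        is_last = i == len(layers) - 1
        x = run_layer(weight, x, 1.0 if is_last else ratio)
        if not is_last:
            x = [max(0.0, v) for v in x]  # ReLU on hidden activations
    return x


# Hypothetical sizes: 8-dim latent code, two 16-channel hidden layers, 3 output values.
layers = [make_layer(8, 16, seed=0), make_layer(16, 16, seed=1), make_layer(16, 3, seed=2)]
z = [random.Random(42).uniform(-1.0, 1.0) for _ in range(8)]

full = run_generator(layers, z, ratio=1.0)     # full-cost output
preview = run_generator(layers, z, ratio=0.5)  # half-width sub-network, shared weights
```

In this sketch the half-width preview shares every weight with the full network, so no second model is stored. Making the preview output resemble the full output is a matter of training, which the sketch does not cover.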

Pages: 14981-14991

Page count: 11
