Ensembling with Deep Generative Views

Cited by: 29

Authors:

Chai, Lucy [1,2]

Zhu, Jun-Yan [2,3]

Shechtman, Eli [2]

Isola, Phillip [1]

Zhang, Richard [2]

Affiliations:

[1] MIT, Cambridge, MA 02139 USA

[2] Adobe Res, San Jose, CA 95110 USA

[3] CMU, Pittsburgh, PA USA

Source:

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021

Funding:

National Science Foundation (USA);

DOI:

10.1109/CVPR46437.2021.01475

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose, simply by learning from unlabeled image collections. Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification. Using a pretrained generator, we first find the latent code corresponding to a given real input image. Applying perturbations to the code creates natural variations of the image, which can then be ensembled together at test time. We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars. Critically, we find that several design decisions are required to make this process work: the perturbation procedure, the weighting between the augmentations and the original image, and training the classifier on synthesized images can all impact the result. Currently, we find that while test-time ensembling with GAN-based augmentations can offer some small improvements, the remaining bottlenecks are the efficiency and accuracy of the GAN reconstructions, coupled with classifier sensitivities to artifacts in GAN-generated images.
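The pipeline described in the abstract (invert the image to a latent code, perturb the code, regenerate views, and combine the classifier outputs with a weight on the original image) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the linear `generate` function stands in for a pretrained generator such as StyleGAN2, `invert` uses least squares in place of a real GAN inversion procedure, `classify` is a toy softmax classifier, and the names `sigma`, `alpha`, and `num_views` are hypothetical. Averaging class probabilities is one plausible way to ensemble and is not confirmed by this record as the authors' exact choice.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, IMAGE_DIM, NUM_CLASSES = 8, 32, 2

# Hypothetical stand-ins: random linear maps instead of trained networks.
G = rng.normal(size=(IMAGE_DIM, LATENT_DIM))   # toy "generator" weights
W = rng.normal(size=(NUM_CLASSES, IMAGE_DIM))  # toy "classifier" weights


def generate(latent):
    """Map a latent code to a flattened toy image."""
    return G @ latent


def invert(image):
    """Find the latent code whose generated image best matches `image`.

    Least squares is exact for this linear toy generator; a real GAN
    would need an encoder and/or iterative optimization instead.
    """
    latent, *_ = np.linalg.lstsq(G, image, rcond=None)
    return latent


def classify(image):
    """Return softmax class probabilities for a flattened image."""
    logits = W @ image
    exp = np.exp(logits - logits.max())  # subtract max for stability
    return exp / exp.sum()


def ensemble_predict(image, num_views=8, sigma=0.1, alpha=0.5):
    """Blend the prediction on the original image with GAN-view predictions.

    alpha weights the original image; (1 - alpha) weights the mean
    prediction over views made by Gaussian perturbation of the latent code.
    """
    latent = invert(image)
    view_probs = [
        classify(generate(latent + sigma * rng.normal(size=latent.shape)))
        for _ in range(num_views)
    ]
    return alpha * classify(image) + (1.0 - alpha) * np.mean(view_probs, axis=0)


# Usage: a toy "real" image that lies in the generator's range.
image = generate(rng.normal(size=LATENT_DIM))
probs = ensemble_predict(image)
```

Setting `alpha=1.0` recovers the plain classifier on the original image, which makes the weighting between the original and its generated views easy to ablate.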

Citation

Pages: 14992-15002

Page count: 11

References: 70 in total
