Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation

Cited by: 154
Authors
Chen, Runfa [1 ]
Huang, Wenbing [1 ]
Huang, Binghui [1 ]
Sun, Fuchun [1 ]
Fang, Bin [1 ]
Affiliation
[1] Tsinghua University, THUAI, Department of Computer Science and Technology, Institute for Artificial Intelligence, State Key Laboratory of Intelligent Technology and Systems, Beijing National Research Center for Information Science and Technology (BNRist), Beijing, People's Republic of China
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR42600.2020.00819
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Unsupervised image-to-image translation is a central task in computer vision. Current translation frameworks abandon the discriminator once training is completed. This paper proposes a novel role for the discriminator: reusing it to encode the images of the target domain. The proposed architecture, termed NICE-GAN, exhibits two advantages over previous approaches: first, it is more compact, since no independent encoding component is required; second, the plug-in encoder is trained directly by the adversarial loss, making it more informative and more effectively trained when a multi-scale discriminator is applied. The main issue in NICE-GAN is that translation and discrimination are coupled through the shared encoder, which can cause training inconsistency in the min-max game of GAN training. To tackle this issue, we develop a decoupled training strategy in which the encoder is trained only when maximizing the adversarial loss and is kept frozen otherwise. Extensive experiments on four popular benchmarks demonstrate the superior performance of NICE-GAN over state-of-the-art methods in terms of FID, KID, and human preference. Comprehensive ablation studies are also carried out to isolate the validity of each proposed component. Our code is available at https://github.com/alpc91/NICE-GAN-pytorch.
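The decoupled training strategy from the abstract can be sketched in PyTorch. This is a minimal illustration, not the paper's implementation: toy linear modules stand in for the encoder E (the discriminator's early layers, reused by the generator), the classifier head C (so that C∘E is the discriminator), and the decoder G; all module names, shapes, and losses here are illustrative assumptions. The point shown is the parameter grouping: E sits in the discriminator's optimizer only, so it is updated when maximizing the adversarial loss and stays frozen during the generator update.

```python
# Sketch of NICE-GAN's decoupled training (toy 1-D "images"; the real model
# uses conv encoders and multi-scale discriminators on actual images).
import torch
import torch.nn as nn

torch.manual_seed(0)

E = nn.Linear(4, 4)   # encoder = reused early layers of the discriminator
C = nn.Linear(4, 1)   # classifier head; C(E(x)) is the discriminator output
G = nn.Linear(4, 4)   # decoder/generator operating on E's features

# Key design choice: E belongs to the discriminator's optimizer, NOT G's.
opt_D = torch.optim.SGD(list(E.parameters()) + list(C.parameters()), lr=0.1)
opt_G = torch.optim.SGD(G.parameters(), lr=0.1)

x_real = torch.randn(8, 4)   # target-domain batch (illustrative)
x_src = torch.randn(8, 4)    # source-domain batch (illustrative)
bce = nn.BCEWithLogitsLoss()

def d_step():
    # Discriminator step (maximize adversarial loss): E IS trained here.
    opt_D.zero_grad()
    fake = G(E(x_src)).detach()                      # no grad into G
    loss = bce(C(E(x_real)), torch.ones(8, 1)) + \
           bce(C(E(fake)), torch.zeros(8, 1))
    loss.backward()
    opt_D.step()

def g_step():
    # Generator step: E is kept frozen (the "decoupled" part).
    for p in E.parameters():
        p.requires_grad_(False)
    opt_G.zero_grad()
    fake = G(E(x_src))
    loss = bce(C(E(fake)), torch.ones(8, 1))         # try to fool C(E(.))
    loss.backward()
    opt_G.step()                                     # updates G only
    for p in E.parameters():
        p.requires_grad_(True)
```

Under this split, a generator update leaves E's weights untouched while a discriminator update moves them, which is exactly the training inconsistency fix the abstract describes.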
Pages: 8165-8174
Page count: 10