BEGAN v3: Avoiding Mode Collapse in GANs Using Variational Inference

Cited by: 24
Authors
Park, Sung-Wook [1 ]
Huh, Jun-Ho [2 ]
Kim, Jong-Chan [1 ]
Affiliations
[1] Sunchon Natl Univ, Dept Comp Engn, 255 Jungang Ro, Suncheon City 57922, Jeollanam Do, South Korea
[2] Korea Maritime & Ocean Univ, Dept Data Informat, 727 Taejong Ro, Busan 49112, South Korea
Funding
National Research Foundation of Singapore;
Keywords
deep learning; mode collapse; generative adversarial networks; boundary equilibrium generative adversarial networks; variational inference; computer vision; artificial intelligence;
DOI
10.3390/electronics9040688
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
In the field of deep learning, generative models attracted little attention until GANs (generative adversarial networks) appeared. In 2014, Ian Goodfellow proposed the generative model called GANs. GANs differ from earlier generative models in both structure and objective function: they use two neural networks, a generator that creates realistic images and a discriminator that distinguishes whether an input is real or synthetic. If training proceeds without problems, GANs can generate images whose authenticity is difficult even for experts to judge. GANs are currently among the most actively researched subjects in computer vision, covering image style translation, synthesis, and generation; numerous models have been unveiled, and the issues they raise are being resolved one by one. In image synthesis, BEGAN (Boundary Equilibrium Generative Adversarial Network), which outperforms previously announced GANs, learns the latent space of the image while balancing the generator and discriminator. Nonetheless, BEGAN also suffers from mode collapse, in which the generator produces only a few images, or even a single one. Although BEGAN-CS (Boundary Equilibrium Generative Adversarial Network with Constrained Space) improved the loss function, it did not solve mode collapse. The discriminator of BEGAN-CS is an AE (autoencoder), which cannot build a particularly useful or structured latent space, and its compression performance is also poor. In this paper, we consider these characteristics of the AE to be related to the occurrence of mode collapse, and therefore use a VAE (variational autoencoder), which adds statistical techniques to the AE. In our experiments, the proposed model did not suffer mode collapse and converged to a better state than BEGAN-CS.
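The abstract's key distinction between an AE and a VAE is that the VAE encodes each input as a distribution (mean and variance) rather than a point, sampling latents via the reparameterization trick and regularizing them with a KL divergence toward a standard normal prior. The following NumPy sketch illustrates those two ingredients only; it is not the authors' implementation, and the array shapes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I), keeping the path differentiable."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_divergence(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ) per sample, summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)

# Hypothetical encoder outputs: batch of 4 samples, 8 latent dimensions.
mu = np.zeros((4, 8))
log_var = np.zeros((4, 8))

z = reparameterize(mu, log_var, rng)
print(z.shape)                      # (4, 8)
print(kl_divergence(mu, log_var))   # zeros: the posterior already matches the prior
```

In a full VAE discriminator (as in the proposed model), this KL term would be added to the reconstruction loss, pushing the latent space toward a smooth, structured distribution, which is the property the plain AE in BEGAN-CS lacks.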
Pages: 31