Balancing Reconstruction Error and Kullback-Leibler Divergence in Variational Autoencoders

Cited by: 61
Authors
Asperti, Andrea [1 ]
Trentin, Matteo [1 ]
Affiliations
[1] University of Bologna, Department of Informatics - Science and Engineering (DISI), I-40127 Bologna, Italy
Keywords
Image reconstruction; Training; Gaussian distribution; Shape; Mathematical model; Probabilistic logic; Data models; Generative models; likelihood-based frameworks; Kullback-Leibler divergence; two-stage generation; variational autoencoders
DOI
10.1109/ACCESS.2020.3034828
CLC Number (Chinese Library Classification)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Likelihood-based generative frameworks are receiving increasing attention in the deep learning community, mostly on account of their strong probabilistic foundation. Among them, Variational Autoencoders (VAEs) are valued for their fast and tractable sampling and relatively stable training, but if not properly tuned they may easily produce poor generative performance. The loss function of a Variational Autoencoder is the sum of two components with somewhat conflicting effects: the reconstruction loss, which improves the quality of the resulting images, and the Kullback-Leibler divergence, which acts as a regularizer of the latent space. Correctly balancing these two components is a delicate issue, and one of the major problems of VAEs. Recent techniques address the problem by letting the network learn the balancing factor during training, according to a suitable loss function. In this article, we show that learning can be replaced by a simple deterministic computation, expressing the balancing factor in terms of a running average of the reconstruction error over the last minibatches. As a result, we keep a constant balance between the two components throughout training: as reconstruction improves, we proportionally decrease the KL divergence, preventing it from prevailing and blocking further improvements in reconstruction quality. Our technique is simple and effective: it clarifies the learning objective for the balancing factor, and it produces faster and more accurate behaviour. On typical datasets such as CIFAR-10 and CelebA, our technique markedly outperforms all previous VAE architectures with comparable parameter capacity.
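To make the deterministic balancing concrete, below is a minimal PyTorch sketch, not the authors' implementation: the names rec_ema and momentum are illustrative, and an exponential moving average is assumed as one way to realize the running average of the reconstruction error over the last minibatches.

import torch

def balanced_vae_loss(x: torch.Tensor, x_hat: torch.Tensor,
                      mu: torch.Tensor, logvar: torch.Tensor,
                      rec_ema: float, momentum: float = 0.99):
    # Per-sample squared reconstruction error, summed over pixels
    # and averaged over the minibatch.
    rec = ((x - x_hat) ** 2).flatten(1).sum(dim=1).mean()

    # Closed-form KL divergence between q(z|x) = N(mu, diag(exp(logvar)))
    # and the standard normal prior N(0, I).
    kl = -0.5 * (1.0 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()

    # Running average of the reconstruction error, kept as a plain float
    # so that no gradient flows through the balancing factor.
    rec_ema = momentum * rec_ema + (1.0 - momentum) * rec.item()

    # Weight the KL term by the current error estimate: as reconstruction
    # improves, rec_ema shrinks and the KL term is proportionally reduced,
    # so it cannot prevail and block further reconstruction progress.
    return rec + rec_ema * kl, rec_ema

In a training loop, rec_ema would be initialized from the reconstruction error of the first minibatch and threaded through successive calls; the ratio between the two loss components then stays roughly constant as training progresses, which is the behaviour the abstract describes.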
Pages: 199440 - 199448
Number of pages: 9