A Connection Between Score Matching and Denoising Autoencoders

Cited by: 646

Author:

Vincent, Pascal [1]

Affiliation:

[1] Univ Montreal, Dept Informat, Montreal, PQ H3C 3J7, Canada

Source:

NEURAL COMPUTATION | 2011 / Vol. 23 / No. 07

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

CONTRASTIVE DIVERGENCE;

DOI:

10.1162/NECO_a_00142

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Denoising autoencoders have been previously shown to be competitive alternatives to restricted Boltzmann machines for unsupervised pretraining of each layer of a deep architecture. We show that a simple denoising autoencoder training criterion is equivalent to matching the score (with respect to the data) of a specific energy-based model to that of a nonparametric Parzen density estimator of the data. This yields several useful insights. It defines a proper probabilistic model for the denoising autoencoder technique, which makes it in principle possible to sample from them or rank examples by their energy. It suggests a different way to apply score matching that is related to learning to denoise and does not require computing second derivatives. It justifies the use of tied weights between the encoder and decoder and suggests ways to extend the success of denoising autoencoders to a larger family of energy-based models.
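The equivalence stated in the abstract can be checked numerically. The sketch below is illustrative only and rests on assumptions that the abstract does not spell out: Gaussian corruption with standard deviation `sigma`, an autoencoder with sigmoid hidden units, a linear decoder and tied weights `W`, and an energy of the form E(x) = -(c·x - ½‖x‖² + Σ_j softplus(W_j·x + b_j)) / σ². All function and variable names (`score`, `dsm_loss`, `dae_loss`, and so on) are hypothetical. Under these assumptions, the denoising score matching loss should equal the squared reconstruction error of the denoising autoencoder divided by 2σ⁴, and no second derivatives are needed.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def score(x, W, b, c, sigma):
    # Model score psi(x) = -dE/dx for the assumed energy
    # E(x) = -(c.x - 0.5*||x||^2 + sum_j softplus(W_j.x + b_j)) / sigma^2
    return (sigmoid(x @ W.T + b) @ W + c - x) / sigma**2

def dsm_loss(x, x_noisy, W, b, c, sigma):
    # Denoising score matching: match the model score at the corrupted point
    # to the score of the Gaussian corruption kernel, (x - x_noisy) / sigma^2.
    target = (x - x_noisy) / sigma**2
    diff = score(x_noisy, W, b, c, sigma) - target
    return 0.5 * np.mean(np.sum(diff**2, axis=1))

def dae_loss(x, x_noisy, W, b, c):
    # Denoising autoencoder with tied weights and a linear decoder:
    # reconstruct the clean x from the corrupted x_noisy.
    recon = sigmoid(x_noisy @ W.T + b) @ W + c
    return np.mean(np.sum((recon - x)**2, axis=1))

# Toy data and parameters (arbitrary values chosen for illustration).
rng = np.random.default_rng(0)
n, d, h, sigma = 256, 5, 8, 0.3
x = rng.normal(size=(n, d))
x_noisy = x + sigma * rng.normal(size=(n, d))
W = 0.1 * rng.normal(size=(h, d))
b = np.zeros(h)
c = np.zeros(d)

dsm = dsm_loss(x, x_noisy, W, b, c, sigma)
dae = dae_loss(x, x_noisy, W, b, c)
print(dsm, dae / (2 * sigma**4))
```

The two printed values agree up to floating-point error because the `x_noisy` terms cancel inside the squared norm, leaving only the reconstruction residual scaled by 1/σ². The tied weights appear here as the same `W` used for both encoding and decoding, which is what the gradient of the assumed energy produces.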


Pages: 1661-1674

Page count: 14

