Learning mixture models with the regularized latent maximum entropy principle

Cited by: 3

Authors:

Wang, SJ ^{[1]}

Schuurmans, D

Peng, FC

Zhao, YX

Affiliations:

[1] Univ Alberta, Dept Comp Sci, Edmonton, AB T6G 2E8, Canada

[2] Univ Massachusetts, Dept Comp Sci, Amherst, MA 01003 USA

[3] Univ Missouri, Dept Comp Engn & Comp Sci, Columbia, MO 65201 USA

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS | 2004 / Vol. 15 / No. 4

Keywords:

expectation maximization (EM); iterative scaling; latent variables; maximum entropy; mixture models; regularization;

DOI:

10.1109/TNN.2004.828755

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a new approach to estimating mixture models based on a recent inference principle we have proposed: the latent maximum entropy principle (LME). LME is different from Jaynes' maximum entropy principle, standard maximum likelihood, and maximum a posteriori probability estimation. We demonstrate the LME principle by deriving new algorithms for mixture model estimation, and show how robust new variants of the expectation maximization (EM) algorithm can be developed. We show that a regularized version of LME (RLME) is effective at estimating mixture models. It generally yields better results than plain LME, which in turn is often better than maximum likelihood and maximum a posteriori estimation, particularly when inferring latent variable models from small amounts of data.

Pages: 903-916

Page count: 14
