Bayesian k-Means as a "Maximization-Expectation" Algorithm

Cited by: 25
Authors
Kurihara, Kenichi [1]
Welling, Max [2]
Affiliations
[1] Tokyo Inst Technol, Dept Comp Sci, Meguro Ku, Tokyo 1528552, Japan
[2] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA
Funding
US National Science Foundation
Keywords
MIXTURE
DOI
10.1162/neco.2008.12-06-421
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
We introduce a new class of "maximization-expectation" (ME) algorithms where we maximize over hidden variables but marginalize over random parameters. This reverses the roles of expectation and maximization in the classical expectation-maximization algorithm. In the context of clustering, we argue that these hard assignments open the door to very fast implementations based on data structures such as kd-trees and conga lines. The marginalization over parameters ensures that we retain the ability to infer model structure (i.e., number of clusters). As an important example, we discuss a top-down Bayesian k-means algorithm and a bottom-up agglomerative clustering algorithm. In experiments, we compare these algorithms against a number of alternative algorithms that have recently appeared in the literature.
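The core ME idea described in the abstract can be illustrated with a minimal sketch: cluster labels are updated by hard maximization, while the cluster means are integrated out under a conjugate Gaussian prior so that each point is scored by a posterior-predictive likelihood. This is only an illustrative toy, not the paper's actual algorithm; the known noise variance `sigma2`, prior variance `tau2`, and the fixed number of clusters `K` are simplifying assumptions (the paper additionally infers the number of clusters and uses kd-tree/conga-line accelerations).

```python
import numpy as np

def me_cluster(X, K, sigma2=1.0, tau2=10.0, n_iters=20, seed=0):
    """Toy ME-style clustering: maximize over hidden labels,
    marginalize over cluster means.

    Assumptions (not from the paper): spherical Gaussian clusters with
    known variance sigma2 and a zero-mean Gaussian prior (variance tau2)
    on each cluster mean, so the mean can be integrated out in closed form.
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    z = rng.integers(K, size=N)              # random initial hard assignments
    for _ in range(n_iters):
        for n in range(N):
            scores = np.empty(K)
            for k in range(K):
                # cluster-k members, excluding point n itself
                mask = (z == k)
                mask[n] = False
                m, s = mask.sum(), X[mask].sum(axis=0)
                # posterior over the (marginalized) cluster mean
                post_var = 1.0 / (1.0 / tau2 + m / sigma2)
                post_mean = post_var * s / sigma2
                # predictive distribution for x_n with the mean integrated out
                pred_var = post_var + sigma2
                diff = X[n] - post_mean
                scores[k] = (-0.5 * (diff @ diff) / pred_var
                             - 0.5 * D * np.log(2 * np.pi * pred_var))
            z[n] = int(np.argmax(scores))    # hard "M-step" over the label
    return z
```

Because the parameters are marginalized rather than point-estimated, an empty cluster still has a well-defined (broad, prior-driven) predictive score, which is one reason hard-assignment ME retains the ability to reason about model structure.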
Pages: 1145-1172 (28 pages)