Clustering with mixtures of log-concave distributions

Cited by: 35
Authors
Chang, George T. [1]
Walther, Guenther [1]
Affiliation
[1] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
EM algorithm; log-concave distribution; clustering; normal copula;
DOI
10.1016/j.csda.2007.01.008
Chinese Library Classification
TP39 [Computer applications];
Subject classification codes
081203; 0835;
Abstract
The EM algorithm is a popular tool for clustering observations via a parametric mixture model. This approach has two disadvantages: its success depends on the appropriateness of the assumed parametric model, and each model requires a different implementation of the EM algorithm based on model-specific theoretical derivations. We show how this algorithm can be extended to work with the flexible, nonparametric class of log-concave component distributions. The resulting algorithm has two advantages. First, it is not restricted to parametric models, so it no longer requires specifying such a model, and its results are no longer sensitive to misspecification of that model. Second, only one implementation of the algorithm is necessary. Furthermore, simulation studies based on the normal mixture model suggest that there is no noticeable performance penalty for this more general nonparametric algorithm relative to the parametric EM algorithm in the special case where the assumed parametric model is indeed correct. (C) 2007 Elsevier B.V. All rights reserved.
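For orientation, the parametric baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of EM for a univariate normal mixture, not the authors' log-concave algorithm; the function name `em_normal_mixture`, the quantile-based initialisation, the fixed iteration count, and the variance floor are all assumptions made for this sketch.

```python
import numpy as np


def em_normal_mixture(x, k=2, n_iter=200):
    """Hypothetical sketch: EM for a univariate k-component normal mixture."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Assumed initialisation: means at evenly spaced quantiles,
    # a common variance, and equal mixing weights.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each
        # component, computed on the log scale for numerical stability.
        log_dens = -0.5 * (np.log(2 * np.pi * var) + (x[:, None] - mu) ** 2 / var)
        log_w = np.log(pi) + log_dens
        log_w -= log_w.max(axis=1, keepdims=True)
        resp = np.exp(log_w)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: weighted maximum likelihood updates. These closed-form
        # updates are specific to the normal model.
        nk = resp.sum(axis=0)
        pi = nk / n
        mu = resp.T @ x / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        var = np.maximum(var, 1e-8)  # assumed floor to avoid degenerate components
    # Cluster labels: the component with the largest posterior probability.
    return pi, mu, var, resp.argmax(axis=1)
```

The M-step is the model-specific part: a different parametric family would need different update formulas, which is the second disadvantage the abstract names.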
Pages: 6242-6251
Number of pages: 10