Clustering univariate observations via mixtures of unimodal normal mixtures

Cited by: 8
Authors
Bartolucci, F [1]
Affiliations
[1] Univ Urbino, Inst Sci Econ, I-61029 Urbino, Italy
Keywords
Bayesian information criterion; constrained maximization; density estimation; EM algorithm;
DOI
10.1007/s00357-005-0014-7
Chinese Library Classification
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
A mixture model is proposed in which each component is modelled flexibly through a unimodal mixture of normal distributions with the same variance and equispaced support points. The main application of the model is clustering univariate observations, where each component identifies a different cluster; conventional mixture models may overestimate the number of clusters when the component distribution is misspecified. Maximum likelihood estimation of the model is carried out through an EM algorithm in which the maximization of the complete log-likelihood under the constraint of unimodality is performed by solving a series of least squares problems under linear inequality constraints. The Bayesian Information Criterion is used to select the number of components. A simulation study shows that this criterion performs well even when the true component distribution has strong skewness and/or kurtosis. This is due to the flexibility of the proposed model and is particularly useful when the model is used for clustering.
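The model structure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the parameterization (`start`, `step`, `weights`, `sd`), and the reading of the unimodality constraint as "weights rise, then fall" are assumptions made here for illustration. The BIC is written in the common convention -2 log L + k log n (lower is better); the record does not say which convention the paper uses. The EM fitting step with constrained least squares is not shown.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def is_unimodal(weights):
    """Assumed form of the constraint: weights are non-decreasing, then non-increasing."""
    i, n = 0, len(weights)
    while i + 1 < n and weights[i] <= weights[i + 1]:
        i += 1
    while i + 1 < n and weights[i] >= weights[i + 1]:
        i += 1
    return i == n - 1


def component_density(x, start, step, weights, sd):
    """One cluster: normals with common sd on equispaced support points start + j * step."""
    return sum(w * normal_pdf(x, start + j * step, sd) for j, w in enumerate(weights))


def mixture_density(x, proportions, components):
    """Overall density: a mixture, with the given proportions, of the cluster densities."""
    return sum(p * component_density(x, **c) for p, c in zip(proportions, components))


def bic(loglik, n_params, n_obs):
    """BIC in the -2 log L + k log n convention (an assumption, see lead-in)."""
    return -2.0 * loglik + n_params * math.log(n_obs)


# Hypothetical two-cluster example; all values are made up for illustration.
clusters = [
    {"start": -1.0, "step": 1.0, "weights": [0.25, 0.5, 0.25], "sd": 1.0},
    {"start": 5.0, "step": 0.5, "weights": [0.6, 0.3, 0.1], "sd": 0.5},
]
density_at_zero = mixture_density(0.0, [0.7, 0.3], clusters)
```

Each cluster here is itself a small mixture, so a skewed cluster (such as the second one above) can be represented by a single component rather than by several separate normal components.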
Pages: 203-219
Number of pages: 17