Robust estimation in the normal mixture model based on robust clustering

Cited by: 34
Authors
Cuesta-Albertos, J. A. [1]
Matran, C. [2]
Mayo-Iscar, A. [2]
Affiliations
[1] Univ Cantabria, Fac Ciencias, Dept Matemat Estadist & Computac, E-39005 Santander, Spain
[2] Univ Valladolid, E-47002 Valladolid, Spain
Keywords
asymptotics; breakdown point; censored maximum likelihood; EM algorithm; identifiability; influence function; multivariate normal mixture model; trimmed k-means
DOI
10.1111/j.1467-9868.2008.00657.x
Chinese Library Classification (CLC) number
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract
We introduce a robust estimation procedure that is based on the choice of a representative trimmed subsample through an initial robust clustering procedure, and subsequent improvements based on maximum likelihood. To obtain the initial trimming we resort to the trimmed k-means, a simple procedure designed for finding the core of the clusters under appropriate configurations. By handling the trimmed data as censored, maximum likelihood estimation provides in each step the location and shape of the next trimming. Data-driven restrictions on the parameters, requiring that every distribution in the mixture must be sufficiently represented in the initial clustered region, allow singularities to be avoided and guarantee the existence of the estimator. Our analysis includes robustness properties and asymptotic results as well as worked examples.
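The initial trimming step mentioned in the abstract can be illustrated with a minimal sketch of trimmed k-means. This is not the authors' implementation: the function name `trimmed_kmeans`, its parameters, and the requirement that the caller supplies starting centers are assumptions made for illustration. It covers only the trimming idea (discard the fraction `alpha` of points farthest from their nearest center, then update the centers from the rest) and does not attempt the censored maximum likelihood refinement or the parameter restrictions.

```python
import math


def trimmed_kmeans(points, centers, alpha, n_iter=100):
    """Illustrative trimmed k-means (hypothetical helper, not the paper's code).

    points  : list of equal-length tuples of floats
    centers : list of k starting centers (assumed to be supplied by the caller)
    alpha   : trimming fraction in [0, 1)
    Returns (centers, labels), where label -1 marks a trimmed point.
    """
    n = len(points)
    n_keep = n - int(math.floor(alpha * n))  # number of points left untrimmed
    centers = [tuple(c) for c in centers]
    kept = []
    for _ in range(n_iter):
        # For each point: squared distance to its nearest center, and that center's index.
        nearest = []
        for i, p in enumerate(points):
            d2, j = min((sum((a - b) ** 2 for a, b in zip(p, c)), j)
                        for j, c in enumerate(centers))
            nearest.append((d2, i, j))
        # Keep the n_keep points closest to their center; the rest are trimmed.
        nearest.sort()
        kept = sorted((i, j) for _, i, j in nearest[:n_keep])
        # Update each center as the mean of its untrimmed members.
        new_centers = []
        for j in range(len(centers)):
            members = [points[i] for i, cj in kept if cj == j]
            if members:
                dim = len(members[0])
                new_centers.append(tuple(sum(m[t] for m in members) / len(members)
                                         for t in range(dim)))
            else:
                new_centers.append(centers[j])  # empty cluster: leave center unchanged
        if new_centers == centers:
            break
        centers = new_centers
    # Labels come from the last assignment step performed in the loop.
    labels = [-1] * n
    for i, j in kept:
        labels[i] = j
    return centers, labels


# Two tight groups on the x-axis plus two gross outliers.
pts = ([(0.1 * i, 0.0) for i in range(10)]
       + [(10.0 + 0.1 * i, 0.0) for i in range(10)]
       + [(100.0, 100.0), (-100.0, 50.0)])
centers, labels = trimmed_kmeans(pts, centers=[(1.0, 0.0), (9.0, 0.0)], alpha=0.1)
```

In this toy run the two outliers are the trimmed points, so the centers are computed from the two groups only. In practice the result depends on the starting centers, so several starts are commonly tried.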
Pages: 779-802
Number of pages: 24