Flexible clustering via extended mixtures of common t-factor analyzers

Authors:

Wan-Lun Wang

Tsung-I Lin

Affiliations:

[1] Feng Chia University, Department of Statistics, Graduate Institute of Statistics and Actuarial Science

[2] National Chung Hsing University, Institute of Statistics

[3] China Medical University, Department of Public Health

Source:

AStA Advances in Statistical Analysis | 2017 / Vol. 101

Keywords:

Clustering; Classification; Factor loadings; Mixture models; Outlier detection; 62H25; 62H30

Abstract:

Mixtures of t-factor analyzers have been broadly used for model-based density estimation and clustering of high-dimensional data from a heterogeneous population with longer-than-normal tails or atypical observations. To reduce the number of parameters in the component covariance matrices, mixtures of common t-factor analyzers (MCtFA) have recently been proposed, which assume a common factor loading matrix across components. In this paper, we present an extended version of MCtFA that uses distinct covariance matrices for the component errors. The modified mixture model offers a more appropriate way to represent the data graphically. Two flexible EM-type algorithms are developed for iteratively computing maximum likelihood estimates of the parameters. Practical considerations for the specification of starting values, model-based clustering, classification of new subjects, and identification of potential outliers are also provided. We demonstrate the superiority of the proposed methodology through an analysis of the Italian wine data and a simulation study.
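
The covariance structure described in the abstract can be sketched as below. This is an illustrative reading only, and all notation is an assumption rather than something stated in this record: g is the number of components, p the data dimension, q < p the number of factors, A the common p x q loading matrix, \xi_i and \Omega_i the component-specific factor mean and covariance, \nu_i the degrees of freedom, and D, D_i diagonal covariance matrices of the component errors.

```latex
% Sketch under assumed notation; not copied from the paper.
\begin{align*}
  f(y) &= \sum_{i=1}^{g} \pi_i \, t_p\!\left(y \mid \mu_i, \Sigma_i, \nu_i\right),
          \qquad \mu_i = A \xi_i, \\
  \text{MCtFA (common error covariance):} \quad
  \Sigma_i &= A \Omega_i A^{\top} + D, \\
  \text{extended version (distinct error covariances):} \quad
  \Sigma_i &= A \Omega_i A^{\top} + D_i .
\end{align*}
```

Under this reading, the only change from MCtFA is that the single error covariance D is replaced by one D_i per component, while the loading matrix A remains shared.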

Pages: 227-252

Page count: 25
