A countably infinite mixture model for clustering and feature selection

Cited by: 0
Authors
Nizar Bouguila
Djemel Ziou
Affiliations
[1] CIISE
[2] Concordia University
[3] Université de Sherbrooke
Source
Knowledge and Information Systems | 2012 / Vol. 33
Keywords
Non-parametric Bayesian methods; Dirichlet process; Clustering; Feature selection; Mixture models; Generalized Dirichlet; MCMC; Categorization
Abstract
Mixture modeling is one of the most useful tools in machine learning and data mining applications. An important challenge when applying finite mixture models is selecting the number of clusters that best describes the data. Recent developments have shown that this problem can be handled by applying non-parametric Bayesian techniques to mixture modeling. Another crucial preprocessing step in mixture learning is the selection of the most relevant features. The main approach in this paper to tackle these problems consists of storing the knowledge in a generalized Dirichlet mixture model by applying non-parametric Bayesian estimation and inference techniques. Specifically, we extend finite generalized Dirichlet mixture models to the infinite case, in which the number of components and the relevant features do not need to be known a priori. This extension provides a natural representation of uncertainty regarding the challenging problem of model selection. We propose a Markov chain Monte Carlo algorithm to learn the resulting infinite mixture. Through applications involving text and image categorization, we show that infinite mixture models offer more powerful and robust performance than classic finite mixtures for both clustering and feature selection.
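To illustrate why an infinite mixture does not require the number of components in advance, the following is a minimal sketch of the Chinese restaurant process, a standard way to draw cluster assignments from a Dirichlet process prior. It is not the paper's algorithm: the generalized Dirichlet likelihood, the feature selection step, and the MCMC sampler are all omitted, and the function name `crp_assignments` and the concentration parameter `alpha` are assumptions introduced only for this example.

```python
import random


def crp_assignments(n_points, alpha, seed=0):
    """Draw cluster labels for n_points from a Chinese restaurant process.

    alpha (> 0) is a hypothetical concentration parameter: larger values
    make opening a new cluster more likely.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of points currently in cluster k
    labels = []
    for _ in range(n_points):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels, counts


labels, counts = crp_assignments(n_points=200, alpha=1.0)
print(len(counts), "clusters; sizes:", counts)
```

The number of clusters is an outcome of the draw rather than an input, which is the property the abstract relies on when it says the number of components need not be known a priori.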
Pages: 351-370
Page count: 19