Infinite Liouville mixture models with application to text and texture categorization

Cited by: 37
Author
Bouguila, Nizar [1]
Affiliation
[1] Concordia Univ, Fac Engn & Comp Sci, Concordia Inst Informat Syst Engn, Montreal, PQ H3G 2W1, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Liouville family of distributions; Infinite mixture models; Proportional data; Nonparametric Bayesian inference; MCMC; Gibbs sampling; UNSUPERVISED SELECTION; DIRICHLET; CLASSIFICATION; DISTRIBUTIONS; ESTIMATORS;
DOI
10.1016/j.patrec.2011.09.037
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the problem of modeling and clustering proportional data using mixture models, a problem of great interest and importance for many practical pattern recognition, image processing, data mining and computer vision applications. Finite mixture models are broadly applicable to clustering problems, but they involve the challenging task of selecting the number of clusters, which requires a trade-off: the number of clusters must be large enough to provide the discriminating capability a given application requires, yet employing too many clusters leads to overfitting and employing too few leads to underfitting. Here we approach the modeling and clustering of proportional data using infinite mixtures, which have been shown to be an efficient alternative to finite mixtures because they avoid the selection of an optimal number of mixture components. In particular, we propose and discuss an infinite Liouville mixture model whose parameters are fitted to the data through a principled Bayesian algorithm that we have developed and that allows uncertainty in the number of mixture components. Our experimental evaluation involves two challenging applications, namely text classification and texture discrimination, and suggests that the proposed approach can be an excellent choice for proportional data modeling. (C) 2011 Elsevier B.V. All rights reserved.
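The paper's own algorithm (Liouville component densities fitted by a purpose-built Bayesian MCMC/Gibbs scheme) is not reproduced here. As a minimal sketch of the infinite-mixture idea the abstract describes, the snippet below clusters synthetic proportional data with scikit-learn's truncated Dirichlet-process Gaussian mixture after a log-ratio transform; the synthetic data, the transform, and all parameter choices are illustrative assumptions, not the author's method.

```python
# Illustrative sketch only: a Dirichlet-process Gaussian mixture as a
# stand-in for the paper's infinite Liouville mixture. Proportional
# (simplex-valued) data are mapped to R^(D-1) by an additive log-ratio
# transform so that Gaussian components are a reasonable surrogate.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)

# Synthetic proportional data: two clusters of 3-part compositions
# drawn from Dirichlet distributions with different parameters.
X = np.vstack([
    rng.dirichlet([10.0, 2.0, 2.0], size=200),
    rng.dirichlet([2.0, 8.0, 6.0], size=200),
])

# Additive log-ratio transform (last component as reference).
eps = 1e-9
alr = np.log(X[:, :-1] + eps) - np.log(X[:, -1:] + eps)

# Truncated Dirichlet-process mixture: n_components is only an upper
# bound; weights of superfluous components are driven toward zero.
dpgmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="full",
    max_iter=500,
    random_state=0,
)
labels = dpgmm.fit_predict(alr)

# Effective number of clusters actually used by the fitted model.
print("effective clusters:", len(np.unique(labels)))
print("mixture weights:", np.round(dpgmm.weights_, 3))
```

The point of the sketch is that n_components acts only as a truncation level: the Dirichlet-process prior concentrates mass on as many components as the data support, so the effective number of clusters is inferred rather than fixed in advance, which is the behaviour the paper obtains with its infinite Liouville mixture and MCMC inference.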
Pages: 103-110
Number of pages: 8