Variational learning of hierarchical infinite generalized Dirichlet mixture models and applications

Cited by: 16
Authors
Fan, Wentao [1]
Sallay, Hassen [2]
Bouguila, Nizar [3]
Bourouis, Sami [4]
Affiliations
[1] Huaqiao Univ, Dept Comp Sci & Technol, Xiamen, Peoples R China
[2] Umm Al Qura Univ, Coll Comp & Informat Syst, Mecca, Saudi Arabia
[3] Concordia Univ, CIISE, Montreal, PQ, Canada
[4] Taif Univ, At Taif, Saudi Arabia
Keywords
Mixture models; Hierarchical Dirichlet processes; Generalized Dirichlet distribution; Variational Bayes; Image categorization; Web services; INTRUSION DETECTION; WEB SERVICES; REPRESENTATION; RECOGNITION; SELECTION;
DOI
10.1007/s00500-014-1557-5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data clustering is a fundamental unsupervised learning task in several domains such as data mining, computer vision, information retrieval, and pattern recognition. In this paper, we propose and analyze a new clustering approach based on both hierarchical Dirichlet processes and the generalized Dirichlet distribution, which leads to an interesting statistical framework for data analysis and modelling. Our approach can be viewed as a hierarchical extension of the infinite generalized Dirichlet mixture model previously proposed in Bouguila and Ziou (IEEE Trans Neural Netw 21(1): 107-122, 2010). The proposed clustering approach tackles the problem of modelling grouped data, where observations are organized into groups that we allow to remain statistically linked by sharing mixture components. The resulting clustering model is learned using a principled variational Bayes inference algorithm that we have developed. Extensive experiments and simulations, based on two challenging applications, namely image categorization and web service intrusion detection, demonstrate our model's usefulness and merits.
Pages: 979-990
Number of pages: 12