Unsupervised clustering and feature weighting based on Generalized Dirichlet mixture modeling

Cited by: 13
Authors
Ben Ismail, Mohamed Maher [1]
Frigui, Hichem [2]
Affiliations
[1] King Saud Univ, Dept Comp Sci, Coll Comp & Informat Sci, Riyadh 11548, Saudi Arabia
[2] Univ Louisville, CECS Dept, Louisville, KY 40292 USA
Keywords
Unsupervised learning; Mixture model; Feature weighting; Generalized Dirichlet mixture; Possibilistic approach; Image collection categorization; FEATURE-SELECTION; FUZZY; ALGORITHM
DOI
10.1016/j.ins.2014.02.146
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We propose a possibilistic approach for Generalized Dirichlet mixture parameter estimation, data clustering, and feature weighting. The proposed algorithm, called Robust and Unsupervised Learning of Finite Generalized Dirichlet Mixture Models (RULe_GDM), exploits a property of Generalized Dirichlet distributions: the data can be transformed so that the features become independent and follow Beta distributions. It then learns optimal relevance weights for each feature within each cluster. This property makes RULe_GDM suitable for noisy and high-dimensional feature spaces. In addition, RULe_GDM associates two types of membership with each data sample. The first is the posterior probability, which indicates how well a sample fits each estimated distribution. The second represents the degree of typicality and is used to identify and discard noise points and outliers. RULe_GDM minimizes a single objective function that combines learning the two membership functions, the distribution parameters, and the relevance weights for each feature within each distribution. We also extend the algorithm to find the optimal number of clusters in an unsupervised and efficient way by exploiting properties of the possibilistic membership function. The performance of RULe_GDM is illustrated and compared to that of similar algorithms. We use synthetic data to illustrate its robustness to noisy and high-dimensional features. We also compare our approach to other relevant algorithms on several standard data sets. (C) 2014 Elsevier Inc. All rights reserved.
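The abstract does not spell out the transformation it relies on. A minimal sketch follows, assuming it is the standard one for Generalized Dirichlet vectors: each coordinate is divided by the mass left over after the preceding coordinates, and the resulting fractions are mutually independent and Beta-distributed. The function names `gd_to_independent` and `independent_to_gd` are hypothetical and do not come from the paper.

```python
from typing import List


def gd_to_independent(x: List[float]) -> List[float]:
    """Map x (positive entries, sum < 1) to fractions w, where
    w[0] = x[0] and w[l] = x[l] / (1 - x[0] - ... - x[l-1]).

    Assumption: this is the transformation the abstract refers to.
    """
    w = []
    remaining = 1.0  # mass not yet consumed by earlier coordinates
    for value in x:
        if value <= 0.0 or value >= remaining:
            raise ValueError("entries must be positive and sum to less than 1")
        w.append(value / remaining)
        remaining -= value
    return w


def independent_to_gd(w: List[float]) -> List[float]:
    """Inverse map: rebuild x from the fractions w, one coordinate at a time."""
    x = []
    remaining = 1.0
    for fraction in w:
        x.append(fraction * remaining)
        remaining -= x[-1]
    return x


sample = [0.2, 0.3, 0.1]
transformed = gd_to_independent(sample)
recovered = independent_to_gd(transformed)
```

Once the features are independent, each one can be modeled and weighted on its own within a cluster, which is what makes per-feature relevance weights natural in this setting.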
Pages: 35-54
Page count: 20
Related papers
50 in total
  • [31] A new unsupervised feature selection algorithm using similarity-based feature clustering
    Zhu, Xiaoyan
    Wang, Yu
    Li, Yingbin
    Tan, Yonghui
    Wang, Guangtao
    Song, Qinbao
    COMPUTATIONAL INTELLIGENCE, 2019, 35 (01) : 2 - 22
  • [32] UNSUPERVISED FEATURE SELECTION BASED ON FEATURE RELEVANCE
    Zhang, Feng
    Zhao, Ya-Jun
    Chen, Jun-Fen
    PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, 2009, : 487 - +
  • [33] Bayesian inference by reversible jump MCMC for clustering based on finite generalized inverted Dirichlet mixtures
    Bourouis, Sami
    Al-Osaimi, Faisal R.
    Bouguila, Nizar
    Sallay, Hassen
    Aldosari, Fahd
    Al Mashrgy, Mohamed
    SOFT COMPUTING, 2019, 23 (14) : 5799 - 5813
  • [34] Non-negative consistency affinity graph learning for unsupervised feature selection and clustering
    Xu, Ziwei
    Jiang, Luxi
    Zhu, Xingyu
    Chen, Xiuhong
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 135
  • [35] Feature Selection Using Differential Evolution for Unsupervised Image Clustering
    Gutoski, Matheus
    Ribeiro, Manasses
    Romero Aquino, Nelson Marcelo
    Hattori, Leandro Takeshi
    Lazzaretti, Andre Eugenio
    Lopes, Heitor Silverio
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2018, PT I, 2018, 10841 : 376 - 385
  • [36] A clustering-based feature selection via feature separability
    Jiang, Shengyi
    Wang, Lianxi
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2016, 31 (02) : 927 - 937
  • [37] Unsupervised Bayesian feature selection based on k-means clustering
    Yan, Liu
    Yan, Peng
    IC-BNMT 2007: PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON BROADBAND NETWORK & MULTIMEDIA TECHNOLOGY, 2007, : 352 - 356
  • [38] Unsupervised Hybrid Feature Extraction Selection for High-Dimensional Non-Gaussian Data Clustering with Variational Inference
    Fan, Wentao
    Bouguila, Nizar
    Ziou, Djemel
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2013, 25 (07) : 1670 - 1685
  • [39] A tutorial on Dirichlet process mixture modeling
    Li, Yuelin
    Schofield, Elizabeth
    Gonen, Mithat
    JOURNAL OF MATHEMATICAL PSYCHOLOGY, 2019, 91 : 128 - 144
  • [40] Mutual information evaluation: A way to predict the performance of feature weighting on clustering
    Ji, Bo
    Ye, Yang-Dong
    Xiao, Yu
    INTELLIGENT DATA ANALYSIS, 2013, 17 (06) : 1001 - 1021