Unsupervised clustering and feature weighting based on Generalized Dirichlet mixture modeling

Cited by: 13
Authors
Ben Ismail, Mohamed Maher [1]
Frigui, Hichem [2]
Affiliations
[1] King Saud Univ, Dept Comp Sci, Coll Comp & Informat Sci, Riyadh 11548, Saudi Arabia
[2] Univ Louisville, CECS Dept, Louisville, KY 40292 USA
Keywords
Unsupervised learning; Mixture model; Feature weighting; Generalized Dirichlet mixture; Possibilistic approach; Image collection categorization; FEATURE-SELECTION; FUZZY; ALGORITHM
DOI
10.1016/j.ins.2014.02.146
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We propose a possibilistic approach for Generalized Dirichlet mixture parameter estimation, data clustering, and feature weighting. The proposed algorithm, called Robust and Unsupervised Learning of Finite Generalized Dirichlet Mixture Models (RULe_GDM), exploits a property of the Generalized Dirichlet distributions that transforms the data to make the features independent and follow Beta distributions. Then, it learns optimal relevance weights for each feature within each cluster. This property makes RULe_GDM suitable for noisy and high-dimensional feature spaces. In addition, RULe_GDM associates two types of memberships with each data sample. The first one is the posterior probability and indicates how well a sample fits each estimated distribution. The second membership represents the degree of typicality and is used to identify and discard noise points and outliers. RULe_GDM minimizes a single objective function that combines learning the two membership functions, the distribution parameters, and the relevance weights for each feature within each distribution. We also extend our algorithm to find the optimal number of clusters in an unsupervised and efficient way by exploiting some properties of the possibilistic membership function. The performance of RULe_GDM is illustrated and compared to similar algorithms. We use synthetic data to illustrate its robustness to noisy and high-dimensional features. We also compare our approach to other relevant algorithms using several standard data sets. © 2014 Elsevier Inc. All rights reserved.
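The transformation property mentioned in the abstract can be illustrated with a small sketch. This is not the RULe_GDM algorithm itself; it only shows the standard fact that a Generalized Dirichlet vector x maps to independent Beta variables via w_1 = x_1 and w_l = x_l / (1 - x_1 - ... - x_{l-1}). The function names, the parameter values, and the use of NumPy are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def sample_generalized_dirichlet(alpha, beta, n, rng):
    """Draw n Generalized Dirichlet samples by stick-breaking (hypothetical helper).

    Returns the samples x and the independent Beta draws w that generated them.
    """
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    # One independent Beta(alpha_l, beta_l) draw per feature and sample.
    w = rng.beta(alpha, beta, size=(n, alpha.size))
    # Stick left before feature l: prod_{j<l} (1 - w_j), equal to 1 for the first feature.
    remaining = np.cumprod(1.0 - w, axis=1)
    remaining = np.hstack([np.ones((n, 1)), remaining[:, :-1]])
    return w * remaining, w

def to_independent_betas(x):
    """Map Generalized Dirichlet samples back to independent Beta features."""
    x = np.asarray(x, dtype=float)
    # Mass already used before feature l: sum_{j<l} x_j, equal to 0 for the first feature.
    used = np.cumsum(x, axis=1)
    used = np.hstack([np.zeros((x.shape[0], 1)), used[:, :-1]])
    return x / (1.0 - used)

# Illustrative parameters only (assumed, not taken from the paper).
rng = np.random.default_rng(0)
x, w = sample_generalized_dirichlet([2.0, 3.0, 4.0], [5.0, 4.0, 3.0], 1000, rng)
w_recovered = to_independent_betas(x)
```

In this sketch `w_recovered` matches the Beta draws `w` up to floating-point error, so each transformed feature can be modeled by its own one-dimensional Beta distribution. That independence is what makes per-feature relevance weights within each cluster meaningful.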
Pages: 35-54
Number of pages: 20
Related papers
50 in total
  • [1] Robust simultaneous positive data clustering and unsupervised feature selection using generalized inverted Dirichlet mixture models
    Al Mashrgy, Mohamed
    Bdiri, Taoufik
    Bouguila, Nizar
    KNOWLEDGE-BASED SYSTEMS, 2014, 59 : 182 - 195
  • [2] Unsupervised Variational Learning of Finite Generalized Inverted Dirichlet Mixture Models with Feature Selection and Component Splitting
    Maanicshah, Kamal
    Ali, Samr
    Fan, Wentao
    Bouguila, Nizar
    IMAGE ANALYSIS AND RECOGNITION (ICIAR 2019), PT II, 2019, 11663 : 94 - 105
  • [3] A Hierarchical Infinite Generalized Dirichlet Mixture Model with Feature Selection
    Fan, Wentao
    Sallay, Hassen
    Bouguila, Nizar
    Bourouis, Sami
    ADAPTIVE AND INTELLIGENT SYSTEMS, ICAIS 2014, 2014, 8779 : 1 - 10
  • [4] IMAGE DATABASE CATEGORIZATION USING ROBUST UNSUPERVISED LEARNING OF FINITE GENERALIZED DIRICHLET MIXTURE MODELS
    Ben Ismail, M. Maher
    Frigui, Hichem
    2011 18TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2011,
  • [5] Unsupervised Feature Weighting Based on Local Feature Relatedness
    Yun, Jiali
    Jing, Liping
    Yu, Jian
    Huang, Houkuan
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT I: 15TH PACIFIC-ASIA CONFERENCE, PAKDD 2011, 2011, 6634 : 38 - 49
  • [6] Simultaneous Bayesian clustering and feature selection using RJMCMC-based learning of finite generalized Dirichlet mixture models
    Elguebaly, Tarek
    Bouguila, Nizar
    SIGNAL PROCESSING, 2013, 93 (06) : 1531 - 1546
  • [7] Image and Video Segmentation by Combining Unsupervised Generalized Gaussian Mixture Modeling and Feature Selection
    Allili, Mohand Said
    Ziou, Djemel
    Bouguila, Nizar
    Boutemedjet, Sabri
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2010, 20 (10) : 1373 - 1377
  • [8] ENDOSCOPY VIDEO SUMMARIZATION BASED ON MULTI-MODAL DESCRIPTORS AND POSSIBILISTIC UNSUPERVISED LEARNING AND FEATURE SUBSET WEIGHTING
    Ben Ismail, Mohamed Maher
    Bchir, Ouiem
    Emam, Ahmed Z.
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2014, 20 (03) : 381 - 402
  • [9] Laplacian regularized generalized Dirichlet mixture distribution for data clustering
    Li, Baohua
    Hu, Lixia
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2020, 49 (01) : 16 - 28
  • [10] Variational learning of a Dirichlet process of generalized Dirichlet distributions for simultaneous clustering and feature selection
    Fan, Wentao
    Bouguila, Nizar
    PATTERN RECOGNITION, 2013, 46 (10) : 2754 - 2769