Gaussian mixture model with feature selection: An embedded approach

Cited by: 30
Authors
Fu, Yinlin [1 ]
Liu, Xiaonan [1 ]
Sarkar, Suryadipto [1 ]
Wu, Teresa [1 ]
Affiliation
[1] Arizona State Univ, Sch Comp Informat & Decis Syst Engn, 699 South Mill Ave, Tempe, AZ 85281 USA
Keywords
Gaussian Mixture Model (GMM); Expectation Maximization (EM); Feature selection; Variable selection
DOI
10.1016/j.cie.2020.107000
CLC number
TP39 [Computer Applications]
Subject classification codes
081203 ; 0835 ;
Abstract
Gaussian Mixture Model (GMM) is a popular clustering algorithm because of its neat statistical properties, which enable "soft" clustering and the determination of the number of clusters. Expectation-Maximization (EM) is usually applied to estimate the GMM parameters. While promising, including features that do not contribute to clustering may confuse the model and increase the computational cost. Recognizing this issue, in this paper we propose a new algorithm, termed Expectation Selection Maximization (ESM), which adds a feature selection step (S). Specifically, we introduce a relevancy index (RI), a metric indicating the probability of assigning a data point to a specific cluster. The RI reveals a feature's contribution to the clustering process and thus can assist feature selection. We conduct a theoretical analysis to justify the use of the RI for feature selection. To demonstrate the efficacy of the proposed ESM, two synthetic datasets, four benchmark datasets, and an Alzheimer's Disease dataset are studied.
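The abstract's premise — that EM-fitted GMM clustering degrades when irrelevant features are included, motivating a per-feature relevance score — can be illustrated with a minimal sketch. Note the assumptions: the paper's actual RI definition and the ESM update rules are not reproduced here; `feature_relevance` below is a hypothetical between-cluster/within-cluster variance ratio used only to show how a noise feature scores lower than an informative one under a diagonal-covariance EM fit.

```python
import numpy as np

def em_gmm_diag(X, k, iters=100):
    """Fit a diagonal-covariance GMM with plain EM (illustrative only)."""
    n, d = X.shape
    # Deterministic init: spread component means across per-feature quantiles.
    mu = np.quantile(X, np.linspace(0.25, 0.75, k), axis=0)
    var = np.tile(X.var(axis=0), (k, 1))      # per-component feature variances
    pi = np.full(k, 1.0 / k)                  # mixing weights
    for _ in range(iters):
        # E-step: log responsibilities under independent (diagonal) Gaussians.
        log_p = (-0.5 * (((X[:, None, :] - mu) ** 2) / var
                         + np.log(2 * np.pi * var)).sum(axis=2)
                 + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)   # stabilize before exp
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        var = (r.T @ X**2) / nk[:, None] - mu**2 + 1e-6
    return pi, mu, var

def feature_relevance(mu, var, pi):
    """Between-cluster vs. within-cluster spread per feature.
    An F-ratio-style proxy for feature usefulness, NOT the paper's RI."""
    grand = (pi[:, None] * mu).sum(axis=0)
    between = (pi[:, None] * (mu - grand) ** 2).sum(axis=0)
    within = (pi[:, None] * var).sum(axis=0)
    return between / within

rng = np.random.default_rng(1)
# Feature 0 separates two clusters; feature 1 is pure noise.
a = np.column_stack([rng.normal(-3, 1, 200), rng.normal(0, 1, 200)])
b = np.column_stack([rng.normal(+3, 1, 200), rng.normal(0, 1, 200)])
X = np.vstack([a, b])
pi, mu, var = em_gmm_diag(X, k=2)
rel = feature_relevance(mu, var, pi)
print(rel)  # relevance of feature 0 should dominate feature 1
```

Ranking features by such a score and re-running EM on the retained subset mirrors, at a very coarse level, the alternating estimate-select-update structure that the ESM acronym suggests.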
Pages: 9