A Hybrid Feature Extraction Selection Approach for High-Dimensional Non-Gaussian Data Clustering

Cited by: 99
Authors
Boutemedjet, Sabri [1]
Bouguila, Nizar [2]
Ziou, Djemel [1]
Affiliations
[1] Univ Sherbrooke, Dept Informat, Sherbrooke, PQ J1K 2R1, Canada
[2] Concordia Univ, Concordia Inst Informat Engn CIISE, Montreal, PQ H3G 1T7, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Unsupervised learning; mixture models; feature selection; dimensionality reduction; generalized Dirichlet mixture; EM; MML; information theory; object image categorization; statistical pattern-recognition; Dirichlet mixture model; unsupervised selection
DOI
10.1109/TPAMI.2008.155
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents an unsupervised approach for feature selection and extraction in mixtures of generalized Dirichlet (GD) distributions. Our method defines a new mixture model that is able to extract independent and non-Gaussian features without loss of accuracy. The proposed model is learned using the Expectation-Maximization algorithm by minimizing the message length of the data set. Experimental results show the merits of the proposed methodology in the categorization of object images.
Pages: 1429-1443
Page count: 15