A new feature subset selection using bottom-up clustering

Cited by: 0
Authors
Zeinab Dehghan
Eghbal G. Mansoori
Affiliation
[1] School of Electrical and Computer Engineering, Shiraz University
Source
Pattern Analysis and Applications | 2018 / Vol. 21
Keywords
Dimensionality reduction; Feature selection; Hierarchical clustering; Feature clustering
DOI
Not available
Abstract
Feature subset selection, dimensionality reduction, or both are essential preprocessing steps before any data mining task, especially when the problem space has too many features. This paper proposes a clustering-based feature subset selection (CFSS) algorithm for identifying the more relevant features. At each level of agglomeration, it uses a similarity measure among features to merge the two most similar clusters of features. By gathering similar features into clusters and then selecting representative features from each cluster, it tries to remove redundant features. A criterion based on mutual information is proposed to identify the representative features. Because CFSS selects the representatives in a filter manner, it is noticeably fast. As an advantage of hierarchical clustering, the number of clusters does not need to be determined in advance. In CFSS, the clustering process is repeated until all features are assigned to clusters. However, to distribute the features across a reasonable number of clusters, a recently proposed approach is used to find a suitable level at which to cut the clustering tree. To assess the performance of CFSS, we applied it to several UCI datasets and compared it with some popular feature selection methods. The experimental results demonstrate the efficiency and speed of the proposed method.
Pages: 57-66
Number of pages: 9
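The pipeline described in the abstract (bottom-up merging of feature clusters, then choosing one representative per cluster by mutual information) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the similarity measure, the linkage rule, or the method for cutting the clustering tree, so the sketch assumes absolute Pearson correlation, average linkage, and a fixed `n_clusters` as stand-ins. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 if either sequence is constant.

    Assumption: the abstract does not specify the feature similarity measure.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return abs(cov) / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def cluster_features(columns, n_clusters):
    """Bottom-up clustering: repeatedly merge the two most similar clusters."""
    clusters = [[i] for i in range(len(columns))]

    def similarity(a, b):
        # Average linkage over all cross-cluster feature pairs (an assumption).
        total = sum(abs_correlation(columns[i], columns[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    # Assumption: stop at a fixed cluster count instead of the tree-cutting
    # approach mentioned in the abstract, which is not described there.
    while len(clusters) > n_clusters:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: similarity(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return clusters


def select_representatives(columns, labels, n_clusters):
    """Keep, from each cluster, the feature sharing most information with the labels."""
    clusters = cluster_features(columns, n_clusters)
    return sorted(
        max(c, key=lambda i: mutual_information(columns[i], labels))
        for c in clusters
    )


# Toy data (hypothetical): features 0 and 1 are near-duplicates, as are 2 and 3.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 1, 1],
]
print(cluster_features(columns, 2))
print(select_representatives(columns, labels, 2))
```

On this toy data the two near-duplicate pairs end up in separate clusters, and one feature per cluster is retained. Because the representatives are scored directly against the labels without training a classifier, the selection step works as a filter, which matches the speed argument made in the abstract.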