A new feature subset selection using bottom-up clustering

Authors
Zeinab Dehghan
Eghbal G. Mansoori
Affiliation
[1] Shiraz University, School of Electrical and Computer Engineering
Source
Pattern Analysis and Applications | 2018 / Vol. 21
Keywords
Dimensionality reduction; Feature selection; Hierarchical clustering; Feature clustering
Abstract
Feature subset selection and/or dimensionality reduction is an essential preprocessing step before any data mining task, especially when the problem space contains too many features. In this paper, a clustering-based feature subset selection (CFSS) algorithm is proposed for identifying the more relevant features. At each level of agglomeration, it uses a similarity measure among features to merge the two most similar clusters of features. By gathering similar features into clusters and then selecting representative features from each cluster, it aims to remove redundant features. To identify the representative features, a criterion based on mutual information is proposed. Because CFSS selects the representatives in a filter manner, it is noticeably fast. As an advantage of hierarchical clustering, the number of clusters does not need to be determined in advance. In CFSS, the clustering process is repeated until all features have been assigned to clusters. However, to distribute the features over a reasonable number of clusters, a recently proposed approach is used to find a suitable level at which to cut the clustering tree. To assess the performance of CFSS, we applied it to several UCI datasets and compared it with some popular feature selection methods. The experimental results show that the proposed method is both effective and fast.
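The pipeline the abstract describes (merge the most similar feature clusters bottom-up, cut the tree, then keep one representative per cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the linkage, the tree-cutting rule, or the exact mutual-information criterion. The sketch below assumes absolute Pearson correlation as the similarity, average linkage, a fixed hypothetical `n_clusters` parameter in place of the tree-cutting approach, and "highest mutual information with the class label" as the representative criterion. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def abs_correlation(a, b):
    """Assumed similarity: absolute Pearson correlation (0.0 for constant columns)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return abs(cov) / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def cluster_features(columns, n_clusters):
    """Bottom-up merging of feature indices until n_clusters clusters remain."""
    sim = {
        (i, j): abs_correlation(columns[i], columns[j])
        for i, j in combinations(range(len(columns)), 2)
    }
    clusters = [[i] for i in range(len(columns))]

    def linkage(c1, c2):
        # Average linkage (an assumption): mean pairwise similarity.
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(sim[p] for p in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Find and merge the two most similar clusters.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def select_representatives(columns, labels, n_clusters):
    """Keep, per cluster, the feature with the highest MI with the labels."""
    clusters = cluster_features(columns, n_clusters)
    return sorted(
        max(c, key=lambda i: mutual_information(columns[i], labels))
        for c in clusters
    )


# Toy data: features 0 and 1 are near-duplicates, as are features 2 and 3.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 1, 1],
]
print(select_representatives(columns, labels, n_clusters=2))  # prints [0, 3]
```

On this toy data the two redundant pairs end up in separate clusters, and one feature from each pair survives. A real application would need discretized features for the mutual-information estimate and a principled rule for choosing the cut level, which the sketch sidesteps with the fixed `n_clusters`.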
Pages: 57-66
Number of pages: 9