A Clustering-Based Approach to Reduce Feature Redundancy

Cited by: 1
Authors
de Amorim, Renato Cordeiro [1]
Mirkin, Boris [2]
Affiliations
[1] Univ Hertfordshire, Sch Comp Sci, Coll Lane Campus, Hatfield AL10 9AB, Herts, England
[2] Birkbeck Univ London, Dept Comp Sci & Informat Syst, Malet St, London WC1E 7HX, England
Source
KNOWLEDGE, INFORMATION AND CREATIVITY SUPPORT SYSTEMS: RECENT TRENDS, ADVANCES AND SOLUTIONS, KICSS 2013 | 2016, Vol. 364
Keywords
Unsupervised feature selection; Feature weighting; Redundant features; Clustering; Mental task separation; FEATURE-SELECTION; VARIABLES;
DOI
10.1007/978-3-319-19090-7_35
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Research effort has recently focused on designing feature weighting clustering algorithms. These algorithms automatically calculate the weight of each feature in a data set, representing its degree of relevance. However, since most of them evaluate one feature at a time, they may have difficulty clustering data sets that contain features carrying similar information. If a group of features contains the same relevant information, these clustering algorithms assign a high weight to each feature in the group instead of removing some of them as redundant. This paper introduces an unsupervised feature selection method that can be used in the data pre-processing step to reduce the number of redundant features in a data set. This method clusters similar features together and then selects a subset of representative features for each cluster. This selection is based on the maximum information compression index between each feature and its respective cluster centroid. We present an empirical validation of our method by comparing it with a popular unsupervised feature selection method on three EEG data sets. We find that our method selects features that produce better cluster recovery, without the need for an extra user-defined parameter.
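The pipeline the abstract describes (cluster the features, then keep the feature closest to each cluster centroid under the maximum information compression index) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which clustering algorithm is used, so plain k-means with a caller-supplied number of clusters `k` is assumed here (the paper states that its method needs no extra user-defined parameter), and only one representative per cluster is kept. The function names `mici`, `kmeans` and `select_features` are hypothetical. The index is computed as the smallest eigenvalue of the 2x2 covariance matrix of two variables, which is zero when they are perfectly linearly dependent.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def mici(x, y):
    """Maximum information compression index of two equal-length sequences:
    the smallest eigenvalue of their 2x2 covariance matrix."""
    mx, my = mean(x), mean(y)
    n = len(x)
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Closed form for the smaller eigenvalue of [[vx, cxy], [cxy, vy]].
    disc = (vx - vy) ** 2 + 4.0 * cxy ** 2
    return 0.5 * (vx + vy - math.sqrt(disc))


def standardize(column):
    # Assumes the feature is not constant (non-zero standard deviation).
    m = mean(column)
    sd = math.sqrt(sum((v - m) ** 2 for v in column) / len(column))
    return [(v - m) / sd for v in column]


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain k-means; an assumed stand-in for the paper's clustering step."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous centroid.
            updated.append([mean(col) for col in zip(*members)]
                           if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def select_features(rows, k, seed=0):
    """rows: list of samples, each a list of feature values.
    Returns the sorted indices of one representative feature per cluster."""
    features = [standardize(list(col)) for col in zip(*rows)]
    labels, centroids = kmeans(features, k, seed=seed)
    selected = []
    for j in range(k):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            # Keep the member with the lowest index value against the centroid.
            selected.append(min(members,
                                key=lambda i: mici(features[i], centroids[j])))
    return sorted(selected)
```

Calling `select_features(rows, k)` on a data set whose columns fall into `k` groups of near-duplicates would be expected to return one column index per group. Clustering on standardized features with Euclidean distance treats a feature and its negation as far apart, even though their index value is zero, so a different feature dissimilarity may be preferable in practice.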
Pages: 465-475
Number of pages: 11
Related papers
50 items in total
  • [41] Improving Clustering-Based Forecasting of Aggregated Distribution Transformer Loadings With Gradient Boosting and Feature Selection
    Rouwhorst, George
    Duque, Edgar Mauricio Salazar
    Nguyen, Phuong H.
    Slootweg, Han
    IEEE ACCESS, 2022, 10 : 443 - 455
  • [42] A novel feature selection approach based on clustering algorithm
    Moslehi, Fateme
    Haeri, Abdorrahman
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2021, 91 (03) : 581 - 604
  • [43] A Clustering-Based Approach to Identify Joint Impedance During Walking
    Arami, Arash
    van Asseldonk, Edwin
    van der Kooij, Herman
    Burdet, Etienne
    IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2020, 28 (08) : 1808 - 1816
  • [44] A Clustering-based QoS Prediction Approach for Web Service Selection
    Zhang, Xuejie
    Wang, Zhijian
    Lv, Xin
    Qi, Rongzhi
    PROCEEDINGS OF 2013 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CLOUD COMPUTING COMPANION (ISCC-C), 2014, : 201 - 206
  • [45] A Novel Clustering-Based Season Factor Approach for Broiler Breeding
    Huang, Peijie
    Lin, Piyuan
    Yan, Shangwei
    Xiao, Meiyan
    2009 3RD INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICAL ENGINEERING, VOLS 1-11, 2009, : 2811 - 2814
  • [46] On a Clustering-Based Approach for Traffic Sub-area Division
    Zhu, Jiahui
    Niu, Xinzheng
    Wu, Chase Q.
    ADVANCES AND TRENDS IN ARTIFICIAL INTELLIGENCE: FROM THEORY TO PRACTICE, 2019, 11606 : 516 - 529
  • [47] A clustering-based approach for efficient identification of microRNA combinatorial biomarkers
    Yang Yang
    Ning Huang
    Luning Hao
    Wei Kong
    BMC Genomics, 18
  • [48] A survey of load balancing and implementation of clustering-based approach for clouds
    Sharma A.
    Pandey R.
    Singh S.P.
    Kumar R.
    Recent Advances in Computer Science and Communications, 2021, 14 (03) : 669 - 677
  • [49] Clustering-Based Hybrid Approach for Multivariate Missing Data Imputation
    Dubey, Aditya
    Rasool, Akhtar
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (11) : 710 - 714
  • [50] An Effective Clustering-based Approach for Conceptual Association Rules Mining
    Quan, Tho T.
    Ngo, Linh N.
    Hui, Siu Cheung
    2009 IEEE-RIVF INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION TECHNOLOGIES: RESEARCH, INNOVATION AND VISION FOR THE FUTURE, 2009, : 257 - +