A Clustering-Based Approach to Reduce Feature Redundancy

Cited by: 1
Authors
de Amorim, Renato Cordeiro [1]
Mirkin, Boris [2]
Affiliations
[1] Univ Hertfordshire, Sch Comp Sci, Coll Lane Campus, Hatfield AL10 9AB, Herts, England
[2] Birkbeck Univ London, Dept Comp Sci & Informat Syst, Malet St, London WC1E 7HX, England
Source
KNOWLEDGE, INFORMATION AND CREATIVITY SUPPORT SYSTEMS: RECENT TRENDS, ADVANCES AND SOLUTIONS, KICSS 2013 | 2016 / Vol. 364
Keywords
Unsupervised feature selection; Feature weighting; Redundant features; Clustering; Mental task separation; FEATURE-SELECTION; VARIABLES;
DOI
10.1007/978-3-319-19090-7_35
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Research effort has recently focused on designing feature weighting clustering algorithms. These algorithms automatically calculate a weight for each feature in a data set, representing its degree of relevance. However, since most of them evaluate one feature at a time, they may have difficulty clustering data sets that contain features carrying similar information. If a group of features contains the same relevant information, these algorithms assign a high weight to every feature in the group instead of removing some of them as redundant. This paper introduces an unsupervised feature selection method that can be used in the data pre-processing step to reduce the number of redundant features in a data set. The method clusters similar features together and then selects a subset of representative features for each cluster. This selection is based on the maximum information compression index between each feature and its respective cluster centroid. We present an empirical validation of our method by comparing it with a popular unsupervised feature selection method on three EEG data sets. We find that our method selects features that produce better cluster recovery, without the need for an extra user-defined parameter.
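The pipeline described in the abstract (cluster the features, then keep a representative per cluster according to the maximum information compression index) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the function names `mici`, `cluster_features` and `select_representatives` are hypothetical; plain K-Means with a farthest-first initialisation stands in for whatever clustering the paper uses; the number of clusters `k` is passed in by hand, whereas the abstract states that the method needs no extra user-defined parameter; and exactly one feature is kept per cluster. The index is computed as the smallest eigenvalue of the 2x2 covariance matrix of two variables, which is zero when they are linearly dependent.

```python
import numpy as np


def mici(x, y):
    """Maximum information compression index of two variables:
    the smallest eigenvalue of their 2x2 covariance matrix."""
    return float(np.linalg.eigvalsh(np.cov(x, y))[0])


def cluster_features(F, k, n_iter=100):
    """Plain K-Means over the rows of F (one row per feature).

    Assumption: farthest-first initialisation, chosen only to keep the
    sketch deterministic; it is not taken from the paper.
    """
    centroids = [F[0]]
    for _ in range(1, k):
        # Squared distance of every feature to its nearest chosen centroid.
        d = np.min([((F - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(F[int(d.argmax())])
    centroids = np.array(centroids)

    for _ in range(n_iter):
        dist = ((F[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        updated = np.array([
            F[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids


def select_representatives(X, k):
    """Return sorted column indices of X, one per non-empty feature cluster.

    X has shape (n_samples, n_features); columns are assumed non-constant
    so that standardisation does not divide by zero.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    F = Z.T  # cluster features, not samples
    labels, centroids = cluster_features(F, k)

    selected = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue
        # Assumption: keep the member with the lowest index value
        # relative to its cluster centroid.
        scores = [mici(F[i], centroids[j]) for i in members]
        selected.append(int(members[int(np.argmin(scores))]))
    return sorted(selected)
```

Calling `select_representatives(X, k=2)` on a matrix whose columns are two noisy copies of one signal and two noisy copies of another would be expected to keep one column from each pair; how many representatives to keep per cluster, and how the number of clusters is determined, are details the abstract leaves to the full paper.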
Pages: 465-475
Number of pages: 11
Related papers
50 in total
  • [1] Clustering-based feature subset selection with analysis on the redundancy-complementarity dimension
    Chen, Zhijun
    Chen, Qiushi
    Zhang, Yishi
    Zhou, Lei
    Jiang, Junfeng
    Wu, Chaozhong
    Huang, Zhen
    COMPUTER COMMUNICATIONS, 2021, 168 : 65 - 74
  • [2] CWC: A clustering-based feature weighting approach for text classification
    Zhu, Lin
    Guan, Jihong
    Zhou, Shuigeng
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2007, 4617 : 204 - +
  • [3] A clustering-based feature selection via feature separability
    Jiang, Shengyi
    Wang, Lianxi
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2016, 31 (02) : 927 - 937
  • [4] Clustering-based hybrid feature selection approach for high dimensional microarray data
    Babu, Samson Anosh P.
    Annavarapu, Chandra Sekhara Rao
    Dara, Suresh
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2021, 213
  • [5] Clustering-based Sequential Feature Selection Approach for High Dimensional Data Classification
    Alimoussa, M.
    Porebski, A.
    Vandenbroucke, N.
    Thami, R. Oulad Haj
    El Fkihi, S.
    VISAPP: PROCEEDINGS OF THE 16TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL. 4: VISAPP, 2021, : 122 - 132
  • [6] Clustering-Based Feature Selection for Content Based Remote Sensing Image Retrieval
    Li, Shijin
    Zhu, Jiali
    Feng, Jun
    Wan, Dingsheng
    IMAGE ANALYSIS AND RECOGNITION, PT I, 2012, 7324 : 427 - 435
  • [7] A clustering-based feature selection method for automatically generated relational attributes
    Rezaei, Mostafa
    Cribben, Ivor
    Samorani, Michele
    ANNALS OF OPERATIONS RESEARCH, 2021, 303 (1-2) : 233 - 263
  • [8] CBFS: A Clustering-Based Feature Selection Mechanism for Network Anomaly Detection
    Mao, Jiewen
    Hu, Yongquan
    Jiang, Dong
    Wei, Tongquan
    Shen, Fuke
    IEEE ACCESS, 2020, 8 : 116216 - 116225
  • [9] A clustering-based feature selection method for automatically generated relational attributes
    Mostafa Rezaei
    Ivor Cribben
    Michele Samorani
    Annals of Operations Research, 2021, 303 : 233 - 263
  • [10] ICN clustering-based approach for VANETs
    Fourati, Lamia Chaari
    Ayed, Samiha
    Ben Rejeb, Mohamed Ali
    ANNALS OF TELECOMMUNICATIONS, 2021, 76 (9-10) : 745 - 757