A Clustering-Based Approach to Reduce Feature Redundancy

Cited by: 1
Authors
de Amorim, Renato Cordeiro [1]
Mirkin, Boris [2]
Affiliations
[1] Univ Hertfordshire, Sch Comp Sci, Coll Lane Campus, Hatfield AL10 9AB, Herts, England
[2] Birkbeck Univ London, Dept Comp Sci & Informat Syst, Malet St, London WC1E 7HX, England
Source
KNOWLEDGE, INFORMATION AND CREATIVITY SUPPORT SYSTEMS: RECENT TRENDS, ADVANCES AND SOLUTIONS, KICSS 2013 | 2016 / Vol. 364
Keywords
Unsupervised feature selection; Feature weighting; Redundant features; Clustering; Mental task separation; FEATURE-SELECTION; VARIABLES;
DOI
10.1007/978-3-319-19090-7_35
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Research effort has recently focused on designing feature weighting clustering algorithms. These algorithms automatically calculate the weight of each feature in a data set, representing its degree of relevance. However, since most of them evaluate one feature at a time, they may have difficulty clustering data sets that contain features carrying similar information. If a group of features contains the same relevant information, these clustering algorithms assign a high weight to each feature in the group instead of removing some of them as redundant. This paper introduces an unsupervised feature selection method that can be used in the data pre-processing step to reduce the number of redundant features in a data set. The method clusters similar features together and then selects a subset of representative features for each cluster. The selection is based on the maximum information compression index between each feature and its respective cluster centroid. We present an empirical validation of our method by comparing it with a popular unsupervised feature selection method on three EEG data sets. We find that our method selects features that produce better cluster recovery, without the need for an extra user-defined parameter.
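The pipeline outlined in the abstract (cluster the features, then keep representatives chosen by the maximum information compression index against each cluster centroid) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not name the clustering algorithm or the exact selection rule, so this sketch swaps in plain k-means with a farthest-first initialisation, takes a hypothetical user-supplied `k` (which the published method reportedly does not need), and keeps one feature per cluster. The function names `mici`, `kmeans` and `select_representative_features` are likewise hypothetical. The index itself is computed as the smallest eigenvalue of the 2x2 covariance matrix of the two variables, which is zero when they are perfectly linearly related.

```python
import numpy as np


def mici(x, y):
    """Maximum information compression index of two 1-D variables:
    the smallest eigenvalue of their 2x2 covariance matrix."""
    cov = np.cov(x, y)
    # eigvalsh returns eigenvalues in ascending order; clamp round-off noise.
    return max(0.0, float(np.linalg.eigvalsh(cov)[0]))


def farthest_first_init(points, k):
    """Deterministic seeding: start at row 0, then repeatedly add the
    row farthest from all seeds chosen so far."""
    chosen = [0]
    dist = ((points - points[0]) ** 2).sum(axis=1)
    while len(chosen) < k:
        nxt = int(dist.argmax())
        chosen.append(nxt)
        dist = np.minimum(dist, ((points - points[nxt]) ** 2).sum(axis=1))
    return points[chosen].copy()


def assign(points, centroids):
    """Index of the nearest centroid (squared Euclidean) for each row."""
    d = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's k-means (an assumed stand-in for the paper's clustering)."""
    centroids = farthest_first_init(points, k)
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(points, centroids), centroids


def select_representative_features(X, k):
    """Cluster the columns of X and keep, per cluster, the feature whose
    MICI with the cluster centroid is lowest (assumed selection rule)."""
    std = X.std(axis=0)
    std[std == 0] = 1.0                  # avoid dividing by zero on constant columns
    Z = (X - X.mean(axis=0)) / std       # standardise each feature
    labels, centroids = kmeans(Z.T, k)   # each feature is one point to cluster
    selected = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue
        scores = [mici(Z[:, f], centroids[j]) for f in members]
        selected.append(int(members[int(np.argmin(scores))]))
    return sorted(selected)
```

Called as `select_representative_features(X, k=3)` on an `n_samples x n_features` array, the function returns the column indices of the retained features. Choosing the lowest index value per cluster reflects the reading that a low value means the feature and the centroid carry nearly the same information; other readings of the abstract are possible.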
Pages: 465 - 475
Page count: 11
Related papers
50 in total
  • [31] A clustering-based obstacle segmentation approach for urban environments
    Ridel, Daniela A.
    Shinzato, Patrick Y.
    Wolf, Denis F.
    2015 12TH LATIN AMERICAN ROBOTICS SYMPOSIUM AND 2015 3RD BRAZILIAN SYMPOSIUM ON ROBOTICS (LARS-SBR), 2015, : 265 - 270
  • [32] A mixed clustering-based approach for a territorial hydrological regionalization
    Oumaima Rami
    Moulay Driss Hasnaoui
    Driss Ouazar
    Ahmed Bouziane
    Arabian Journal of Geosciences, 2022, 15 (1)
  • [33] LQG Control of Large Networks: A Clustering-Based Approach
    Xue, Nan
    Chakrabortty, Aranya
    2017 AMERICAN CONTROL CONFERENCE (ACC), 2017, : 2333 - 2338
  • [34] A clustering-based feature selection framework for handwritten Indic script classification
    Chatterjee, Iman
    Ghosh, Manosij
    Singh, Pawan Kumar
    Sarkar, Ram
    Nasipuri, Mita
    EXPERT SYSTEMS, 2019, 36 (06)
  • [35] Fuzzy Clustering-based GMDH Model to Feature Selection in Customer Analysis
    Zhao, Hengjun
    He, Changzheng
    Ye, Zhen
    ISBIM: 2008 INTERNATIONAL SEMINAR ON BUSINESS AND INFORMATION MANAGEMENT, VOL 1, 2009, : 461 - 464
  • [36] Improving EEG Decoding via Clustering-Based Multitask Feature Learning
    Zhang, Yu
    Zhou, Tao
    Wu, Wei
    Xie, Hua
    Zhu, Hongru
    Zhou, Guoxu
    Cichocki, Andrzej
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (08) : 3587 - 3597
  • [37] A clustering-based hybrid approach for dual data reduction
    Ratnoo, Saroj
    Rathee, Seema
    Ahuja, Jyoti
    INTERNATIONAL JOURNAL OF INTELLIGENT ENGINEERING INFORMATICS, 2018, 6 (05) : 468 - 490
  • [38] A clustering-based Approach for Unsupervised Word Sense Disambiguation
    Martin-Wanton, Tamara
    Berlanga-Llavori, Rafael
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2012, (49) : 49 - 56
  • [39] Improving Clustering-Based Forecasting of Aggregated Distribution Transformer Loadings With Gradient Boosting and Feature Selection
    Rouwhorst, George
    Duque, Edgar Mauricio Salazar
    Nguyen, Phuong H.
    Slootweg, Han
    IEEE ACCESS, 2022, 10 : 443 - 455
  • [40] Fuzzy clustering-based feature extraction method for mental task classification
    Gupta A.
    Kumar D.
    Brain Informatics, 2017, 4 (2) : 135 - 145