A Genetic Programming approach for feature selection in highly dimensional skewed data

Cited by: 59
Authors
Viegas, Felipe [2]
Rocha, Leonardo [1]
Goncalves, Marcos [2]
Mourao, Fernando [1]
Sa, Giovanni [1]
Salles, Thiago [2]
Andrade, Guilherme [2]
Sandin, Isac [1]
Affiliations
[1] Univ Fed Sao Joao del Rei, Dept Comp Sci, Sao Joao Del Rei, MG, Brazil
[2] Univ Fed Minas Gerais, Dept Comp Sci, Belo Horizonte, MG, Brazil
Keywords
Feature selection; Classification; Genetic Programming
DOI
10.1016/j.neucom.2017.08.050
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
High dimensionality, also known as the curse of dimensionality, is still a major challenge for automatic classification solutions. Accordingly, several feature selection (FS) strategies have been proposed for dimensionality reduction over the years. However, they potentially perform poorly in the face of unbalanced data. In this work, we propose a novel feature selection strategy based on Genetic Programming that is resilient to data skewness; in other words, it works well with both balanced and unbalanced data. The proposed strategy combines the most discriminative feature sets selected by distinct feature selection metrics in order to obtain a more effective and impartial set of the most discriminative features. It starts from the hypothesis that distinct feature selection metrics produce different (and potentially complementary) feature space projections. We evaluated our proposal on biological and textual datasets. Our experimental results show that the proposed solution not only increases the efficiency of the learning process, reducing the size of the data space by up to 83%, but also significantly increases its effectiveness in some scenarios. (C) 2017 Elsevier B.V. All rights reserved.
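As a rough illustration of the idea in the abstract, the sketch below shows how one candidate combination of feature sets chosen by different metrics could be represented and evaluated. It is not the authors' implementation: the metric names, the scores, the `top_k` cut-off, and the use of set operations as tree nodes are all assumptions made for this example, and the Genetic Programming search over such trees is omitted.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical per-feature scores from two feature selection metrics.
# Metric names and values are invented for illustration only.
SCORES: Dict[str, Dict[str, float]] = {
    "metric_a": {"f1": 0.9, "f2": 0.7, "f3": 0.2, "f4": 0.1},
    "metric_b": {"f1": 0.3, "f2": 0.8, "f3": 0.1, "f4": 0.6},
}

# A tree is either a metric name (leaf) or (operation, left, right).
Tree = Union[str, Tuple[str, "Tree", "Tree"]]


def top_k(scores: Dict[str, float], k: int) -> Set[str]:
    """Return the k highest-scoring features under one metric."""
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return set(ranked[:k])


def evaluate(tree: Tree, k: int) -> Set[str]:
    """Evaluate a combination tree into a single feature set."""
    if isinstance(tree, str):
        # Leaf: the feature set selected by one metric.
        return top_k(SCORES[tree], k)
    op, left, right = tree
    a, b = evaluate(left, k), evaluate(right, k)
    if op == "union":
        return a | b
    if op == "intersection":
        return a & b
    if op == "difference":
        return a - b
    raise ValueError(f"unknown operation: {op}")


# One candidate individual: features that either metric ranks in its top 2.
candidate: Tree = ("union", "metric_a", "metric_b")
selected = evaluate(candidate, k=2)
```

In a full evolutionary search, many such trees would be generated, scored by a fitness function (for example, classifier quality on held-out data), and recombined; that part is left out here.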
Pages: 554 - 569
Number of pages: 16
Related papers
50 in total
  • [21] A New Approach for Wrapper Feature Selection Using Genetic Algorithm for Big Data
    Bouaguel, Waad
    INTELLIGENT AND EVOLUTIONARY SYSTEMS, IES 2015, 2016, 5 : 75 - 83
  • [22] Genetic Programming Representations for Multi-dimensional Feature Learning in Biomedical Classification
    La Cava, William
    Silva, Sara
    Vanneschi, Leonardo
    Spector, Lee
    Moore, Jason
    APPLICATIONS OF EVOLUTIONARY COMPUTATION, EVOAPPLICATIONS 2017, PT I, 2017, 10199 : 158 - 173
  • [23] PhysicsGP: A genetic programming approach to event selection
    Cranmer, K
    Bowman, RS
    COMPUTER PHYSICS COMMUNICATIONS, 2005, 167 (03) : 165 - 176
  • [24] Evolutionary feature selection on high dimensional data using a search space reduction approach
    Garcia-Torres, Miguel
    Ruiz, Roberto
    Divina, Federico
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 117
  • [25] SLUG: Feature Selection Using Genetic Algorithms and Genetic Programming
    Rodrigues, Nuno M.
    Batista, Joao E.
    La Cava, William
    Vanneschi, Leonardo
    Silva, Sara
    GENETIC PROGRAMMING (EUROGP 2022), 2022, : 68 - 84
  • [26] Feature selection and classification of metabolomics data using artificial bee colony programming (ABCP)
    Ozturk, Celal
    Tarim, Mustafa
    Arslan, Sibel
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2020, 23 (02) : 101 - 118
  • [27] A Novel Genetic Algorithm Approach to Simultaneous Feature Selection and Instance Selection
    Albuquerque, Inti Mateus Resende
    Bach Hoai Nguyen
    Xue, Bing
    Zhang, Mengjie
    2020 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2020, : 616 - 623
  • [28] A filter feature selection for high-dimensional data
    Janane, Fatima Zahra
    Ouaderhman, Tayeb
    Chamlal, Hasna
    JOURNAL OF ALGORITHMS & COMPUTATIONAL TECHNOLOGY, 2023, 17
  • [29] Genetic Programming for Feature Selection and Feature Combination in Salient Object Detection
    Afzali, Shima
    Al-Sahaf, Harith
    Xue, Bing
    Hollitt, Christopher
    Zhang, Mengjie
    APPLICATIONS OF EVOLUTIONARY COMPUTATION, EVOAPPLICATIONS 2019, 2019, 11454 : 308 - 324
  • [30] Feature selection for speaker verification using genetic programming
    Loughran R.
    Agapitos A.
    Kattan A.
    Brabazon A.
    O’Neill M.
    Evolutionary Intelligence, 2017, 10 (1-2) : 1 - 21