Fast Genetic Algorithm for feature selection - A qualitative approximation approach

Cited by: 28
Authors
Altarabichi, Mohammed Ghaith [1]
Nowaczyk, Slawomir [1]
Pashami, Sepideh [1]
Mashhadi, Peyman Sheikholharam [1]
Affiliations
[1] Halmstad Univ, Ctr Appl Intelligent Syst Res, Halmstad, Sweden
Keywords
Feature selection; Evolutionary computation; Genetic Algorithm; Particle Swarm Intelligence; Fitness approximation; Meta-model; Optimization; Evolutionary algorithms; Instance selection; Convergence
DOI
10.1016/j.eswa.2022.118528
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Evolutionary Algorithms (EAs) are often challenging to apply in real-world settings since evolutionary computations involve a large number of evaluations of a typically expensive fitness function. For example, an evaluation could involve training a new machine learning model. An approximation (also known as a meta-model or a surrogate) of the true function can be used in such applications to alleviate the computation cost. In this paper, we propose a two-stage surrogate-assisted evolutionary approach to address the computational issues arising from using a Genetic Algorithm (GA) for feature selection in a wrapper setting for large datasets. We define "Approximation Usefulness" to capture the necessary conditions to ensure correctness of the EA computations when an approximation is used. Based on this definition, we propose a procedure to construct a lightweight qualitative meta-model by the active selection of data instances. We then use this meta-model to carry out the feature selection task. We apply this procedure to the GA-based algorithm CHC (Cross-generational elitist selection, Heterogeneous recombination and Cataclysmic mutation) to create a Qualitative approXimations variant, CHCQX. We show that CHCQX converges faster to feature subset solutions of significantly higher accuracy (as compared to CHC), particularly for large datasets with over 100K instances. We also demonstrate that the thinking behind our approach applies more broadly to Swarm Intelligence (SI), another branch of the Evolutionary Computation (EC) paradigm, with results for PSOQX, a qualitative approximation adaptation of the Particle Swarm Optimization (PSO) method. A GitHub repository with the complete implementation is available.
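The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' CHCQX implementation, and every name in it is hypothetical. It makes several simplifying assumptions: the data are synthetic, the wrapper fitness is the hold-out accuracy of a nearest-centroid classifier, the meta-model is built from a uniformly random subsample of instances (standing in for the paper's active instance selection), and a plain elitist GA stands in for CHC.

```python
import random

def make_data(n, n_informative=3, n_noise=7, seed=0):
    """Synthetic binary task (assumption): only the first features carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        shift = 1.0 if label else -1.0
        row = [rng.gauss(shift, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

def fitness(mask, X, y):
    """Wrapper fitness: hold-out accuracy of a nearest-centroid classifier
    trained on the first half of the rows, using only the selected features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    half = len(X) // 2
    centroids = {}
    for c in (0, 1):
        rows = [X[i] for i in range(half) if y[i] == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for i in range(half, len(X)):
        dist = {c: sum((X[i][j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        pred = 0 if dist[0] <= dist[1] else 1
        correct += pred == y[i]
    return correct / (len(X) - half)

def run_ga(fit, n_features, pop_size=20, generations=15, seed=1):
    """Plain elitist GA over bit masks (a stand-in for CHC, not CHC itself)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        elite = sorted(pop, key=fit, reverse=True)[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            k = rng.randrange(n_features)
            child[k] = 1 - child[k]                           # single-bit mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=fit)

# Stage 1: build a cheap meta-model from a small instance subset.
# Assumption: uniform random sampling replaces active instance selection.
X, y = make_data(2000)
idx = random.Random(42).sample(range(len(X)), 200)
Xs, ys = [X[i] for i in idx], [y[i] for i in idx]

def surrogate(mask):
    return fitness(mask, Xs, ys)

# Stage 2: run the evolutionary search against the meta-model only,
# then score the selected feature subset once on the full data.
best = run_ga(surrogate, n_features=len(X[0]))
full_acc = fitness(best, X, y)
print("selected mask:", best, "full-data accuracy:", round(full_acc, 3))
```

In this sketch the expensive full-data fitness is evaluated only once, at the end. Every evaluation during the search touches 200 instances instead of 2000, which is where the saving would come from if the surrogate ranks candidate subsets in roughly the same order as the true fitness.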
Pages: 13