Fast Genetic Algorithm for feature selection - A qualitative approximation approach

Cited by: 31
Authors
Altarabichi, Mohammed Ghaith [1]
Nowaczyk, Slawomir [1]
Pashami, Sepideh [1]
Mashhadi, Peyman Sheikholharam [1]
Affiliations
[1] Halmstad Univ, Ctr Appl Intelligent Syst Res, Halmstad, Sweden
Keywords
Feature selection; Evolutionary computation; Genetic Algorithm; Particle Swarm Intelligence; Fitness approximation; Meta-model; Optimization; EVOLUTIONARY ALGORITHMS; INSTANCE SELECTION; CONVERGENCE
DOI
10.1016/j.eswa.2022.118528
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Evolutionary Algorithms (EAs) are often challenging to apply in real-world settings since evolutionary computations involve a large number of evaluations of a typically expensive fitness function. For example, an evaluation could involve training a new machine learning model. An approximation (also known as a meta-model or a surrogate) of the true function can be used in such applications to alleviate the computation cost. In this paper, we propose a two-stage surrogate-assisted evolutionary approach to address the computational issues arising from using Genetic Algorithm (GA) for feature selection in a wrapper setting for large datasets. We define "Approximation Usefulness" to capture the necessary conditions to ensure correctness of the EA computations when an approximation is used. Based on this definition, we propose a procedure to construct a lightweight qualitative meta-model by the active selection of data instances. We then use a meta-model to carry out the feature selection task. We apply this procedure to the GA-based algorithm CHC (Cross generational elitist selection, Heterogeneous recombination and Cataclysmic mutation) to create a Qualitative approXimations variant, CHCQX. We show that CHCQX converges faster to feature subset solutions of significantly higher accuracy (as compared to CHC), particularly for large datasets with over 100K instances. We also demonstrate the applicability of the thinking behind our approach more broadly to Swarm Intelligence (SI), another branch of the Evolutionary Computation (EC) paradigm, with results of PSOQX, a qualitative approximation adaptation of the Particle Swarm Optimization (PSO) method. A GitHub repository with the complete implementation is available.
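The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' CHCQX implementation: the synthetic data, the nearest-centroid classifier, the plain elitist GA, and every function name (`make_data`, `fitness`, `run_ga`) are assumptions made for illustration, and the surrogate here is a uniformly random subsample of instances rather than the active instance selection the abstract describes.

```python
import random

def make_data(n, n_informative=3, n_noise=7, seed=0):
    """Synthetic binary data: informative features shift with the label, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

def fitness(mask, X, y):
    """Wrapper fitness: hold-out accuracy of a nearest-centroid classifier
    restricted to the features selected by the 0/1 mask."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    half = len(X) // 2
    centroids = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X[:half], y[:half]) if t == c]
        if not rows:
            return 0.0
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X[half:], y[half:]):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        correct += min(dist, key=dist.get) == t
    return correct / (len(X) - half)

def run_ga(X, y, pop_size=20, generations=15, seed=1):
    """Plain elitist GA over feature masks (a stand-in, not CHC)."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, X, y), reverse=True)
        parents = ranked[: pop_size // 2]            # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            i = rng.randrange(n_feat)
            child[i] = 1 - child[i]                  # single bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, X, y))

# Stage 1: build a cheap surrogate by shrinking the data (random subsample here).
X, y = make_data(2000)
idx = random.Random(42).sample(range(len(X)), 200)
X_small, y_small = [X[i] for i in idx], [y[i] for i in idx]

# Stage 2: run the search against the surrogate, then score the winner on all data.
best = run_ga(X_small, y_small)
full_score = fitness(best, X, y)
print(best, round(full_score, 3))
```

Every fitness call inside the search touches 200 instances instead of 2000, so each evaluation costs roughly a tenth of the full one. The surrogate only needs to rank candidate feature subsets well enough for selection to pick the same winners, which is the kind of condition the abstract's "Approximation Usefulness" is meant to capture.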
Pages: 13