Feature selection in high dimensional data: A specific preordonnances-based memetic algorithm

Cited by: 8
Authors
Chamlal, Hasna [1]
Ouaderhman, Tayeb [1]
El Mourtji, Basma [1]
Affiliations
[1] Hassan II Univ Casablanca, Fac Sci Ain Chock, Dept Math & Comp Sci, Fundamental & Appl Math Lab, Casablanca, Morocco
Keywords
Mixed-type data; Concordance measures; Memetic algorithms; Feature selection; Classification; Machine learning; PARTICLE SWARM OPTIMIZATION; HYBRID FEATURE-SELECTION; FEATURE SUBSET-SELECTION; SUPPORT VECTOR MACHINE; GENE SELECTION; MARKOV BLANKET; SEARCH; CLASSIFICATION; INFORMATION
DOI
10.1016/j.knosys.2023.110420
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In supervised learning, feature selection has been widely investigated because often only a few features carry valuable information. This study introduces an algorithm for heterogeneous variable selection in the discrimination problem. The proposed algorithm, Specific Memetic Algorithm Preordonnances-based (SMAP), uses association techniques based on preordonnances theory and is a hybrid filter-wrapper algorithm that exploits the benefits of both approaches. The filter phase measures the relevance of each feature by its agreement with the target variable and discards the features that disagree, which reduces the search space. The wrapper phase measures the usefulness of feature subsets and identifies the best one using a memetic algorithm based on preordonnances theory. The association is quantified by a coefficient measuring the concordance between two variables, even if one is numeric and the other is categorical (mixed). We propose a generalization of this coefficient that measures the concordance between several (more than two) heterogeneous variables. A new feature discrimination power measure combining the two coefficients is also introduced to intensify and diversify the search while keeping computation time to a minimum. SMAP is empirically analyzed by comparing its performance with that of recent state-of-the-art approaches on seven datasets and on simulated ones, using three different classifiers. The experimental results show the superiority of SMAP over the comparative methods. © 2023 Elsevier B.V. All rights reserved.
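As a rough illustration of the filter-then-wrapper structure described in the abstract, the following Python sketch may help. It is not the authors' SMAP implementation. The preordonnances-based coefficients are not defined in this record, so a simple pairwise rank-agreement score for a binary target stands in for them, and a nearest-centroid classifier with a small size penalty stands in for the wrapper's evaluation. All function names, the threshold, and the parameters are hypothetical.

```python
import random

def rank_agreement(x, y):
    """Stand-in relevance score (NOT the paper's coefficient): |concordant - discordant|
    over all pairs whose class labels differ, for a binary target y in {0, 1}."""
    conc = disc = total = 0
    for i in range(len(x)):
        for j in range(len(x)):
            if y[i] < y[j]:
                total += 1
                conc += x[i] < x[j]
                disc += x[i] > x[j]
    return abs(conc - disc) / total if total else 0.0

def filter_phase(X, y, threshold):
    """Filter: keep the column indices whose agreement with y reaches the threshold."""
    cols = range(len(X[0]))
    return [c for c in cols if rank_agreement([row[c] for row in X], y) >= threshold]

def fitness(mask, X, y, kept):
    """Wrapper score (assumed): nearest-centroid training accuracy on the selected
    columns, lightly penalised by subset size."""
    sel = [c for c, bit in zip(kept, mask) if bit]
    if not sel:
        return 0.0
    cents = {}
    for label in set(y):
        rows = [r for r, t in zip(X, y) if t == label]
        cents[label] = [sum(r[c] for r in rows) / len(rows) for c in sel]
    def predict(r):
        return min(cents, key=lambda l: sum((r[c] - m) ** 2 for c, m in zip(sel, cents[l])))
    acc = sum(predict(r) == t for r, t in zip(X, y)) / len(y)
    return acc - 0.01 * len(sel)

def local_search(mask, score):
    """Memetic refinement: hill climbing that accepts any improving single-bit flip."""
    best, best_s = mask[:], score(mask)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            cand = best[:]
            cand[i] ^= 1
            s = score(cand)
            if s > best_s:
                best, best_s, improved = cand, s, True
    return best

def memetic_select(X, y, threshold=0.3, pop_size=8, generations=10, seed=0):
    """Filter first, then evolve bit masks over the surviving features."""
    rng = random.Random(seed)
    kept = filter_phase(X, y, threshold)
    if not kept:
        return []
    score = lambda m: fitness(m, X, y, kept)
    pop = [[rng.randint(0, 1) for _ in kept] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child[rng.randrange(len(child))] ^= 1             # single-bit mutation
            children.append(local_search(child, score))
        pop = parents + children
    best = max(pop, key=score)
    return [c for c, bit in zip(kept, best) if bit]
```

Calling `memetic_select(X, y)` with `X` as a list of numeric rows and `y` as a list of 0/1 labels returns the indices of the selected columns. The point of the design is that the filter shrinks the mask length before the more expensive wrapper search runs, which mirrors the search-space reduction the abstract attributes to the filter phase.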
Pages: 17