New feature selection paradigm based on hyper-heuristic technique

Cited by: 16
Authors
Ibrahim, Rehab Ali [1, 2]
Abd Elaziz, Mohamed [2]
Ewees, Ahmed A. [3, 4]
El-Abd, Mohammed [5]
Lu, Songfeng [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Cyber Sci & Engn, Wuhan 430074, Peoples R China
[2] Zagazig Univ, Dept Math, Fac Sci, Zagazig, Egypt
[3] Univ Bisha, Dept E Syst, Bisha 61922, Saudi Arabia
[4] Damietta Univ, Dept Comp, Damietta Governorate, Egypt
[5] Amer Univ Kuwait, Coll Engn & Appl Sci, POB 3323, Safat 13034, Kuwait
Funding
China Postdoctoral Science Foundation
Keywords
Meta-heuristic; Chaotic maps; Differential evolution; Opposition-based learning; Feature selection; Hyper-heuristic; GREY WOLF OPTIMIZATION; SALP SWARM ALGORITHM; NEAREST-NEIGHBOR; CLASSIFICATION; REGRESSION; EVOLUTIONARY; PREDICTION; SCHEME;
DOI
10.1016/j.apm.2021.04.018
Chinese Library Classification number
T [Industrial Technology]
Subject classification code
08
Abstract
Feature selection (FS) is a crucial step for effective data mining because it strongly affects classifier performance. It works by removing irrelevant features and keeping only the relevant ones. Many metaheuristic approaches in the literature attempt to address this problem. The performance of these approaches differs with the settings of a number of factors, including the use of chaotic maps, opposition-based learning (OBL) and the percentage of the population to which OBL is applied, the metaheuristic (MH) algorithm adopted, the classifier utilized, and the threshold value used to convert real solutions to binary ones. However, it is not easy to identify the best settings for these components in order to determine the relevant features for a specific dataset. Moreover, running extensive experiments to fine-tune these settings for every dataset consumes considerable time. To mitigate this issue, a hyper-heuristic based FS paradigm is proposed. The proposed model adopts a two-stage approach to identify the best combination of these components. In the first stage, referred to as the training stage, the Differential Evolution (DE) algorithm is used as a controller that selects the best combination of components to be used by the second stage. In the second stage, referred to as the testing stage, the received combination is evaluated using a testing set. The empirical evaluation of the proposed framework is based on numerous experiments performed on 18 popular datasets from the UCI machine learning repository. Experimental results illustrate that the generated generic configuration performs better than eight other metaheuristic algorithms over all performance measures when applied to the UCI datasets. Moreover, the overall paradigm ranks first when compared against state-of-the-art algorithms. Finally, the generic configuration provides very competitive performance for high-dimensional datasets. © 2021 Elsevier Inc. All rights reserved.
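The configuration search described in the abstract can be pictured as a controller vector that is decoded into one choice per component, followed by thresholding a real-valued solution into a binary feature mask. The following is only a minimal sketch under stated assumptions: the option lists, the five-gene encoding, the `Config` class, and the `pick`, `decode`, and `binarize` helpers are hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical option lists; the abstract does not enumerate the actual choices.
CHAOTIC_MAPS = ["none", "logistic", "tent"]
METAHEURISTICS = ["GWO", "SSA", "SCA"]
CLASSIFIERS = ["kNN", "SVM"]


@dataclass(frozen=True)
class Config:
    chaotic_map: str      # which chaotic map to use (assumed categorical)
    obl_fraction: float   # share of the population that OBL is applied to
    metaheuristic: str    # MH algorithm adopted
    classifier: str       # classifier utilized
    threshold: float      # cut-off for real-to-binary conversion


def pick(options, gene):
    """Map a gene in [0, 1] to one entry of a categorical option list."""
    index = min(int(gene * len(options)), len(options) - 1)
    return options[index]


def decode(vector):
    """Decode a 5-gene controller vector (each gene in [0, 1]) into a Config."""
    map_gene, obl_gene, mh_gene, clf_gene, thr_gene = vector
    return Config(
        chaotic_map=pick(CHAOTIC_MAPS, map_gene),
        obl_fraction=obl_gene,
        metaheuristic=pick(METAHEURISTICS, mh_gene),
        classifier=pick(CLASSIFIERS, clf_gene),
        threshold=thr_gene,
    )


def binarize(solution, threshold):
    """Convert a real-valued solution into a 0/1 feature mask."""
    return [1 if value > threshold else 0 for value in solution]
```

In a setup like this, a controller such as DE would evolve the five-gene vectors during training, scoring each decoded `Config` by the classification performance it yields; that scoring step is omitted here because the abstract does not specify it.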
Pages: 14-37
Number of pages: 24