A novel binary greater cane rat algorithm for feature selection

Cited by: 5
Authors
Agushaka, Jeffrey O. [1, 2]
Akinola, Olatunji [3]
Ezugwu, Absalom E. [1]
Oyelade, Olaide N. [4]
Affiliations
[1] North West Univ, Unit Data Sci & Comp, 11 Hoffman St, ZA-2520 Potchefstroom, South Africa
[2] Fed Univ Lafia, Dept Comp Sci, Lafia 950101, Nigeria
[3] Univ KwaZulu Natal, Sch Math Stat & Comp Sci, Pietermaritzburg, South Africa
[4] Ahmadu Bello Univ, Fac Phys Sci, Dept Comp Sci, Zaria, Nigeria
Source
RESULTS IN CONTROL AND OPTIMIZATION | 2023 / Vol. 11
Keywords
Feature selection; Metaheuristic algorithm; Binary greater cane rat; Optimization; Transfer function; PARTICLE SWARM OPTIMIZATION; CLASSIFICATION; VERSION;
DOI
10.1016/j.rico.2023.100225
Chinese Library Classification number
O29 [Applied Mathematics];
Discipline classification code
070104;
Abstract
The application of population-based metaheuristic algorithms to finding the optimal feature subset in high-dimensional datasets has surged. Many of these approaches do not scale well, especially because they must balance two opposing goals: maximizing classification accuracy while minimizing the number of features selected. This study proposes a novel binary greater cane rat algorithm (GCRA), inspired by the intelligent nocturnal behavior of the greater cane rat (GCR), which significantly affects its foraging and mating activities. The rats leave trails to food sources, shelters, and water as they forage, and this information is kept by the dominant rat. They also split into male and female groups during the mating season, which falls when food is abundant and water sources are nearby. This behavior is modeled into an effective method for selecting the optimal feature subset from high-dimensional datasets using two different approaches. First, five variants of binary GCRA are developed, each using one function from the S-shaped, V-shaped, U-shaped, Z-shaped, and quadratic transfer function families to binarize the GCRA. Second, a threshold that maps a variable to 0 or 1 is used to develop a further variant of GCRA. The performance of the six variants was evaluated on 12 datasets of different dimensionalities. The experimental results show the stability of all the proposed methods, which generally performed competitively. However, the threshold version, known as BGCRA, performed best: it yielded the highest classification accuracy on 9 of the 12 datasets and ranked second in selecting the fewest important features. It also outperformed the other variants by yielding the lowest average fitness values on 11 of the 12 datasets (91.6%).
Hence, the BGCRA was used for further comparative analysis against 5 other popular feature selection (FS) algorithms, where it performed outstandingly: it produced the highest mean classification accuracy on 91.6% (11 of 12) of the datasets, the lowest average fitness values on 100% of them, and the smallest average number of significant features on 91.6%. The results were also validated by statistical tests, which showed that the BGCRA is significantly superior to the other methods.
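The abstract names two binarization routes (transfer functions and a fixed threshold) and a fitness value that trades accuracy against subset size, but gives no formulas. The Python sketch below illustrates the general idea only and is not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the sigmoid and |tanh| forms chosen for the S-shaped and V-shaped families, the 0.5 threshold, and the weighted-sum fitness with `alpha = 0.99`, a form common in wrapper-based feature selection. V-shaped functions are often paired with a bit-flip rule in the literature; the same sampling rule is used for both families here for brevity.

```python
import math
import random


def s_shaped(x: float) -> float:
    """Sigmoid: one common S-shaped transfer function (assumed form)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # avoids overflow for large negative x
    return e / (1.0 + e)


def v_shaped(x: float) -> float:
    """|tanh(x)|: one common V-shaped transfer function (assumed form)."""
    return abs(math.tanh(x))


def binarize_transfer(position, transfer, rng):
    """Select feature i when a uniform draw falls below transfer(position[i])."""
    return [1 if rng.random() < transfer(x) else 0 for x in position]


def binarize_threshold(position, threshold=0.5):
    """Select feature i when position[i] exceeds a fixed threshold (assumed 0.5)."""
    return [1 if x > threshold else 0 for x in position]


def fitness(error_rate, mask, alpha=0.99):
    """Weighted sum of classification error and selected-feature ratio (lower is better).

    The weighting alpha is a hypothetical value, not taken from the paper.
    """
    n_selected = sum(mask)
    if n_selected == 0:
        return float("inf")  # an empty subset cannot be evaluated by a classifier
    return alpha * error_rate + (1.0 - alpha) * n_selected / len(mask)


if __name__ == "__main__":
    rng = random.Random(0)
    position = [0.9, -1.2, 0.3, 2.5]  # a continuous search-agent position
    print(binarize_threshold(position))
    print(binarize_transfer(position, s_shaped, rng))
    print(fitness(0.1, binarize_threshold(position)))
```

In a wrapper setup, `error_rate` would come from training a classifier on the columns where the mask is 1; that step is omitted here because the abstract does not name the classifier.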
Pages: 16