Effective Feature Selection Using Ensemble Techniques and Genetic Algorithm

Cited by: 1
Authors
Ghorpade-Aher, Jayshree [1]
Sonkamble, Balwant [2]
Affiliations
[1] MIT World Peace Univ, PICT, Pune, Maharashtra, India
[2] Pune Inst Comp Technol, Pune, Maharashtra, India
Source
PROCEEDINGS OF SIXTH INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY (ICICT 2021), VOL 2 | 2022, Vol. 236
Keywords
Feature selection; Genetic algorithm; Ensemble; Machine learning; Heterogeneous data; Bootstrap;
DOI
10.1007/978-981-16-2380-6_32
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Individual feature selection algorithms, when used to process high-dimensional, multi-source heterogeneous data, may lead to weak predictions. A traditional single-method process may not ensure that relevant features are selected. The selected features are sensitive to changes in the input data and therefore fail to perform consistently. These challenges can be overcome with a robust feature selection algorithm that generates a subset of the original features, evaluates the candidate set for relevance, and determines the feasibility of the selected subset. Selecting a feature subset reduces model complexity and facilitates further processing. The limitations of a single feature selection technique can be reduced by combining multiple techniques to generate effective features. Efficient approaches and techniques for estimating feature relevance are needed. The ensemble approach introduces diversity both at the input-data level and in the computational technique. The proposed method, Ensemble Bootstrap Genetic Algorithm (EnBGA), generates an effective feature subset for multi-source heterogeneous data. Various univariate and multivariate base selectors are combined to ensure the robustness and stability of the algorithm. During the COVID-19 pandemic, it has been observed that patients already diagnosed with diseases such as diabetes had an increased mortality rate. The proposed method performs feature analysis on such data, with the Genetic Algorithm searching the feature subsets and extracting the most relevant features.
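The abstract does not give implementation details, so the following is only a minimal sketch of the general idea (bootstrap resampling, an ensemble of one univariate and one multivariate base selector, then a genetic search over feature subsets). It is not the authors' EnBGA implementation. Everything named here is an assumption made for illustration: the function names, the choice of Pearson correlation and ridge coefficients as base selectors, rank averaging as the aggregation rule, a nearest-centroid classifier as the fitness wrapper, the size penalty, and the synthetic data.

```python
import numpy as np

def bootstrap_ensemble_scores(X, y, n_boot=20, rng=None):
    """Average normalized feature ranks of two base selectors over bootstrap samples."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    total = np.zeros(d)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)              # sample rows with replacement
        Xb, yb = X[idx], y[idx]
        Xs = (Xb - Xb.mean(0)) / (Xb.std(0) + 1e-12)  # standardize each feature
        yc = yb - yb.mean()
        # Univariate selector (assumed): absolute Pearson correlation with the target.
        uni = np.abs(Xs.T @ yc) / (n * (yc.std() + 1e-12))
        # Multivariate selector (assumed): absolute ridge-regression coefficients.
        multi = np.abs(np.linalg.solve(Xs.T @ Xs + np.eye(d), Xs.T @ yc))
        for s in (uni, multi):
            total += s.argsort().argsort() / (d - 1)  # rank scaled to [0, 1]
    return total / (2 * n_boot)

def centroid_accuracy(Xtr, ytr, Xva, yva):
    """Validation accuracy of a nearest-centroid classifier (assumed fitness wrapper)."""
    classes = np.unique(ytr)
    cents = np.stack([Xtr[ytr == c].mean(0) for c in classes])
    dist = ((Xva[:, None, :] - cents[None]) ** 2).sum(-1)
    return float((classes[dist.argmin(1)] == yva).mean())

def genetic_search(X, y, scores, pop_size=30, n_gen=30, p_mut=0.05,
                   penalty=0.01, rng=None):
    """Search boolean feature masks with a simple genetic algorithm."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    perm = rng.permutation(n)
    tr, va = perm[: n // 2], perm[n // 2:]            # fixed train/validation split

    def fitness(mask):
        if not mask.any():
            return -1.0                               # an empty subset is invalid
        acc = centroid_accuracy(X[tr][:, mask], y[tr], X[va][:, mask], y[va])
        return acc - penalty * mask.sum()             # prefer smaller subsets

    # Seed the population: a higher ensemble score gives a higher inclusion probability.
    pop = rng.random((pop_size, d)) < scores
    best, best_fit = None, -np.inf
    for _ in range(n_gen):
        fit = np.array([fitness(m) for m in pop])
        if fit.max() > best_fit:
            best_fit, best = fit.max(), pop[fit.argmax()].copy()
        # Binary tournament selection.
        a, b = rng.integers(0, pop_size, (2, pop_size))
        parents = pop[np.where(fit[a] >= fit[b], a, b)]
        # Uniform crossover between each parent and a shuffled partner.
        partners = parents[rng.permutation(pop_size)]
        children = np.where(rng.random((pop_size, d)) < 0.5, parents, partners)
        # Bit-flip mutation.
        pop = children ^ (rng.random((pop_size, d)) < p_mut)
    return best, best_fit

# Synthetic example (not from the paper): only the first three features carry signal.
data_rng = np.random.default_rng(0)
y = data_rng.integers(0, 2, 200)
X = data_rng.normal(size=(200, 12))
X[:, :3] += 2.0 * y[:, None]

scores = bootstrap_ensemble_scores(X, y, rng=1)
mask, fit = genetic_search(X, y, scores, rng=2)
print("selected features:", np.flatnonzero(mask), "fitness:", round(fit, 3))
```

Averaging ranks across bootstrap samples is one simple way to make the selection less sensitive to changes in the input data, and seeding the genetic search with the ensemble scores lets the two stages cooperate. Other aggregation rules or fitness functions could be substituted without changing the overall structure.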
Pages: 367-375
Number of pages: 9