Bayesian Network-Based Multi-objective Estimation of Distribution Algorithm for Feature Selection Tailored to Regression Problems

Cited by: 0
Authors
Lopez, Jose A. [1]
Morales-Osorio, Felipe [2]
Lara, Maximiliano [3]
Velasco, Jonas [1, 4]
Sanchez, Claudia N. [3]
Affiliations
[1] Ctr Invest Matemat CIMAT AC, Aguascalientes 20200, Aguascalientes, Mexico
[2] MIT, Cambridge, MA 02139 USA
[3] Univ Panamer, Fac Ingn, Aguascalientes 20296, Aguascalientes, Mexico
[4] Consejo Nacl Humanidades Ciencias & Tecnol CONAHC, Mexico City 03940, DF, Mexico
Source
ADVANCES IN COMPUTATIONAL INTELLIGENCE, MICAI 2023, PT I | 2024 / Vol. 14391
Keywords
Feature selection; estimation distribution algorithms; bayesian network; multi-objective optimization; regression problems; OPTIMIZATION;
DOI
10.1007/978-3-031-47765-2_23
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection is an essential pre-processing step in Machine Learning for improving model performance, reducing prediction time, and, more importantly, identifying the most significant features. This identification can also reduce the time and cost of obtaining feature values, because it can mean buying fewer sensors or spending less human time. This paper proposes an Estimation of Distribution Algorithm (EDA) for feature selection tailored to regression problems with a multi-objective approach. The objectives are to maximize the performance of learning models and to minimize the number of selected features. We use a Bayesian Network (BN) as the EDA's probability distribution model. The main contribution of this work is the process used to create the BN structure, which aims to capture the redundancy and relevance among features. The BN is also used to create the initial EDA population. We test and compare the performance of our proposal with other multi-objective algorithms (an EDA with a Bernoulli probability distribution model, NSGA II, and AGEMOEA) on different datasets. The experimental results show that the proposed algorithm found solutions with considerably fewer features while achieving model performance comparable to that of the other algorithms. Our proposal also generally took less time and required fewer objective function evaluations.
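As a rough illustration of the two-objective setup described in the abstract, the sketch below implements the simpler baseline it names, an EDA with an independent Bernoulli model per feature, and not the proposed Bayesian Network model. Everything in it is an assumption made for illustration: the function names (`holdout_r2`, `non_dominated`, `bernoulli_eda`), the parameter defaults, the use of ordinary least squares with a hold-out R^2 as the "model performance" objective, and the choice to re-estimate the probabilities from the current non-dominated set.

```python
import numpy as np

def holdout_r2(X, y, mask, split=0.7):
    """Hold-out R^2 of ordinary least squares using only the masked features."""
    if not mask.any():
        return 0.0  # assumed convention: an empty subset scores 0
    n_train = int(split * len(y))
    A = np.column_stack([X[:, mask], np.ones(len(y))])  # features plus intercept
    coef, *_ = np.linalg.lstsq(A[:n_train], y[:n_train], rcond=None)
    resid = y[n_train:] - A[n_train:] @ coef
    total = y[n_train:] - y[n_train:].mean()
    return 1.0 - (resid @ resid) / (total @ total)

def non_dominated(objs):
    """Boolean mask of rows not dominated by another row (all objectives minimized)."""
    objs = np.asarray(objs, dtype=float)
    keep = np.ones(len(objs), dtype=bool)
    for i in range(len(objs)):
        dominates_i = np.all(objs <= objs[i], axis=1) & np.any(objs < objs[i], axis=1)
        keep[i] = not dominates_i.any()
    return keep

def bernoulli_eda(X, y, pop_size=40, generations=15, seed=0):
    """Univariate (Bernoulli) EDA minimizing (-R^2, number of selected features)."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    p = np.full(n_features, 0.5)  # initial selection probability per feature
    archive = {}                  # subset bytes -> (mask, objective vector)
    for _ in range(generations):
        # Sample a population of feature subsets from the current model.
        pop = rng.random((pop_size, n_features)) < p
        for m in pop:
            archive[m.tobytes()] = (m, np.array([-holdout_r2(X, y, m), m.sum()]))
        # Keep only the non-dominated subsets found so far.
        masks = np.array([v[0] for v in archive.values()])
        vals = np.array([v[1] for v in archive.values()])
        front = non_dominated(vals)
        masks, vals = masks[front], vals[front]
        archive = {m.tobytes(): (m, o) for m, o in zip(masks, vals)}
        # Re-estimate the Bernoulli parameters; clipping keeps some exploration.
        p = np.clip(masks.mean(axis=0), 0.05, 0.95)
    return masks, vals
```

Calling `bernoulli_eda(X, y)` on a NumPy feature matrix and target vector returns the non-dominated feature masks together with their `(-R^2, feature count)` pairs. A Bayesian Network model would replace the independent per-feature probabilities `p` with conditional ones that can encode redundancy between features.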
Pages: 309-326
Number of pages: 18