Feature selection using principal component analysis and genetic algorithm

Cited by: 18
Authors
Adhao, Rahul [1]
Pachghare, Vinod [1]
Affiliations
[1] Coll Engn Pune, Dept Comp Engn, Pune 412015, Maharashtra, India
Keywords
Feature selection; Intrusion detection system; Principal component analysis; Genetic algorithm
DOI
10.1080/09720529.2020.1729507
CLC number
O29 [Applied Mathematics]
Subject classification code
070104
Abstract
Feature engineering is the process of using domain knowledge of the data to build features that, in turn, help machine learning (ML) algorithms produce efficient results. It is crucial to the application of ML and is both difficult and costly. Feature engineering, the next buzzword after big data, involves both feature selection and feature extraction. Feature selection (FS, also called attribute selection) is the procedure of selecting a subset of pertinent features for use in model building. It is an optimization problem. In our case, we use principal component analysis for feature transformation, followed by a genetic algorithm to select the optimal feature set, and finally a decision tree as the classifier. The results show that applying principal component analysis before the genetic algorithm improves the accuracy of the model while using fewer features.
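The pipeline the abstract describes (PCA transformation, then genetic-algorithm selection, then a decision tree) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the synthetic dataset, the choice of 10 principal components, the 3-fold cross-validated accuracy used as fitness, the hand-rolled GA operators, and all hyperparameters (`pop_size`, `generations`, `p_mut`) are assumptions made for the example. It relies on NumPy and scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def fitness(mask, X, y):
    """Mean 3-fold CV accuracy of a decision tree on the columns selected by mask."""
    if not mask.any():  # an empty subset cannot be classified
        return 0.0
    tree = DecisionTreeClassifier(random_state=0)
    return cross_val_score(tree, X[:, mask], y, cv=3).mean()


def tournament(pop, scores, rng):
    """Pick the fitter of two randomly drawn individuals."""
    a, b = rng.integers(len(pop), size=2)
    return pop[a] if scores[a] >= scores[b] else pop[b]


def ga_select(X, y, pop_size=12, generations=8, p_mut=0.1, seed=0):
    """Small GA over boolean feature masks (hypothetical settings).

    Uses tournament selection, uniform crossover, bit-flip mutation,
    and keeps the best individual of each generation (elitism).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5
    scores = np.array([fitness(ind, X, y) for ind in pop])
    for _ in range(generations):
        children = [pop[scores.argmax()].copy()]  # elitism
        while len(children) < pop_size:
            p1 = tournament(pop, scores, rng)
            p2 = tournament(pop, scores, rng)
            child = np.where(rng.random(n) < 0.5, p1, p2)  # uniform crossover
            child ^= rng.random(n) < p_mut                 # bit-flip mutation
            children.append(child)
        pop = np.array(children)
        scores = np.array([fitness(ind, X, y) for ind in pop])
    best = scores.argmax()
    return pop[best], scores[best]


# Synthetic stand-in for an intrusion detection dataset (assumption).
X_raw, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# Step 1: standardize, then transform the features with PCA.
X_scaled = StandardScaler().fit_transform(X_raw)
X_pca = PCA(n_components=10, random_state=0).fit_transform(X_scaled)

# Step 2: let the GA choose a subset of principal components.
# Step 3 (the decision tree classifier) is evaluated inside fitness().
best_mask, best_score = ga_select(X_pca, y)
print("selected components:", np.flatnonzero(best_mask))
print("cross-validated accuracy:", round(float(best_score), 3))
```

For brevity, PCA is fitted on the full dataset before cross-validation. A careful evaluation would fit the scaler and PCA on the training folds only and report accuracy on a held-out test set.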
Pages: 595-602
Page count: 8