A new penalty-based wrapper fitness function for feature subset selection with evolutionary algorithms

Cited by: 17
Authors
Chakraborty, Basabi [1 ]
Kawamura, Atsushi [1 ]
Affiliations
[1] Iwate Prefectural Univ, Dept Software & Informat Sci, 152-52 Sugo, Takizawa 0200693, Japan
Keywords
Feature subset selection; wrapper fitness function with penalty; evolutionary computation
DOI
10.1080/24751839.2018.1423792
Chinese Library Classification
TP [automation technology, computer technology]
Subject classification code
0812
Abstract
Feature subset selection is an important preprocessing task for any real-life data mining or pattern recognition problem. Evolutionary computation (EC) algorithms are popular search algorithms for feature subset selection. With classification accuracy alone as the fitness function, EC algorithms arrive at feature subsets with considerably high recognition accuracy, but the number of retained features also remains quite high. For high-dimensional data, reducing the number of features is equally important in order to minimize the computational cost of the overall classification process. In this work, a wrapper fitness function is proposed that combines classification accuracy with a penalty term that penalizes large feature subsets. The proposed wrapper fitness function is used for feature subset evaluation and the subsequent selection of an optimal feature subset with several EC algorithms. Simulation experiments are performed on several benchmark data sets with small to large numbers of features. The results show that the proposed wrapper fitness function is effective at reducing the number of features in the final selected subset without a significant loss of classification accuracy. The proposed fitness function is also shown to perform well on high-dimensional data sets with dimension up to 10,000.
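The abstract does not give the exact form of the penalty term, so the following is only a minimal Python sketch of a penalty-based wrapper fitness of this general kind: cross-validated classification accuracy minus a weight times the fraction of selected features. The weight alpha, the k-NN wrapper classifier, and the function name penalized_wrapper_fitness are illustrative assumptions, not the authors' actual formulation.

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def penalized_wrapper_fitness(mask, X, y, alpha=0.1):
    """Illustrative wrapper fitness: cross-validated accuracy minus a
    penalty proportional to the fraction of features retained.
    'mask' is a boolean vector encoding a candidate feature subset;
    'alpha' is a hypothetical weight on the feature-count penalty."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():                         # an empty subset is invalid
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=3)  # assumed wrapper classifier
    acc = cross_val_score(clf, X[:, mask], y, cv=5).mean()
    penalty = alpha * mask.sum() / mask.size   # grows with subset size
    return acc - penalty                       # the EC search maximizes this

# Example: score one random bit-string candidate on a benchmark data set
from sklearn.datasets import load_wine
X, y = load_wine(return_X_y=True)
candidate = np.random.default_rng(0).random(X.shape[1]) < 0.5
print(penalized_wrapper_fitness(candidate, X, y))

An EC algorithm such as a genetic algorithm or binary PSO would then evolve the boolean mask so as to maximize this score, trading accuracy against subset size through alpha.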
Pages: 163-180
Number of pages: 18