Differential evolution for filter feature selection based on information theory and feature ranking

Cited by: 277
Authors
Hancer, Emrah [1, 2]
Xue, Bing [2 ]
Zhang, Mengjie [2 ]
Affiliations
[1] Mehmet Akif Ersoy Univ, Dept Comp Technol & Informat Syst, TR-15039 Burdur, Turkey
[2] Victoria Univ Wellington, Sch Engn & Comp Sci, Evolutionary Computat Res Grp, Wellington 6140, New Zealand
Keywords
Mutual information; ReliefF; Fisher Score; differential evolution; feature selection; PARTICLE SWARM OPTIMIZATION; MUTUAL INFORMATION; FEATURE-EXTRACTION; GENETIC ALGORITHM; ANT COLONY; CLASSIFICATION; HYBRID
DOI
10.1016/j.knosys.2017.10.028
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Feature selection is an essential step in various tasks, and filter feature selection algorithms are increasingly attractive because of their simplicity and speed. A common filter approach uses mutual information to estimate the relationship between each feature and the class labels (mutual relevancy) and between each pair of features (mutual redundancy). This strategy has gained popularity, resulting in a variety of criteria based on mutual information. Other well-known strategies rank each feature based on the nearest neighbor distance, as in ReliefF, or based on the between-class variance and the within-class variance, as in Fisher Score. However, each strategy comes with its own advantages and disadvantages. This paper proposes a new filter criterion inspired by the concepts of mutual information, ReliefF and Fisher Score. Instead of using mutual redundancy, the proposed criterion tries to choose the highest ranked features determined by ReliefF and Fisher Score while maintaining the mutual relevance between the features and the class labels. Based on the proposed criterion, two new differential evolution (DE) based filter approaches are developed. The first treats the proposed criterion as a single-objective problem in a weighted manner, while the second considers the proposed criterion in a multi-objective design. Moreover, a well-known mutual information feature selection approach (MIFS) based on maximum relevance and minimum redundancy is also adopted in single-objective and multi-objective DE algorithms for feature selection. The results show that the proposed criterion outperforms MIFS in both single-objective and multi-objective DE frameworks. The results also indicate that treating feature selection as a multi-objective problem can generally provide better performance in terms of the feature subset size and the classification accuracy. (C) 2017 Elsevier B.V. All rights reserved.
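As a rough illustration of two quantities named in the abstract, the sketch below computes the mutual relevance I(X; C) between one discrete feature and the class labels, and a Fisher-style score (between-class variance divided by within-class variance) for one numeric feature. This is not the paper's criterion, which the abstract does not spell out; ReliefF and the DE search are omitted, and the function names `mutual_information` and `fisher_score` as well as the toy data are assumptions made for this example.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Estimate I(X; C) in bits from counts, for a discrete feature X."""
    n = len(feature)
    px = Counter(feature)               # marginal counts of feature values
    pc = Counter(labels)                # marginal counts of class labels
    pxc = Counter(zip(feature, labels))  # joint counts
    mi = 0.0
    for (x, c), count in pxc.items():
        p_joint = count / n
        mi += p_joint * math.log2(p_joint / ((px[x] / n) * (pc[c] / n)))
    return mi


def fisher_score(feature, labels):
    """Between-class variance over within-class variance for one feature."""
    overall_mean = sum(feature) / len(feature)
    between = 0.0
    within = 0.0
    for c in set(labels):
        values = [x for x, y in zip(feature, labels) if y == c]
        mean_c = sum(values) / len(values)
        var_c = sum((v - mean_c) ** 2 for v in values) / len(values)
        between += len(values) * (mean_c - overall_mean) ** 2
        within += len(values) * var_c
    # A feature that is constant within every class separates them perfectly.
    return between / within if within > 0 else float("inf")


# Toy data (hypothetical): four samples, two classes.
labels = [0, 0, 1, 1]
informative = [0, 0, 1, 1]         # determines the class exactly
noise = [0, 1, 0, 1]               # independent of the class
separated = [1.0, 2.0, 5.0, 6.0]   # numeric feature with well-separated classes

print(mutual_information(informative, labels))
print(mutual_information(noise, labels))
print(fisher_score(separated, labels))
print(fisher_score(noise, labels))
```

On this toy data the informative feature scores higher than the noise feature under both measures, which is the ranking behaviour a filter criterion relies on before any search over feature subsets.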
Pages: 103-119
Number of pages: 17