Feature selection and threshold method based on fuzzy joint mutual information

Cited by: 25
Authors
Salem, Omar A. M. [1 ,2 ]
Liu, Feng [1 ]
Chen, Yi-Ping Phoebe [3 ]
Chen, Xi [1 ]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
[2] Suez Canal Univ, Fac Comp & Informat, Dept Informat Syst, Ismailia 41522, Egypt
[3] La Trobe Univ, Dept Comp Sci & Informat Technol, Bundoora, Vic 3086, Australia
Keywords
Feature selection; Fuzzy sets; Mutual information; Feature representative matrix; Max-relevance; Redundancy
DOI
10.1016/j.ijar.2021.01.003
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Improving classification performance is one of the main challenges in a variety of real-world applications. Unfortunately, classification models are sensitive to undesirable characteristics of data such as redundant and irrelevant features. Feature selection (FS) is a powerful way to mitigate the negative effect of these features. Among the various approaches, feature selection based on mutual information (MI) is an effective way to select significant features and discard undesirable ones. Although most existing methods estimate feature relevancy efficiently, they may struggle to estimate feature redundancy well, because redundancy is estimated individually between the candidate feature and each feature of the pre-selected subset. To address this limitation, this paper introduces a novel feature selection method called Fuzzy Joint Mutual Information (FJMI). The proposed method also overcomes common limitations: it handles continuous features without information loss and returns the best feature subset automatically, without a user-defined threshold. To evaluate its effectiveness, FJMI is compared with six conventional and state-of-the-art feature selection methods. Experimental results on 16 benchmark datasets of moderate size show a promising improvement by the proposed method in terms of classification performance, feature selection stability, and number of selected features. (C) 2021 Elsevier Inc. All rights reserved.
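To illustrate the idea the abstract contrasts with individual pairwise redundancy estimation, the sketch below implements a generic greedy joint-mutual-information selection loop on discrete features. It is a minimal illustrative example, not the paper's FJMI: it omits the fuzzy-set treatment of continuous features and the automatic threshold, and the names `mutual_info`, `joint_mi`, and `jmi_select` are this example's own, not from the paper.

```python
import numpy as np

def mutual_info(x, y):
    """I(X; Y) in nats for discrete 1-D arrays."""
    xs, x_idx = np.unique(x, return_inverse=True)
    ys, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (x_idx, y_idx), 1)       # contingency table
    joint /= joint.sum()                       # joint probabilities
    px = joint.sum(axis=1, keepdims=True)      # marginal p(x)
    py = joint.sum(axis=0, keepdims=True)      # marginal p(y)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def joint_mi(x1, x2, y):
    """I(X1, X2; Y): the pair (X1, X2) is treated as one discrete variable."""
    pair = np.unique(np.column_stack([x1, x2]), axis=0, return_inverse=True)[1]
    return mutual_info(pair, y)

def jmi_select(X, y, k):
    """Greedily pick k column indices of X, scoring each candidate by its
    summed joint MI with every already-selected feature (JMI-style)."""
    n_features = X.shape[1]
    # Seed with the single most relevant feature.
    selected = [max(range(n_features), key=lambda j: mutual_info(X[:, j], y))]
    while len(selected) < k:
        remaining = [j for j in range(n_features) if j not in selected]
        best = max(remaining,
                   key=lambda j: sum(joint_mi(X[:, j], X[:, s], y)
                                     for s in selected))
        selected.append(best)
    return selected
```

The key design point is in `joint_mi`: the candidate and a selected feature are scored *jointly* against the class, so complementary features (useful only in combination) can still score well, which a purely individual relevance-minus-redundancy scheme can miss.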
Pages: 107-126
Page count: 20