A Comparison of Mutual and Fuzzy-Mutual Information-Based Feature Selection Strategies

Cited by: 10
Authors
Tsai, Yu-Shuen [1]
Yang, Ueng-Cheng [2]
Chung, I-Fang [2]
Huang, Chuen-Der [3]
Affiliations
[1] Natl Taiwan Univ Hosp, Natl Clin Trial & Res Ctr, Taipei, Taiwan
[2] Natl Yang Ming Univ, Inst Biomed Informat, Taipei, Taiwan
[3] Hsiuping Univ Sci & Technol, Dept Elect Engn, Taichung, Taiwan
Source
2013 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2013) | 2013
Keywords
feature selection; mutual information; fuzzy mutual information; symmetric uncertainty; GENE-EXPRESSION DATA; MAX-RELEVANCE
DOI
10.1109/FUZZ-IEEE.2013.6622533
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Selecting a small set of relevant features from a high-dimensional data set is important for designing an effective classification or prediction model. This procedure involves a series of estimates of the relationship between each pair of variables and between each variable and the class labels. Mutual information is widely used to estimate these relationships. However, alternative strategies may be useful for estimating mutual information with continuous or hybrid data. In this study, we evaluate the differences between selection strategies based on mutual information and those based on fuzzy mutual information. The results indicate that using fuzzy mutual information helps to obtain more stable feature sets and more accurate estimates of the relationship between two variables.
Pages: 6
Related papers
25 in total
[1]   USING MUTUAL INFORMATION FOR SELECTING FEATURES IN SUPERVISED NEURAL-NET LEARNING [J].
BATTITI, R .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 1994, 5 (04) :537-550
[2]   Selection of relevant features and examples in machine learning [J].
Blum, AL ;
Langley, P .
ARTIFICIAL INTELLIGENCE, 1997, 97 (1-2) :245-271
[3]   Mining Projected Clusters in High-Dimensional Spaces [J].
Bouguessa, Mohamed ;
Wang, Shengrui .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2009, 21 (04) :507-522
[4]  
Cornelis C, 2008, IEEE INT CONF FUZZY, P1600
[5]   Estimating mutual information using B-spline functions - an improved similarity measure for analysing gene expression data [J].
Daub, CO ;
Steuer, R ;
Selbig, J ;
Kloska, S .
BMC BIOINFORMATICS, 2004, 5 (1)
[6]   Normalized Mutual Information Feature Selection [J].
Estevez, Pablo A. ;
Tesmer, Michel ;
Perez, Claudio A. ;
Zurada, Jacek A. .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 2009, 20 (02) :189-201
[7]  
Frank A., 2010, UCI machine learning repository, V213
[8]  
Guyon I., 2003, J MACH LEARN RES, V3, P1157
[9]   Information-preserving hybrid data reduction based on fuzzy-rough techniques [J].
Hu, QH ;
Yu, DR ;
Xie, ZX .
PATTERN RECOGNITION LETTERS, 2006, 27 (05) :414-423
[10]   Hybrid attribute reduction based on a novel fuzzy-rough model and information granulation [J].
Hu, Qinghua ;
Xie, Zongxia ;
Yu, Daren .
PATTERN RECOGNITION, 2007, 40 (12) :3509-3521