A novel feature selection method based on normalized mutual information

Cited by: 0
Authors
La The Vinh
Sungyoung Lee
Young-Tack Park
Brian J. d’Auriol
Affiliations
[1] Kyung Hee University, Dept. of Computer Engineering
[2] Soongsil University, School of IT
Source
Applied Intelligence | 2012 / Vol. 37
Keywords
Feature selection; Mutual information; Minimal redundancy; Maximal relevance
DOI
Not available
Abstract
In this paper, we present a novel feature selection method based on normalizing the well-known mutual information measure. Our method is derived from an existing approach, the max-relevance and min-redundancy (mRMR) approach. We propose, however, to normalize the mutual information used in the method so that neither the relevance term nor the redundancy term dominates the selection criterion. We use several common classifiers, namely Support Vector Machine (SVM), k-Nearest-Neighbor (kNN), and Linear Discriminant Analysis (LDA), to compare our algorithm with the original mRMR and with a recently improved version of mRMR, the Normalized Mutual Information Feature Selection (NMIFS) algorithm. To avoid data-specific conclusions, we run our classification experiments on various datasets from the UCI machine learning repository. The results confirm that our feature selection method is more robust than the others with regard to classification accuracy.
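The abstract does not give the selection criterion itself, so the following is only a minimal sketch of an mRMR-style greedy selector with normalized mutual information, not the authors' exact method. It assumes discrete features, assumes normalization by the smaller of the two entropies (a choice borrowed for illustration), and scores each candidate as normalized relevance minus mean normalized redundancy. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(xs):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def normalized_mi(xs, ys):
    """MI divided by min(H(X), H(Y)); this normalization is an assumption."""
    denom = min(entropy(xs), entropy(ys))
    return mutual_information(xs, ys) / denom if denom > 0 else 0.0


def select_features(features, labels, k):
    """Greedily pick k features by normalized relevance minus mean redundancy.

    features: dict mapping a feature name to its list of discrete values.
    """
    selected = []
    remaining = set(features)

    def score(name):
        relevance = normalized_mi(features[name], labels)
        if not selected:
            return relevance
        redundancy = sum(
            normalized_mi(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance - redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: the label is fully determined by "a" and "c"; "b" duplicates "a".
a = [0, 0, 0, 0, 1, 1, 1, 1]
c = [0, 0, 1, 1, 0, 0, 1, 1]
labels = [2 * ai + ci for ai, ci in zip(a, c)]
toy = {"a": a, "b": list(a), "c": c}
print(select_features(toy, labels, 2))
```

On this toy data the duplicate feature "b" is penalized for its redundancy with "a", so the second pick is the independent feature "c". Because both terms lie in [0, 1] after normalization, neither can swamp the other simply because of the scale of the underlying entropies.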
Pages: 100-120
Page count: 20