Feature selection for classificatory analysis based on information-theoretic criteria

Cited by: 0
Authors
Department of Automation, Harbin University of Science and Technology, Harbin 150080, China [1]
Unknown [2]
Institutions
[1] Department of Automation, Harbin University of Science and Technology
[2] Department of Automation, Shanghai Jiaotong University
Source
Zidonghua Xuebao | 2008 / No. 3 / pp. 383-392
Funding
National Natural Science Foundation of China
Keywords
Data mining; Feature selection; Information-theoretic measures; Pattern classification;
DOI
10.3724/SP.J.1004.2008.00383
Abstract
Feature selection aims to reduce the dimensionality of patterns for classificatory analysis by selecting the most informative features rather than irrelevant and/or redundant ones. In this study, two novel information-theoretic measures for feature ranking are presented: one is an improved formula to estimate the conditional mutual information between the candidate feature f_i and the target class C given the subset of selected features S, i.e., I(C; f_i|S), under the assumption that the information of features is distributed uniformly; the other is a mutual information (MI) based constructive criterion that is able to capture both irrelevant and redundant input features under arbitrary distributions of feature information. With these two measures, two new feature selection algorithms, called the quadratic MI-based feature selection (QMIFS) approach and the MI-based constructive criterion (MICC) approach, respectively, are proposed, in which no parameter like β in Battiti's MIFS and Kwak and Choi's MIFS-U methods needs to be preset. Thus, the intractable problem of choosing an appropriate value for β to trade off relevance to the target classes against redundancy with the already-selected features is avoided completely. Experimental results demonstrate the good performance of QMIFS and MICC on both synthetic and benchmark data sets.
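The abstract positions QMIFS and MICC against Battiti's MIFS, whose β-weighted greedy criterion they are designed to make parameter-free. For context only, the sketch below illustrates that MIFS-style criterion, I(C; f) − β Σ_{s∈S} I(f; s), not the paper's own QMIFS/MICC algorithms; the histogram-based MI estimator, the mifs_select helper, and the toy data are all illustrative assumptions rather than anything specified in the record.

```python
# Minimal sketch of MIFS-style greedy feature selection (Battiti, 1994).
# This is the beta-weighted baseline that QMIFS/MICC aim to replace,
# NOT the paper's parameter-free methods. All names are illustrative.
import numpy as np

def mutual_information(x, y):
    """Estimate I(x; y) in nats for two discrete 1-D integer arrays."""
    joint, _, _ = np.histogram2d(x, y,
                                 bins=(len(np.unique(x)), len(np.unique(y))))
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal P(x)
    py = pxy.sum(axis=0, keepdims=True)   # marginal P(y)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def mifs_select(X, y, k, beta=0.5):
    """Greedily pick k columns maximizing I(C; f) - beta * sum_s I(f; s)."""
    relevance = [mutual_information(X[:, j], y) for j in range(X.shape[1])]
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(k):
        best = max(remaining,
                   key=lambda j: relevance[j]
                   - beta * sum(mutual_information(X[:, j], X[:, s])
                                for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: f0 is informative, f1 is a redundant copy of f0, f2 is noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 500)
flip = lambda: (rng.random(500) < 0.1).astype(int)
X = np.column_stack([y ^ flip(), y ^ flip(), rng.integers(0, 2, 500)])
print(mifs_select(X, y, k=2))  # the redundancy penalty steers pick 2 away from the copy
```

Note how the result depends on β: too small and the redundant copy f1 is selected second anyway; too large and genuinely relevant features are penalized. Removing exactly this tuning burden is the stated contribution of QMIFS and MICC.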
Pages: 383-392
Page count: 10