Feature subset selection using a new definition of classifiability

Cited by: 60
Authors
Dong, M
Kothari, R
Institutions
[1] IBM India Research Lab, Indian Institute of Technology, New Delhi 110016, India
[2] Wayne State University, Department of Computer Science, Detroit, MI 48202, USA
Keywords
feature selection; dimensionality reduction; classification
DOI
10.1016/S0167-8655(02)00303-3
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The performance of most practical classifiers improves when correlated or irrelevant features are removed. Machine-based classification is thus often preceded by subset selection, a procedure that identifies the relevant features of a high-dimensional data set. At present, the most widely used subset selection technique is the so-called "wrapper" approach, in which a search algorithm is used to identify candidate subsets and the actual classifier is used as a "black box" to evaluate the fitness of each subset. Evaluating the fitness of a subset, however, requires cross-validation or another resampling-based procedure for error estimation, necessitating the construction of a large number of classifiers for each subset. This significant computational burden makes the wrapper approach impractical when a large number of features are present. In this paper, we present an approach to subset selection based on a novel definition of the classifiability of a given data set. The classifiability measure we propose characterizes the relative ease with which some labeled data can be classified. We use this definition of classifiability to systematically add the feature that leads to the greatest increase in classifiability. The proposed approach does not require the construction of classifiers at each step and therefore does not suffer from as high a computational burden as a wrapper approach. Our results over several different data sets indicate that the results obtained are at least as good as those obtained with the wrapper approach. (C) 2002 Elsevier Science B.V. All rights reserved.
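The abstract describes a greedy forward-selection loop driven by a classifiability score computed directly from the labeled data, so no classifier is trained or cross-validated inside the loop. The following is a minimal Python sketch of that loop only, not the authors' method: since the paper's actual classifiability definition is not reproduced in this record, a simple nearest-neighbor label-agreement proxy stands in for it, and all function names are illustrative assumptions.

```python
import numpy as np

def classifiability_proxy(X, y):
    """Fraction of samples whose nearest neighbor (leave-one-out,
    Euclidean distance) shares their label. A hypothetical stand-in
    for the paper's classifiability measure, not the paper's formula."""
    # Pairwise squared Euclidean distances between all samples
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)      # exclude self-matches
    nn = d2.argmin(axis=1)            # index of each point's nearest neighbor
    return float((y[nn] == y).mean())

def forward_select(X, y, k):
    """Greedily add the feature whose inclusion yields the highest
    score, until k features are selected."""
    remaining = list(range(X.shape[1]))
    selected = []
    for _ in range(k):
        scores = [(classifiability_proxy(X[:, selected + [j]], y), j)
                  for j in remaining]
        best_score, best_j = max(scores)
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Toy usage: features 0 and 1 are informative, the rest are noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 6))
X[:, 0] += 3 * y                      # feature 0 separates the classes
X[:, 1] -= 2 * y                      # feature 1 is also informative
print(forward_select(X, y, 2))        # typically picks features 0 and 1
```

Because the score is a single pass over the data rather than a resampled error estimate, each step costs one score evaluation per candidate feature; this is the computational saving over the wrapper approach that the abstract highlights.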
Pages: 1215-1225
Number of pages: 11