SEQUENTIAL, BOTTOM-UP VARIABLE SELECTION FOR HIGH-DIMENSIONAL CLASSIFICATION

Cited by: 0
Authors
Hall, Peter [1]
Miller, Hugh [1]
Affiliations
[1] Univ Melbourne, Dept Math & Stat, Melbourne, Vic 3010, Australia
Keywords
bootstrap; centroid-based classifier; classification; dimension reduction; lasso; linear model; median-based classifier; nearest neighbour classifier; regression; sequential variable selection; support vector machine; wrapper methods; LARGE UNDERDETERMINED SYSTEMS; WAVELET SHRINKAGE; GENE; CENTROIDS; EQUATIONS; CANCER;
DOI
10.1111/j.1467-842X.2010.00594.x
Chinese Library Classification
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Most methods for variable selection work from the top down and steadily remove features until only a small number remain. They often rely on a predictive model, and there are usually significant disconnections in the sequence of methodologies that leads from the training samples to the choice of the predictor, then to variable selection, then to choice of a classifier, and finally to classification of a new data vector. In this paper we suggest a bottom-up approach that brings the choices of variable selector and classifier closer together, by basing the variable selector directly on the classifier, removing the need to involve predictive methods in the classification decision, and enabling the direct and transparent comparison of different classifiers in a given problem. Specifically, we suggest 'wrapper methods', determined by classifier type, for choosing variables that minimize the classification error rate. This approach is particularly useful for exploring relationships among the variables that are chosen for the classifier. It reveals which variables have a high degree of leverage for correct classification using different classifiers; it shows which variables operate in relative isolation, and which are important mainly in conjunction with others; it permits quantification of the authority with which variables are selected; and it generally leads to a reduced number of variables for classification, in comparison with alternative approaches based on prediction.
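The bottom-up idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a plain nearest-centroid classifier, a K-fold cross-validated error estimate, and a stop-when-no-improvement rule, none of which are specified in the abstract. The function names (`centroid_predict`, `cv_error`, `forward_select`) and the synthetic data are hypothetical and exist only for illustration.

```python
import numpy as np

def centroid_predict(X_train, y_train, X_test):
    """Assign each test row to the class whose training centroid is nearest."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every test row to every centroid.
    d2 = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d2.argmin(axis=1)]

def cv_error(X, y, n_folds=5, seed=0):
    """K-fold cross-validated misclassification rate (assumed error estimate)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    wrong = 0
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        pred = centroid_predict(X[train_mask], y[train_mask], X[test_idx])
        wrong += int((pred != y[test_idx]).sum())
    return wrong / len(y)

def forward_select(X, y, max_vars=5):
    """Start from no variables; repeatedly add the one that most lowers
    the classifier's own estimated error, stopping when nothing helps."""
    selected, best_err = [], 1.0
    for _ in range(max_vars):
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        errs = [cv_error(X[:, selected + [j]], y) for j in candidates]
        k = int(np.argmin(errs))
        if errs[k] >= best_err:  # assumed stopping rule: no improvement
            break
        selected.append(candidates[k])
        best_err = errs[k]
    return selected, best_err

# Synthetic illustration: 60 samples, 50 variables, only variable 3 informative.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 50))
X[y == 1, 3] += 4.0
selected, err = forward_select(X, y)
print(selected, err)
```

Because the selection criterion is the classifier's own estimated error rate, swapping `centroid_predict` for another classifier changes the selected variables accordingly, which is the kind of direct comparison across classifiers that the abstract describes.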
Pages: 403-421
Page count: 19