A direct approach to sparse discriminant analysis in ultra-high dimensions

Cited by: 156
Authors
Mai, Qing [1 ]
Zou, Hui [1 ]
Yuan, Ming [2 ]
Affiliations
[1] Univ Minnesota, Sch Stat, Minneapolis, MN 55455 USA
[2] Georgia Inst Technol, M Stewart Sch Ind & Syst Engn, Atlanta, GA 30332 USA
Funding
National Science Foundation (US)
Keywords
Discriminant analysis; Features annealed independence rule; Lasso; Nearest shrunken centroids classifier; Nonpolynomial-dimension asymptotics; VARIABLE SELECTION; MODEL SELECTION; CLASSIFICATION; REGULARIZATION; REGRESSION; LASSO; TUMOR;
DOI
10.1093/biomet/asr066
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Sparse discriminant methods based on independence rules, such as the nearest shrunken centroids classifier (Tibshirani et al., 2002) and features annealed independence rules (Fan & Fan, 2008), have been proposed as computationally attractive tools for feature selection and classification with high-dimensional data. A fundamental drawback of these rules is that they ignore correlations among features and thus could produce misleading feature selection and inferior classification. We propose a new procedure for sparse discriminant analysis, motivated by the least squares formulation of linear discriminant analysis. To demonstrate our proposal, we study the numerical and theoretical properties of discriminant analysis constructed via lasso penalized least squares. Our theory shows that the method proposed can consistently identify the subset of discriminative features contributing to the Bayes rule and at the same time consistently estimate the Bayes classification direction, even when the dimension can grow faster than any polynomial order of the sample size. The theory allows for general dependence among features. Simulated and real data examples show that lassoed discriminant analysis compares favourably with other popular sparse discriminant proposals.
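The least-squares route to sparse discriminant analysis described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it recodes the two class labels as the constants -n/n1 and n/n2 used in the least-squares formulation of linear discriminant analysis, then fits a lasso-penalized regression by coordinate descent; the simulated data, the tuning value `lam = 0.3`, and the helper `lasso_cd` are all assumptions made for the demo.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent lasso for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n  # per-coordinate curvature
    r = y - X @ b                      # current residual
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]                      # remove coordinate j
            rho = X[:, j] @ r / n                    # partial correlation
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]                      # restore coordinate j
    return b

rng = np.random.default_rng(0)
n, p = 100, 50
# Only the first 3 features carry the class mean difference.
mu = np.zeros(p)
mu[:3] = 1.5
y_cls = rng.integers(0, 2, n)
X = rng.standard_normal((n, p)) + np.outer(y_cls, mu)

# Recode labels as -n/n1 and n/n2, the coding that makes least squares
# reproduce the LDA direction up to a positive scalar.
n1, n2 = (y_cls == 0).sum(), (y_cls == 1).sum()
y = np.where(y_cls == 0, -n / n1, n / n2)

beta = lasso_cd(X - X.mean(axis=0), y - y.mean(), lam=0.3)
selected = np.flatnonzero(beta)
print("selected features:", selected)
```

Because the lasso is fit to the full design matrix rather than feature by feature, the estimated direction accounts for correlations among features, which is exactly the property that independence rules give up.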
Pages: 29-42
Number of pages: 14
References
30 records
[1] Alon, U., Barkai, N., Notterman, D. A., Gish, K., Ybarra, S., Mack, D. & Levine, A. J. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proceedings of the National Academy of Sciences of the United States of America, 1999, 96(12): 6745-6750.
[2] [Anonymous]. 2006, Journal of the Royal Statistical Society, Series B.
[3] Bickel, P. J. & Levina, E. Regularized estimation of large covariance matrices. Annals of Statistics, 2008, 36(1): 199-227.
[4] Bickel, P. J. & Levina, E. Some theory for Fisher's linear discriminant function, 'naive Bayes', and some alternatives when there are many more variables than observations. Bernoulli, 2004, 10(6): 989-1010.
[5] Cai, T. T., Zhang, C.-H. & Zhou, H. H. Optimal rates of convergence for covariance matrix estimation. Annals of Statistics, 2010, 38(4): 2118-2144.
[6] Dettling, M. BagBoosting for tumor classification with gene expression data. Bioinformatics, 2004, 20(18): 3583-3593.
[7] Efron, B., Hastie, T., Johnstone, I. & Tibshirani, R. Least angle regression - Rejoinder. Annals of Statistics, 2004, 32(2): 494-499.
[8] Efron, B. Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction. 2010. DOI: 10.1017/CBO9780511761362.
[9] Fan, J. & Fan, Y. High-dimensional classification using features annealed independence rules. Annals of Statistics, 2008, 36(6): 2605-2637.
[10] Fan, J. & Li, R. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 2001, 96(456): 1348-1360.