We consider the problem of performing interpretable classification in the high-dimensional setting, in which the number of features is very large and the number of observations is limited. This setting has been studied extensively in the chemometrics literature, and more recently has become commonplace in biological and medical applications. In this setting, a traditional approach involves performing feature selection before classification. We propose sparse discriminant analysis, a method for performing linear discriminant analysis with a sparseness criterion imposed such that classification and feature selection are performed simultaneously. Sparse discriminant analysis is based on the optimal scoring interpretation of linear discriminant analysis, and can be extended to perform sparse discrimination via mixtures of Gaussians if boundaries between classes are nonlinear or if subgroups are present within each class. Our proposal also provides low-dimensional views of the discriminative directions.
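As a rough illustration of how a sparseness criterion can be attached to the optimal scoring form of linear discriminant analysis, the k-th discriminant direction could be written as the penalized problem below. The notation is an assumption made for this sketch and does not come from the text above: X is the n × p data matrix, Y is the n × K matrix of class indicators, θ_k is a vector of class scores, β_k is the coefficient vector for the k-th direction, and λ ≥ 0 is a tuning parameter. The exact penalty used by the method may differ, for instance by including an additional ridge-type term.

```latex
\begin{aligned}
\underset{\beta_k,\ \theta_k}{\text{minimize}}\quad
  & \lVert Y\theta_k - X\beta_k \rVert_2^2 + \lambda \lVert \beta_k \rVert_1 \\
\text{subject to}\quad
  & \tfrac{1}{n}\,\theta_k^{\top} Y^{\top} Y\, \theta_k = 1,
  \qquad \theta_k^{\top} Y^{\top} Y\, \theta_l = 0 \quad \text{for all } l < k .
\end{aligned}
```

Under these assumptions, setting λ = 0 gives the unpenalized optimal scoring problem, while larger values of λ drive more entries of β_k exactly to zero, so that features with zero coefficients in every direction drop out of the classifier. Projecting the data onto the vectors Xβ_k gives the low-dimensional views of the discriminative directions.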