Some theory for Fisher's linear discriminant function, 'naive Bayes', and some alternatives when there are many more variables than observations

Cited by: 381
Authors
Bickel, PJ [1]
Levina, E [2]
Affiliations
[1] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
[2] Univ Michigan, Dept Stat, Ann Arbor, MI 48109 USA
Keywords
Fisher's linear discriminant; Gaussian coloured noise; minimax regret; naive Bayes
DOI
10.3150/bj/1106314847
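Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract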
We show that the 'naive Bayes' classifier, which assumes independent covariates, greatly outperforms the Fisher linear discriminant rule under broad conditions when the number of variables grows faster than the number of observations, in the classical problem of discriminating between two normal populations. We also introduce a class of rules spanning the range between independence and arbitrary dependence. These rules are shown to achieve Bayes consistency for the Gaussian 'coloured noise' model and to adapt to a spectrum of convergence rates, which we conjecture to be minimax.
Pages: 989-1010
Number of pages: 22