A Neyman-Pearson approach to statistical learning

Cited by: 87
Authors
Scott, C [1 ]
Nowak, R
Affiliations
[1] Rice Univ, Dept Stat, Houston, TX 77005 USA
[2] Univ Wisconsin, Dept Elect & Comp Engn, Madison, WI 53706 USA
Funding
U.S. National Science Foundation;
Keywords
generalization error bounds; Neyman-Pearson (NP) classification; statistical learning theory;
DOI
10.1109/TIT.2005.856955
Chinese Library Classification
TP [Automation & Computer Technology];
Discipline code
0812 ;
Abstract
The Neyman-Pearson (NP) approach to hypothesis testing is useful in situations where different types of error have different consequences or a priori probabilities are unknown. For any alpha > 0, the NP lemma specifies the most powerful test of size alpha, but assumes the distributions for each hypothesis are known or (in some cases) the likelihood ratio is monotonic in an unknown parameter. This paper investigates an extension of NP theory to situations in which one has no knowledge of the underlying distributions, except for a collection of independent and identically distributed (i.i.d.) training examples from each hypothesis. Building on a "fundamental lemma" of Cannon et al., we demonstrate that several concepts from statistical learning theory have counterparts in the NP context. Specifically, we consider constrained versions of empirical risk minimization (NP-ERM) and structural risk minimization (NP-SRM), and prove performance guarantees for both. General conditions are given under which NP-SRM leads to strong universal consistency. We also apply NP-SRM to (dyadic) decision trees to derive rates of convergence. Finally, we present explicit algorithms to implement NP-SRM for histograms and dyadic decision trees.
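The constrained empirical risk minimization (NP-ERM) idea summarized in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name `np_erm`, the relaxation parameter `slack`, and the threshold classifiers in the usage example are hypothetical choices made here. The core idea matches the abstract: among candidate classifiers whose empirical false-alarm rate (on the hypothesis-0 sample) satisfies a relaxed constraint, select the one minimizing the empirical miss rate (on the hypothesis-1 sample).

```python
def np_erm(classifiers, x0, x1, alpha, slack=0.0):
    """NP-ERM over a finite classifier set (illustrative sketch).

    classifiers: list of functions mapping a sample to a label in {0, 1}
    x0: i.i.d. training sample from hypothesis 0 (the "null")
    x1: i.i.d. training sample from hypothesis 1
    alpha: target false-alarm level; slack: constraint relaxation

    Returns the index of the classifier with the smallest empirical
    miss rate among those whose empirical false-alarm rate is at most
    alpha + slack, or None if no classifier meets the constraint.
    """
    best_idx, best_miss = None, float("inf")
    for i, f in enumerate(classifiers):
        false_alarm = sum(f(x) for x in x0) / len(x0)  # fraction of H0 sample labeled 1
        miss = sum(1 - f(x) for x in x1) / len(x1)     # fraction of H1 sample labeled 0
        if false_alarm <= alpha + slack and miss < best_miss:
            best_idx, best_miss = i, miss
    return best_idx

# Usage: simple 1-D threshold detectors f_t(x) = 1{x >= t}.
thresholds = [0, 1, 2, 3, 4, 5, 6]
classifiers = [lambda x, t=t: int(x >= t) for t in thresholds]
x0 = [0, 1, 2, 3]  # sample from hypothesis 0
x1 = [2, 3, 4, 5]  # sample from hypothesis 1
idx = np_erm(classifiers, x0, x1, alpha=0.25)
# thresholds[idx] is the selected detection threshold
```

Tightening `alpha` shrinks the feasible set and typically raises the miss rate, which is the trade-off the NP formulation makes explicit.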
Pages: 3806-3819
Page count: 14
References
38 entries in total
[1] Bartlett PL, Boucheron S, Lugosi G. Model selection and error estimation. Machine Learning, 2002, 48(1-3): 85-113.
[2] Blanchard G, Schäfer C, Rozenholc Y. Oracle bounds and exact algorithm for dyadic classification trees. Learning Theory, Proceedings, 2004, 3120: 378-392.
[3] BLANCHARD G, 2005, OPTIMAL DYADIC DECIS
[4] BOUCHERON S, 2004, THEORY CLASSIFICATIO
[5] Cannon A. 2002, Tech. Rep. LA-UR 02-2951.
[6] Casasent D, Chen XW. Radial basis function neural networks for nonlinear Fisher discrimination and Neyman-Pearson classification. Neural Networks, 2003, 16(5-6): 529-535.
[7] Cohen A, Dahmen W, Daubechies I, DeVore R. Tree approximation and optimal encoding. Applied and Computational Harmonic Analysis, 2001, 11(2): 192-226.
[8] DeVore RA. Acta Numerica, 1998, 7: 51. DOI: 10.1017/S0962492900002816.
[10] Devroye L. A Probabilistic Theory of Pattern Recognition. 1996.