Concentration inequalities of the cross-validation estimator for empirical risk minimizer

Cited by: 2
Authors
Cornec, Matthieu [1, 2]
Affiliations
[1] Univ Nanterre, ModalX, Nanterre, France
[2] Cdiscount, 126 Quai de Bacalan, F-33000 Bordeaux, France
Keywords
Cross-validation; VC-dimension; model selection; regression; bounds
DOI
10.1080/02331888.2016.1261479
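Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract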
We derive concentration inequalities for the cross-validation estimate of the generalization error for empirical risk minimizers. In the general setting, we show that the worst-case error of this estimate is not much worse than that of the training error estimate (see Kearns M, Ron D. Algorithmic stability and sanity-check bounds for leave-one-out cross-validation. Neural Comput. 1999;11:1427-1453). General loss functions and classes of predictors with finite VC-dimension are considered. Our focus is on proving the consistency of the various cross-validation procedures. We point out the merits of each cross-validation procedure in terms of rates of convergence. An interesting consequence is that the size of the test sample is not required to grow to infinity for the cross-validation procedure to be consistent.
Pages: 43-60
Page count: 18