Model selection by bootstrap penalization for classification

Cited by: 3
Author
Magalie Fromont
Affiliation
[1] Université Rennes II, Laboratoire de Statistique, U.F.R. de Sciences Sociales, Département MASS
Source
Machine Learning | 2007 / Vol. 66
Keywords
Model selection; Classification; Bootstrap penalty; Exponential inequality; Oracle inequality; Minimax risk;
DOI
Not available
Abstract
We consider the binary classification problem. Given an i.i.d. sample drawn from the distribution of an X × {0,1}-valued random pair, we propose to estimate the so-called Bayes classifier by minimizing the sum of the empirical classification error and a penalty term based on Efron’s or i.i.d. weighted bootstrap samples of the data. We obtain exponential inequalities for such bootstrap-type penalties, which allow us to derive non-asymptotic properties for the corresponding estimators. In particular, we prove that these estimators achieve the global minimax risk over sets of functions built from Vapnik-Chervonenkis classes. These results generalize those of Koltchinskii (2001) and Bartlett et al. (2002) for Rademacher penalties, which can thus be seen as special cases of bootstrap-type penalties. To illustrate this, we carry out an experimental study in which we compare the different methods on an intervals model selection problem.
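The selection rule described in the abstract (empirical classification error plus a bootstrap-based penalty) can be sketched as follows. This is an illustrative sketch only, not the paper's implementation. The model family (one-dimensional threshold classifiers on nested dyadic grids), the exact penalty form (the average over Efron bootstrap draws of the supremum of the gap between the empirical and the bootstrap-weighted error), and all function names are assumptions made for this example.

```python
import numpy as np

def loss_matrix(X, y, thresholds):
    """0-1 losses of each threshold classifier (both orientations) on each point."""
    preds = (X[None, :] > thresholds[:, None]).astype(int)
    preds = np.vstack([preds, 1 - preds])       # add the flipped classifiers
    return (preds != y[None, :]).astype(float)  # shape (n_classifiers, n)

def bootstrap_penalty(L, W):
    """Assumed penalty form: mean over bootstrap draws of
    sup_f (1/n) * sum_i (1 - W_i) * loss_i(f)."""
    n = L.shape[1]
    sup_dev = ((1.0 - W) @ L.T / n).max(axis=1)  # one supremum per bootstrap draw
    return float(sup_dev.mean())

def select_model(X, y, models, n_boot=200, seed=0):
    """Return the index of the model minimizing empirical error + penalty,
    together with the penalized score of every model."""
    rng = np.random.default_rng(seed)
    n = len(y)
    # Efron bootstrap weights: multinomial counts, shared across all models.
    W = rng.multinomial(n, np.full(n, 1.0 / n), size=n_boot)
    scores = []
    for thresholds in models:
        L = loss_matrix(X, y, thresholds)
        emp_risk = L.mean(axis=1).min()          # empirical risk minimizer in the model
        scores.append(float(emp_risk + bootstrap_penalty(L, W)))
    return int(np.argmin(scores)), scores

# Hypothetical usage on synthetic data with 10% label noise.
rng = np.random.default_rng(1)
X = rng.uniform(size=200)
y = (X > 0.5).astype(int)
y = np.where(rng.uniform(size=200) < 0.1, 1 - y, y)
models = [np.linspace(0, 1, 2**k + 1)[1:-1] for k in range(1, 6)]
best, scores = select_model(X, y, models)
```

Replacing the centered weights `1 - W` with independent random signs would give a Rademacher-type penalty, the special case the abstract attributes to Koltchinskii (2001) and Bartlett et al. (2002).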
Pages: 165-207
Page count: 42