Cross-Validation With Confidence

Cited by: 47

Authors:

Lei, Jing [1]

Affiliations:

[1] Carnegie Mellon Univ, Dept Stat & Data Sci, 5000 Forbes Ave, Pittsburgh, PA 15213 USA

Source:

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION | 2020, Vol. 115, Issue 532

Keywords:

Cross-validation; Hypothesis testing; Model selection; Overfitting; Tuning parameter selection; Lasso

DOI:

10.1080/01621459.2019.1672556

Chinese Library Classification (CLC):

O21 [Probability theory and mathematical statistics]; C8 [Statistics];

Discipline classification codes:

020208 ; 070103 ; 0714 ;

Abstract:

Cross-validation is one of the most popular model and tuning parameter selection methods in statistics and machine learning. Despite its wide applicability, traditional cross-validation methods tend to overfit, due to the ignorance of the uncertainty in the testing sample. We develop a novel statistically principled inference tool based on cross-validation that takes into account the uncertainty in the testing sample. This method outputs a set of highly competitive candidate models containing the optimal one with guaranteed probability. As a consequence, our method can achieve consistent variable selection in a classical linear regression setting, for which existing cross-validation methods require unconventional split ratios. When used for tuning parameter selection, the method can provide an alternative trade-off between prediction accuracy and model interpretability than existing variants of cross-validation. We demonstrate the performance of the proposed method in several simulated and real data examples. Supplemental materials for this article can be found online.


Pages: 1978-1997

Number of pages: 20

50 items in total

[1] Multiple predicting K-fold cross-validation for model selection
Jung, Yoonsuh
JOURNAL OF NONPARAMETRIC STATISTICS, 2018, 30 (01) : 197 - 215
[2] Targeted cross-validation
Zhang, Jiawei
Ding, Jie
Yang, Yuhong
BERNOULLI, 2023, 29 (01) : 377 - 402
[3] On cross-validation for sparse reduced rank regression
She, Yiyuan
Hoang Tran
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2019, 81 (01) : 145 - 161
[4] The uncertainty principle of cross-validation
Last, Mark
2006 IEEE International Conference on Granular Computing, 2006, : 275 - 280
[5] Profile electoral college cross-validation
Zhan, Zishu
Yang, Yuhong
INFORMATION SCIENCES, 2022, 586 : 24 - 40
[6] Granularity selection for cross-validation of SVM
Liu, Yong
Liao, Shizhong
INFORMATION SCIENCES, 2017, 378 : 475 - 483
[7] Estimation Stability With Cross-Validation (ESCV)
Lim, Chinghway
Yu, Bin
JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2016, 25 (02) : 464 - 492
[8] Linear model selection by cross-validation
Rao, CR
Wu, Y
JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2005, 128 (01) : 231 - 240
[9] On Cross-Validation for MLP Model Evaluation
Karkkainen, Tommi
STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, 2014, 8621 : 291 - 300
[10] Fast Cross-Validation
Liu, Yong
Lin, Hailun
Ding, Lizhong
Wang, Weiping
Liao, Shizhong
PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 2497 - 2503
