Fast and Informative Model Selection Using Learning Curve Cross-Validation

Cited by: 25

Authors:

Mohr, Felix [1]

van Rijn, Jan N. [2]

Affiliations:

[1] Univ La Sabana, Fac Engn, Chia 250001, Colombia

[2] Leiden Univ, Leiden Inst Adv Comp Sci LIACS, NL-2333 CA Leiden, Netherlands

Source:

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2023, Vol. 45, No. 8

Keywords:

Decision making; learning curves; model selection; supervised machine learning; algorithm

DOI:

10.1109/TPAMI.2023.3251957

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Common cross-validation (CV) methods like k-fold cross-validation or Monte Carlo cross-validation estimate the predictive performance of a learner by repeatedly training it on a large portion of the given data and testing it on the remaining data. These techniques have two major drawbacks. First, they can be unnecessarily slow on large datasets. Second, beyond an estimate of the final performance, they give almost no insight into the learning process of the validated algorithm. In this article, we present a new approach for validation based on learning curves (LCCV). Instead of creating train-test splits with a large portion of training data, LCCV iteratively increases the number of instances used for training. In the context of model selection, it discards models that are unlikely to become competitive. In a series of experiments on 75 datasets, we show that in over 90% of the cases using LCCV leads to the same performance as using 5/10-fold CV while substantially reducing the runtime (median runtime reductions of over 50%); the performance using LCCV never deviated from CV by more than 2.5%. We also compare it to a racing-based method and to successive halving, a multi-armed bandit method. Additionally, LCCV provides important insights, which for example allow assessing the benefits of acquiring more data.
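
The idea of growing the training set step by step and discarding a candidate early can be illustrated with a minimal Python sketch. The abstract does not state the pruning rule, so everything specific here is an assumption made for illustration: the function name `lccv_sketch`, the callback `error_at(n)` (train on `n` instances, return the held-out error), the list of training sizes `anchors`, and the rule itself, which projects the latest slope of the curve linearly to the full training size and treats that projection as an optimistic bound under the assumption that the learning curve is convex and decreasing.

```python
from typing import Callable, List, Optional, Tuple

Curve = List[Tuple[int, float]]


def lccv_sketch(
    error_at: Callable[[int], float],
    anchors: List[int],
    best_known_error: float,
) -> Tuple[Optional[float], Curve]:
    """Evaluate a learner at growing training sizes and stop early if hopeless.

    error_at(n) is a hypothetical callback: train on n instances and return
    the error rate on held-out data. anchors must be strictly increasing and
    end at the full training size. Returns (final error, observed curve);
    the final error is None if the candidate was discarded early.
    """
    curve: Curve = []
    full_size = anchors[-1]
    for n in anchors:
        curve.append((n, error_at(n)))
        if n == full_size or len(curve) < 2:
            continue
        (n0, e0), (n1, e1) = curve[-2], curve[-1]
        slope = (e1 - e0) / (n1 - n0)
        # Assumed rule: a convex, decreasing curve cannot fall faster than its
        # latest slope, so this linear projection is an optimistic bound.
        optimistic = e1 + min(slope, 0.0) * (full_size - n1)
        if optimistic > best_known_error:
            return None, curve  # unlikely to become competitive
    return curve[-1][1], curve


if __name__ == "__main__":
    sizes = [64, 128, 256, 512, 1024]
    # Two synthetic learning curves (made-up numbers, not from the article).
    for name, floor in [("candidate A", 0.1), ("candidate B", 0.3)]:
        final, seen = lccv_sketch(lambda n: floor + 8.0 / n, sizes, 0.15)
        print(name, "final error:", final, "sizes evaluated:", [n for n, _ in seen])
```

With these synthetic curves, the second candidate is dropped before it is ever trained on the full data, which is where the runtime saving comes from. The partial curve that is returned is also the kind of by-product that could be used to judge whether more data would help.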


Pages: 9669-9680

Page count: 12
