Empirical Comparison Between Cross-Validation and Mutation-Validation in Model Selection

Cited by: 0

Authors:

Yu, Jinyang ^{[1]}

Hamdan, Sami ^{[1,2]}

Sasse, Leonard ^{[1,2,6]}

Morrison, Abigail ^{[3,4,5]}

Patil, Kaustubh R. ^{[1,2]}

Affiliations:

[1] Res Ctr Julich, Inst Neurosci & Med, Brain & Behav INM 7, Julich, Germany

[2] Heinrich Heine Univ Dusseldorf, Med Fac, Inst Syst Neurosci, Dusseldorf, Germany

[3] Res Ctr Julich, Inst Neuroscience & Med INM 6, Julich, Germany

[4] Res Ctr Julich, Inst Adv Simulat IAS 6, Julich, Germany

[5] Rhein Westfal TH Aachen, Dept Comp Sci Software Engn 3, Aachen, Germany

[6] Max Planck Sch Cognit, Stephanstr 1a, Leipzig, Germany

Source:

ADVANCES IN INTELLIGENT DATA ANALYSIS XXII, PT II, IDA 2024 | 2024 / Vol. 14642

Keywords:

model selection; mutation validation; cross-validation

DOI:

10.1007/978-3-031-58553-1_5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Mutation validation (MV) is a recently proposed approach for model selection, garnering significant interest due to its unique characteristics and potential benefits compared to the widely used cross-validation (CV) method. In this study, we empirically compared MV and k-fold CV using benchmark and real-world datasets. By employing Bayesian tests, we compared generalization estimates yielding three posterior probabilities: practical equivalence, CV superiority, and MV superiority. We also evaluated the differences in the capacity of the selected models and computational efficiency. We found that both MV and CV select models with practically equivalent generalization performance across various machine learning algorithms and the majority of benchmark datasets. MV exhibited advantages in terms of selecting simpler models and lower computational costs. However, in some cases MV selected overly simplistic models leading to underfitting and showed instability in hyperparameter selection. These limitations of MV became more evident in the evaluation of a real-world neuroscientific task of predicting sex at birth using brain functional connectivity.
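The three posterior probabilities named in the abstract can be illustrated with a region of practical equivalence (ROPE), a common device in Bayesian comparisons of classifiers. The sketch below is only an illustration under stated assumptions, not the authors' procedure: the function name `rope_probabilities`, the ROPE half-width of 0.01, and the hand-written posterior draws are all hypothetical, and the draws are taken to be posterior samples of the score difference (MV minus CV).

```python
from typing import Iterable, Tuple


def rope_probabilities(diff_draws: Iterable[float],
                       rope: float = 0.01) -> Tuple[float, float, float]:
    """Summarise posterior draws of a score difference (MV minus CV).

    Returns (p_equivalent, p_cv_better, p_mv_better), i.e. the share of
    draws inside the ROPE [-rope, rope], below it, and above it.
    The ROPE half-width is an assumed value chosen for illustration.
    """
    draws = list(diff_draws)
    if not draws:
        raise ValueError("need at least one posterior draw")
    n = len(draws)
    p_cv_better = sum(d < -rope for d in draws) / n
    p_mv_better = sum(d > rope for d in draws) / n
    p_equivalent = sum(-rope <= d <= rope for d in draws) / n
    return p_equivalent, p_cv_better, p_mv_better


# Hypothetical posterior draws; real draws would come from a Bayesian model
# fitted to the per-dataset generalization estimates.
draws = [-0.03, -0.005, 0.0, 0.002, 0.004, 0.008, 0.02, -0.001]
p_equiv, p_cv, p_mv = rope_probabilities(draws)
print(f"equivalent={p_equiv:.3f}  CV better={p_cv:.3f}  MV better={p_mv:.3f}")
```

The three values sum to one, so a large first value corresponds to the "practically equivalent" outcome the abstract reports for most benchmark datasets.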


Pages: 56-67

Page count: 12
