Empirical Comparison Between Cross-Validation and Mutation-Validation in Model Selection

Cited by: 0
Authors
Yu, Jinyang [1]
Hamdan, Sami [1, 2]
Sasse, Leonard [1, 2, 6]
Morrison, Abigail [3, 4, 5]
Patil, Kaustubh R. [1, 2]
Affiliations
[1] Res Ctr Julich, Inst Neurosci & Med, Brain & Behav INM 7, Julich, Germany
[2] Heinrich Heine Univ Dusseldorf, Med Fac, Inst Syst Neurosci, Dusseldorf, Germany
[3] Res Ctr Julich, Inst Neurosci & Med INM 6, Julich, Germany
[4] Res Ctr Julich, Inst Adv Simulat IAS 6, Julich, Germany
[5] Rhein Westfal TH Aachen, Dept Comp Sci Software Engn 3, Aachen, Germany
[6] Max Planck Sch Cognit, Stephanstr 1a, Leipzig, Germany
Source
ADVANCES IN INTELLIGENT DATA ANALYSIS XXII, PT II, IDA 2024 | 2024 / Vol. 14642
Keywords
model selection; mutation validation; cross-validation;
DOI
10.1007/978-3-031-58553-1_5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Mutation validation (MV) is a recently proposed approach for model selection, garnering significant interest due to its unique characteristics and potential benefits compared to the widely used cross-validation (CV) method. In this study, we empirically compared MV and k-fold CV using benchmark and real-world datasets. By employing Bayesian tests, we compared generalization estimates yielding three posterior probabilities: practical equivalence, CV superiority, and MV superiority. We also evaluated the differences in the capacity of the selected models and computational efficiency. We found that both MV and CV select models with practically equivalent generalization performance across various machine learning algorithms and the majority of benchmark datasets. MV exhibited advantages in terms of selecting simpler models and lower computational costs. However, in some cases MV selected overly simplistic models leading to underfitting and showed instability in hyperparameter selection. These limitations of MV became more evident in the evaluation of a real-world neuroscientific task of predicting sex at birth using brain functional connectivity.
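The three posterior probabilities mentioned in the abstract can be illustrated with a simplified sketch. It is not the authors' procedure: it assumes a plain Student-t posterior over the mean score difference (MV minus CV) under a noninformative prior, without the correlation correction or hierarchical modelling used in published Bayesian comparison tests. The function name `rope_probabilities`, the region of practical equivalence of ±0.01, and the score differences are all hypothetical values chosen for illustration.

```python
import math
import random
import statistics


def rope_probabilities(diffs, rope=0.01, n_samples=20000, seed=0):
    """Estimate (P(CV better), P(practically equivalent), P(MV better)).

    diffs: per-dataset differences score_MV - score_CV (hypothetical input).
    Assumption: the posterior of the mean difference is a Student-t with
    n - 1 degrees of freedom, centred on the sample mean. This is a
    simplification, not the test used in the paper.
    """
    rng = random.Random(seed)
    n = len(diffs)
    mean = statistics.fmean(diffs)
    scale = statistics.stdev(diffs) / math.sqrt(n)
    nu = n - 1

    counts = [0, 0, 0]  # CV better, equivalent, MV better
    for _ in range(n_samples):
        # Draw a Student-t variate as Z / sqrt(chi2 / nu).
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(nu / 2.0, 2.0)
        mu = mean + scale * z / math.sqrt(chi2 / nu)
        if mu < -rope:
            counts[0] += 1
        elif mu > rope:
            counts[2] += 1
        else:
            counts[1] += 1
    return tuple(c / n_samples for c in counts)


# Hypothetical accuracy differences (MV - CV) on ten datasets.
diffs = [0.004, -0.002, 0.001, 0.000, -0.003, 0.002, 0.001, -0.001, 0.003, -0.002]
p_cv, p_eq, p_mv = rope_probabilities(diffs)
print(f"P(CV better)={p_cv:.3f}  P(equivalent)={p_eq:.3f}  P(MV better)={p_mv:.3f}")
```

When most of the posterior mass falls inside the equivalence region, the two selection methods would be reported as practically equivalent; mass concentrated on either side would indicate superiority of one method.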
Pages: 56-67
Number of pages: 12