Fast Calculation of Gaussian Process Multiple-Fold Cross-Validation Residuals and their Covariances

Cited by: 1
Authors
Ginsbourger, David [1 ]
Schaerer, Cedric [1 ]
Affiliation
[1] Univ Bern, Dept Math & Stat, Bern, Switzerland
Funding
Swiss National Science Foundation
Keywords
Cross-validation; Diagnostics; Gaussian process; Hyperparameter estimation; Universal kriging; Woodbury formula; Computer experiments; Strategies
DOI
10.1080/10618600.2024.2353633
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
We generalize fast Gaussian process leave-one-out formulas to multiple-fold cross-validation, highlighting in turn the covariance structure of cross-validation residuals in simple and universal kriging frameworks. We illustrate how resulting covariances affect model diagnostics. We further establish in the case of noiseless observations that correcting for covariances between residuals in cross-validation-based estimation of the scale parameter leads back to maximum likelihood estimation. Also, we highlight in broader settings how differences between pseudo-likelihood and likelihood methods boil down to accounting or not for residual covariances. The proposed fast calculation of cross-validation residuals is implemented and benchmarked against a naive implementation, all in R. Numerical experiments highlight the substantial speed-ups that our approach enables. However, as supported by a discussion on main drivers of computational costs and by a numerical benchmark, speed-ups steeply decline as the number of folds (say, all sharing the same size) decreases. An application to a contaminant localization test case illustrates that the way of grouping observations in folds may affect model assessment and parameter fitting compared to leave-one-out. Overall, our results enable fast multiple-fold cross-validation, have consequences in model diagnostics, and pave the way to future work on hyperparameter fitting as well as on goal-oriented fold design. Supplementary materials for this article are available online.
Pages: 1-14
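To make the abstract's central idea concrete, below is a minimal R sketch (not the authors' released code) of fast multiple-fold cross-validation residuals in the simple-kriging setting assumed here: zero prior mean, noiseless observations, and a fully known covariance matrix K. It contrasts a naive fold-by-fold recomputation with the shortcut that reuses a single inverse of K, and also shows the cross-covariance of fold residuals in this setting. All function and variable names (`fast_cv_residuals`, `naive_cv_residuals`, `sq_exp_kernel`, `cv_residual_cov`) are illustrative choices, not names taken from the paper or its supplementary material.

```r
## Synthetic data: a GP-like response observed at n = 60 one-dimensional inputs.
set.seed(1)
n <- 60
x <- sort(runif(n))
sq_exp_kernel <- function(a, b, sigma2 = 1, ell = 0.2) {
  sigma2 * exp(-0.5 * outer(a, b, "-")^2 / ell^2)
}
K <- sq_exp_kernel(x, x) + diag(1e-8, n)   # small jitter for numerical stability
y <- drop(t(chol(K)) %*% rnorm(n))          # one draw from N(0, K)

## q folds of (roughly) equal size, here q = 6 folds of 10 points each.
q <- 6
folds <- split(sample(n), rep(seq_len(q), length.out = n))

## Naive multiple-fold CV: re-solve a large linear system for every held-out fold.
naive_cv_residuals <- function(K, y, folds) {
  lapply(folds, function(I) {
    J <- setdiff(seq_along(y), I)
    y[I] - K[I, J] %*% solve(K[J, J], y[J])
  })
}

## Fast multiple-fold CV: one factorization of K, then small per-fold solves,
## using e_I = [(K^{-1})_{I,I}]^{-1} (K^{-1} y)_I, a multi-fold analogue of the
## classical leave-one-out shortcut (assumed simple-kriging, noiseless setting).
fast_cv_residuals <- function(K, y, folds) {
  Kinv  <- chol2inv(chol(K))
  alpha <- Kinv %*% y
  lapply(folds, function(I) solve(Kinv[I, I], alpha[I]))
}

## Cross-covariance of residuals for folds I and J in the same setting:
## Cov(e_I, e_J) = [(K^{-1})_{I,I}]^{-1} (K^{-1})_{I,J} [(K^{-1})_{J,J}]^{-1}.
cv_residual_cov <- function(K, folds, I_id, J_id) {
  Kinv <- chol2inv(chol(K))
  I <- folds[[I_id]]; J <- folds[[J_id]]
  solve(Kinv[I, I]) %*% Kinv[I, J] %*% solve(Kinv[J, J])
}

## The naive and fast residuals agree up to numerical error.
max(abs(unlist(naive_cv_residuals(K, y, folds)) -
        unlist(fast_cv_residuals(K, y, folds))))
```

Under these assumptions the fast route performs one O(n^3) factorization plus one small solve per fold, instead of one near-O(n^3) solve per fold, which is the source of the speed-ups (and of their decline as folds become few and large) discussed in the abstract.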