Fast Calculation of Gaussian Process Multiple-Fold Cross-Validation Residuals and their Covariances

Cited by: 1
Authors
Ginsbourger, David [1]
Schaerer, Cedric [1]
Affiliations
[1] Univ Bern, Dept Math & Stat, Bern, Switzerland
Funding
Swiss National Science Foundation
Keywords
Cross-validation; Diagnostics; Gaussian process; Hyperparameter estimation; Universal kriging; Woodbury formula; Computer experiments; Strategies
DOI
10.1080/10618600.2024.2353633
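Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract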
We generalize fast Gaussian process leave-one-out formulas to multiple-fold cross-validation, highlighting in turn the covariance structure of cross-validation residuals in simple and universal kriging frameworks. We illustrate how resulting covariances affect model diagnostics. We further establish in the case of noiseless observations that correcting for covariances between residuals in cross-validation-based estimation of the scale parameter leads back to maximum likelihood estimation. Also, we highlight in broader settings how differences between pseudo-likelihood and likelihood methods boil down to accounting or not for residual covariances. The proposed fast calculation of cross-validation residuals is implemented and benchmarked against a naive implementation, all in R. Numerical experiments highlight the substantial speed-ups that our approach enables. However, as supported by a discussion on main drivers of computational costs and by a numerical benchmark, speed-ups steeply decline as the number of folds (say, all sharing the same size) decreases. An application to a contaminant localization test case illustrates that the way of grouping observations in folds may affect model assessment and parameter fitting compared to leave-one-out. Overall, our results enable fast multiple-fold cross-validation, have consequences in model diagnostics, and pave the way to future work on hyperparameter fitting as well as on goal-oriented fold design. Supplementary materials for this article are available online.
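The kind of shortcut the abstract refers to can be illustrated with a minimal sketch. It assumes the simplest setting only: a zero-mean Gaussian process (simple kriging) with a known, invertible covariance matrix K. In that setting a standard block-inverse identity gives the residuals of fold i as e_i = (Q_ii)^{-1} (Q y)_i with Q = K^{-1}, and the cross-covariance between residuals of folds i and j as (Q_ii)^{-1} Q_ij (Q_jj)^{-1}. The function names, the exponential kernel, and the toy data are illustrative assumptions; this is not the article's R implementation and it does not cover universal kriging.

```python
import numpy as np


def exp_kernel(x, lengthscale=0.3):
    """Exponential covariance on 1-D inputs (illustrative kernel choice)."""
    return np.exp(-np.abs(x[:, None] - x[None, :]) / lengthscale)


def naive_cv_residuals(K, y, folds):
    """Baseline: one linear solve on the training block per fold."""
    n = len(y)
    residuals = []
    for idx in folds:
        train = np.setdiff1d(np.arange(n), idx)
        weights = np.linalg.solve(K[np.ix_(train, train)], y[train])
        residuals.append(y[idx] - K[np.ix_(idx, train)] @ weights)
    return residuals


def fast_cv_residuals(K, y, folds):
    """Block identity e_i = (Q_ii)^{-1} (Q y)_i, with Q = K^{-1} computed once."""
    Q = np.linalg.inv(K)
    Qy = Q @ y
    return [np.linalg.solve(Q[np.ix_(idx, idx)], Qy[idx]) for idx in folds]


def residual_cross_cov(Q, fold_i, fold_j):
    """Cov(e_i, e_j) = (Q_ii)^{-1} Q_ij (Q_jj)^{-1} under the zero-mean GP model."""
    Q_ii = Q[np.ix_(fold_i, fold_i)]
    Q_ij = Q[np.ix_(fold_i, fold_j)]
    Q_jj = Q[np.ix_(fold_j, fold_j)]
    # Solve from the left with Q_ii, then from the right with Q_jj.
    left = np.linalg.solve(Q_ii, Q_ij)
    return np.linalg.solve(Q_jj.T, left.T).T


# Toy data (hypothetical): 12 design points split into 4 contiguous folds.
x = np.linspace(0.0, 1.0, 12)
y = np.sin(2.0 * np.pi * x)
K = exp_kernel(x)
folds = np.array_split(np.arange(len(x)), 4)

fast = fast_cv_residuals(K, y, folds)
naive = naive_cv_residuals(K, y, folds)
max_gap = max(np.max(np.abs(f - nv)) for f, nv in zip(fast, naive))
print("largest gap between fast and naive residuals:", max_gap)
```

The fast version inverts K once and then solves only fold-sized systems, whereas the baseline solves a training-sized system for every fold. This is consistent with the abstract's remark that speed-ups shrink as the number of folds decreases, since the fold-sized blocks then become large.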
Pages: 1-14
Number of pages: 14