Fast Calculation of Gaussian Process Multiple-Fold Cross-Validation Residuals and their Covariances

Cited by: 1
Authors
Ginsbourger, David [1]
Schaerer, Cedric [1]
Affiliations
[1] Univ Bern, Dept Math & Stat, Bern, Switzerland
Funding
Swiss National Science Foundation
Keywords
Cross-validation; Diagnostics; Gaussian process; Hyperparameter estimation; Universal kriging; Woodbury formula; Computer experiments; Strategies
DOI
10.1080/10618600.2024.2353633
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
We generalize fast Gaussian process leave-one-out formulas to multiple-fold cross-validation, highlighting in turn the covariance structure of cross-validation residuals in simple and universal kriging frameworks. We illustrate how resulting covariances affect model diagnostics. We further establish in the case of noiseless observations that correcting for covariances between residuals in cross-validation-based estimation of the scale parameter leads back to maximum likelihood estimation. Also, we highlight in broader settings how differences between pseudo-likelihood and likelihood methods boil down to accounting or not for residual covariances. The proposed fast calculation of cross-validation residuals is implemented and benchmarked against a naive implementation, all in R. Numerical experiments highlight the substantial speed-ups that our approach enables. However, as supported by a discussion on main drivers of computational costs and by a numerical benchmark, speed-ups steeply decline as the number of folds (say, all sharing the same size) decreases. An application to a contaminant localization test case illustrates that the way of grouping observations in folds may affect model assessment and parameter fitting compared to leave-one-out. Overall, our results enable fast multiple-fold cross-validation, have consequences in model diagnostics, and pave the way to future work on hyperparameter fitting as well as on goal-oriented fold design. Supplementary materials for this article are available online.
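The identity behind fast multiple-fold residuals can be illustrated with a minimal sketch. It assumes the simplest setting only: a zero-mean (simple kriging) model with a known covariance matrix K. In that case the residuals of a fold with index set f can be obtained from the precision matrix Q = K^{-1} as e_f = (Q_ff)^{-1} (Q y)_f, so a single inversion of K serves all folds, and the covariance of e_f is (Q_ff)^{-1}. The kernel, data, fold layout, and all function names below are illustrative assumptions; this is not the authors' R implementation and it does not cover universal kriging.

```python
import math

def exp_kernel(xs, length_scale=0.5):
    """Exponential covariance matrix (illustrative choice, well conditioned)."""
    return [[math.exp(-abs(a - b) / length_scale) for b in xs] for a in xs]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def sub(A, rows, cols):
    """Extract the submatrix of A with the given row and column indices."""
    return [[A[i][j] for j in cols] for i in rows]

def fast_fold_residuals(K, y, folds):
    """Residuals per fold from one inversion of K: e_f = Q_ff^{-1} (Q y)_f."""
    n = len(y)
    identity = [[float(i == j) for j in range(n)] for i in range(n)]
    Q = solve(K, identity)  # precision matrix, computed once for all folds
    Qy = [sum(Q[i][j] * y[j] for j in range(n)) for i in range(n)]
    out = []
    for fold in folds:
        e = solve(sub(Q, fold, fold), [[Qy[i]] for i in fold])
        out.append([row[0] for row in e])
    return out

def naive_fold_residuals(K, y, folds):
    """Reference version: refit on the training points for every fold."""
    n = len(y)
    out = []
    for fold in folds:
        train = [i for i in range(n) if i not in fold]
        alpha = solve(sub(K, train, train), [[y[i]] for i in train])
        pred = [sum(K[i][t] * a[0] for t, a in zip(train, alpha)) for i in fold]
        out.append([y[i] - p for i, p in zip(fold, pred)])
    return out

# Hypothetical toy data: six 1-D inputs split into three folds of size two.
xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
y = [0.1, 0.9, 0.4, -0.5, -0.8, 0.2]
folds = [[0, 3], [1, 4], [2, 5]]

K = exp_kernel(xs)
fast = fast_fold_residuals(K, y, folds)
naive = naive_fold_residuals(K, y, folds)
max_gap = max(abs(a - b) for f, g in zip(fast, naive) for a, b in zip(f, g))
print(max_gap)  # expected to be at floating-point rounding level
```

In this sketch the naive version solves one system of size n minus the fold size per fold, while the fast version inverts K once and then solves one small system per fold. As the fold size grows (fewer folds), those small systems stop being small, which is consistent with the abstract's remark that speed-ups decline as the number of folds decreases.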
Pages: 1 / 14
Page count: 14