Infilling of high-dimensional rainfall networks through multiple imputation by chained equations

Cited by: 1
Authors
O'Sullivan, Brian [1]
Kelly, Gabrielle [1]
Affiliations
[1] Univ Coll Dublin, Sch Math & Stat, Dublin, Ireland
Funding
Science Foundation Ireland;
Keywords
elastic-net; gap-filling; infilling effect; Ireland; MICE; multiple imputation; multiple imputation by chained equations; rainfall; MISSING DATA; INTERPOLATION; REGULARIZATION; ALGORITHM; MODEL;
DOI
10.1002/joc.8513
Chinese Library Classification number
P4 [Atmospheric sciences (meteorology)];
Discipline classification code
0706 ; 070601 ;
Abstract
Accurate precipitation records are essential for monitoring the climate and studying its changes. However, analysis is typically limited by the large quantity of missing values. This article proposes two new imputation techniques for incomplete monthly data collected from a rainfall monitoring network in the Republic of Ireland from 1981 to 2010. The data are high-dimensional, with over 1100 rain gauge stations, and the methods presented are designed to handle such cases. These are Elastic-Net Chained Equations (ENCE) and Multiple Imputation by Chained Equations with Direct use of Regularized Regression by elastic-net (MICE DURR). Both methods predict missing data through a series of regularized regression models; MICE DURR differs from ENCE by also using multiple imputation. Across evaluations at different levels of missingness, ENCE and MICE DURR consistently outperformed existing imputation methods in terms of RMSE and $R^2$. Moreover, they provided the best results both seasonally and in predicting extreme values. RMSEs of 14.16 and 14.17 mm per month were reported for ENCE and MICE DURR, respectively, when stations that were at least 50% complete during the study period were included. For increasingly sparse data, the imputation accuracy of MICE DURR surpasses that of ENCE, demonstrating the efficacy of multiple imputation when handling a substantial amount of missing data. Validation metrics indicate that these methods compare very favourably with existing methods in the literature, such as those that use random forests or multiple linear regression. Ireland's rainfall monitoring network provides excellent spatial coverage, although the temporal completeness of many stations during the monitoring period (1981-2010) is limited.
The monthly precipitation totals are considered at increasing station completeness cutoffs, where an increase in sparsity and dimensionality is observed. The findings indicate that as less complete stations are included in the imputation process, the inclusion of multiple imputation principles is essential.
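The chained-equations idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`elastic_net_fit`, `chained_impute`, `multiple_impute`), the penalty settings, the mean-fill initialisation, and the bootstrap-plus-residual-noise step used to produce several imputations are all illustrative assumptions, and the published ENCE and MICE DURR algorithms may differ in any of them. Each station (column) with gaps is regressed in turn on all other stations with an elastic-net penalty, and the fitted model fills that station's missing months.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator used by the L1 part of the elastic-net penalty."""
    return z - g if z > g else z + g if z < -g else 0.0

def elastic_net_fit(X, y, alpha=0.01, l1_ratio=0.5, n_iter=200):
    """Coordinate descent on centred data for
    (1/2n)*||y - b0 - X b||^2 + alpha*(l1_ratio*||b||_1 + 0.5*(1-l1_ratio)*||b||_2^2)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    resid = [v - y_mean for v in y]
    col_sq = [sum(row[j] ** 2 for row in Xc) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant predictor carries no information
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, alpha * l1_ratio) / (col_sq[j] + alpha * (1 - l1_ratio))
            delta = new - beta[j]
            if delta:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                beta[j] = new
    b0 = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return b0, beta

def predict(b0, beta, row):
    return b0 + sum(b * v for b, v in zip(beta, row))

def chained_impute(data, n_sweeps=5, alpha=0.01, l1_ratio=0.5, rng=None):
    """data: rows = months, columns = stations, None = missing.
    rng=None gives one deterministic imputation. Passing a random.Random
    (an assumption of this sketch) fits each model on a bootstrap resample
    of the observed rows and adds Gaussian residual noise to the predictions."""
    n, p = len(data), len(data[0])
    missing = [[v is None for v in row] for row in data]
    filled = [row[:] for row in data]
    for j in range(p):  # initialise gaps with the mean of the observed values
        obs = [data[i][j] for i in range(n) if not missing[i][j]]
        mean_j = sum(obs) / len(obs)
        for i in range(n):
            if missing[i][j]:
                filled[i][j] = mean_j
    for _ in range(n_sweeps):
        for j in range(p):  # one regression per incomplete station
            rows_mis = [i for i in range(n) if missing[i][j]]
            if not rows_mis:
                continue
            rows_obs = [i for i in range(n) if not missing[i][j]]
            if rng is not None:
                rows_obs = [rng.choice(rows_obs) for _ in rows_obs]
            others = [k for k in range(p) if k != j]
            X = [[filled[i][k] for k in others] for i in rows_obs]
            y = [filled[i][j] for i in rows_obs]
            b0, beta = elastic_net_fit(X, y, alpha, l1_ratio)
            sd = 0.0
            if rng is not None:
                sq = [(yi - predict(b0, beta, xi)) ** 2 for xi, yi in zip(X, y)]
                sd = (sum(sq) / len(sq)) ** 0.5
            for i in rows_mis:
                value = predict(b0, beta, [filled[i][k] for k in others])
                filled[i][j] = value + (rng.gauss(0.0, sd) if rng is not None else 0.0)
    return filled

def multiple_impute(data, m=5, seed=0, **kwargs):
    """Return m completed copies of data, each from a separate stochastic run."""
    rng = random.Random(seed)
    return [chained_impute(data, rng=rng, **kwargs) for _ in range(m)]
```

Calling `chained_impute(data)` returns a single completed table, while `multiple_impute(data, m=5)` returns several completed tables whose spread reflects imputation uncertainty. The sketch does not constrain imputed values to be non-negative, which a real rainfall application would need to address.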
Pages: 3075-3091
Number of pages: 17