Sequential Imputation of Missing Spatio-Temporal Precipitation Data Using Random Forests

Cited by: 30
Authors
Mital, Utkarsh [1]
Dwivedi, Dipankar [1]
Brown, James B. [2]
Faybishenko, Boris [1]
Painter, Scott L. [3, 4]
Steefel, Carl I. [1]
Affiliations
[1] Lawrence Berkeley Natl Lab, Energy Geosci Div, Berkeley, CA 94720 USA
[2] Lawrence Berkeley Natl Lab, Environm Genom & Syst Biol, Berkeley, CA USA
[3] Oak Ridge Natl Lab, Climate Change Sci Inst, Oak Ridge, TN USA
[4] Oak Ridge Natl Lab, Div Environm Sci, POB 2008, Oak Ridge, TN 37831 USA
Source
FRONTIERS IN WATER | 2020 / Vol. 2
Keywords
precipitation; hydrology and water; imputation; sequential imputation; machine learning; Random Forest; SERIES; NITROGEN; VALUES; TREES;
DOI
10.3389/frwa.2020.00020
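Chinese Library Classification (CLC) number
TV21 [Water resources survey and water conservancy planning];
Discipline classification code
081501;
Abstract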
Meteorological records, including precipitation, commonly have missing values. Accurate imputation of missing precipitation values is challenging, however, because precipitation exhibits a high degree of spatial and temporal variability. Data-driven spatial interpolation of meteorological records is an increasingly popular approach in which missing values at a target station are imputed using synchronous data from reference stations. The success of spatial interpolation depends on whether precipitation records at the target station are strongly correlated with precipitation records at reference stations. However, the need for reference stations to have complete datasets implies that stations with incomplete records, even though strongly correlated with the target station, are excluded. To address this limitation, we develop a new sequential imputation algorithm for imputing missing values in spatio-temporal daily precipitation records. We demonstrate the benefits of sequential imputation by incorporating it within a spatial interpolation based on a Random Forest technique. Results show that for reliable imputation, having a few strongly correlated references is more effective than having a larger number of weakly correlated references. Further, we observe that sequential imputation becomes more beneficial as the number of stations with incomplete records increases. Overall, we present a new approach for imputing missing precipitation data which may also apply to other meteorological variables.
Pages: 15
Related papers
50 in total
  • [1] Missing data imputation for traffic flow speed using spatio-temporal cokriging
    Bae, Bumjoon
    Kim, Hyun
    Lim, Hyeonsup
    Liu, Yuandong
    Han, Lee D.
    Freeze, Phillip B.
    TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2018, 88 : 124 - 139
  • [3] Imputation of missing data from offshore wind farms using spatio-temporal correlation and feature correlation
    Sun, Chuan
    Chen, Yueyi
    Cheng, Cheng
    ENERGY, 2021, 229
  • [4] Hyperparameter Tuning to Optimize Implementations of Denoising Autoencoders for Imputation of Missing Spatio-temporal Data
    Siddiqi, Muhammad Danial
    Jiang, Boyuan
    Asadi, Reza
    Regan, Amelia
    12TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT) / THE 4TH INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40) / AFFILIATED WORKSHOPS, 2021, 184 : 107 - 114
  • [5] Missing data imputation in tunnel monitoring with a spatio-temporal correlation fused machine learning model
    Tan, Xuyan
    Chen, Weizhong
    Tan, Xianjun
    Fan, Chengkai
    Mao, Yuhao
    Cheng, Ke
    Du, Bowen
    JOURNAL OF CIVIL STRUCTURAL HEALTH MONITORING, 2024,
  • [6] An Enhanced Imputation Approach for Spatio-Temporal Clinical Data
    Yin, Yilin
    Chou, Chun-An
    2022 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING (CASE), 2022, : 813 - 818
  • [7] Data imputation in IoT using Spatio-Temporal Variational Auto-Encoder
    Zhang, Shuo
    Chen, Jinyi
    Chen, Jiayuan
    Chen, Xiaofei
    Huang, Hejiao
    NEUROCOMPUTING, 2023, 529 : 23 - 32
  • [8] Spatio-temporal Outlier Detection in Precipitation Data
    Wu, Elizabeth
    Liu, Wei
    Chawla, Sanjay
    KNOWLEDGE DISCOVERY FROM SENSOR DATA, 2010, 5840 : 115 - 133
  • [9] IMD-MP: Imputation of Missing Data in IoT Based on Matrix Profile and Spatio-temporal Correlations
    Lakshmi, G. V. Vidya
    Gopikrishnan, S.
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2024, 30 (06) : 814 - 846
  • [10] Investigations into Missing Values Imputation Using Random Forests for Semi-supervised Data
    Ishioka, Tsunenori
    16TH INTERNATIONAL CONFERENCE ON INFORMATION INTEGRATION AND WEB-BASED APPLICATIONS & SERVICES (IIWAS 2014), 2014, : 296 - 301