A new approach for processing climate missing databases applied to daily rainfall data in Soummam watershed, Algeria

Cited by: 40
Authors
Aieb, Amir [1, 2]
Madani, Khodir [1 ]
Scarpa, Marco [3 ]
Bonaccorso, Brunella [3 ]
Lefsih, Khalef [1 ]
Affiliations
[1] Univ Bejaia, L3BS, Bejaia 06000, Algeria
[2] Abderrahmane Mira Univ, Fac Exact Sci, Dept Comp Sci, Bejaia 06000, Algeria
[3] Univ Messina, Dept Engn, Messina, Italy
Keywords
Atmospheric science; Environmental science; Hydrology; TIME-SERIES; IMPUTATION; SATELLITE; GAPS; PCA;
DOI
10.1016/j.heliyon.2019.e01247
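Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification code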
07 ; 0710 ; 09 ;
Abstract
Missing data are a very frequent problem in climatology; they affect the quality of results in hydrological studies and in water resources management. This paper proposes a new imputation algorithm based on the optimization of several regression methods: hot deck, k-nearest-neighbors imputation, weighted k-nearest-neighbors imputation, multiple imputation, linear regression, and the simple average method. The choice of these methods was justified by qualitative and quantitative statistical tests. However, the reliability of the results depends mainly on the percentage of missing data, the choice of neighboring stations, and the missingness mechanism, which should be missing at random. The study found that most stations in the Soummam watershed are not well correlated, either because of the large loss of rainfall data or because of the geology of the watershed, which links station position to rainfall variability. For this reason, principal component analysis was applied to a set of stations; it showed a positive effect of altitude, latitude, and longitude on the correlation index between the selected stations. A graphical analysis of the normal distribution of the RMSE values, obtained by applying the proposed technique to several random missingness cases (4%, 8%, 12%, and 16%), confirmed the validity and performance of the approach.
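The evaluation idea in the abstract (mask values at random, impute them, score with RMSE) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it shows only one of the listed ingredients, an inverse-distance weighted k-nearest-neighbors fill, on synthetic data. The function names (`row_distance`, `wknn_impute`, `mask_at_random`, `rmse`), the donor-day distance, and the synthetic station series are all assumptions made for illustration.

```python
import math
import random


def row_distance(a, b):
    """RMS difference over stations observed in both days (None = missing)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))


def wknn_impute(data, k=3, eps=1e-6):
    """Fill each missing cell with an inverse-distance weighted mean of the
    k most similar days that have that station observed (illustrative only)."""
    filled = [row[:] for row in data]
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = []
            for m, other in enumerate(data):
                if m != i and other[j] is not None:
                    d = row_distance(row, other)
                    if math.isfinite(d):
                        donors.append((d, other[j]))
            donors.sort(key=lambda t: t[0])
            nearest = donors[:k]
            if nearest:  # cells with no usable donor stay missing
                weights = [1.0 / (d + eps) for d, _ in nearest]
                total = sum(w * v for w, (_, v) in zip(weights, nearest))
                filled[i][j] = total / sum(weights)
    return filled


def mask_at_random(data, fraction, seed=0):
    """Return a copy with round(fraction * cells) entries set to None,
    plus the list of masked (day, station) positions."""
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(data) for j in range(len(row))]
    chosen = rng.sample(cells, round(fraction * len(cells)))
    masked = [row[:] for row in data]
    for i, j in chosen:
        masked[i][j] = None
    return masked, chosen


def rmse(truth, filled, cells):
    """Root mean square error over the masked cells that were imputed."""
    errors = [(truth[i][j] - filled[i][j]) ** 2
              for i, j in cells if filled[i][j] is not None]
    return math.sqrt(sum(errors) / len(errors)) if errors else math.nan


if __name__ == "__main__":
    # Synthetic, correlated "daily rainfall" for 4 hypothetical stations.
    rng = random.Random(42)
    factors = [1.0, 0.8, 1.2, 0.9]
    truth = []
    for _ in range(60):
        base = max(0.0, rng.gauss(5.0, 4.0))
        truth.append([max(0.0, base * f + rng.gauss(0.0, 0.5)) for f in factors])
    for fraction in (0.04, 0.08, 0.12, 0.16):
        masked, cells = mask_at_random(truth, fraction, seed=1)
        score = rmse(truth, wknn_impute(masked, k=3), cells)
        print(f"missing {fraction:.0%}: RMSE = {score:.3f}")
```

Masking completely at random, as done here, is a stronger condition than the missing-at-random mechanism named in the abstract; it is used only because it is the simplest case to simulate.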
Pages: 27