Investigating the impact of input variable selection on daily solar radiation prediction accuracy using data-driven models: a case study in northern Iran

Cited by: 0
Authors
Mohammad Sina Jahangir
Seyed Mostafa Biazar
David Hah
John Quilty
Mohammad Isazadeh
Affiliations
[1] University of Waterloo, Department of Civil and Environmental Engineering
[2] University of Tabriz, Department of Water Engineering, Faculty of Agriculture
Source
Stochastic Environmental Research and Risk Assessment | 2022 / Vol. 36
Keywords
Data-driven models; Solar radiation prediction; Input variable selection; Edgeworth approximation-based conditional mutual information; Iran
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Data-driven models have been explored in numerous studies for solar radiation ($R_s$) prediction. However, the use of different input variable selection (IVS) methods for improving $R_s$ prediction accuracy has mostly been neglected. This study explores various IVS methods, including the Gamma test (GT), Procrustes analysis (PA), and Edgeworth approximation-based conditional mutual information (EA), and evaluates their ability to improve $R_s$ prediction accuracy by coupling them with popular non-linear data-driven models: multilayer perceptron (MLP), support vector machine, extreme learning machine, and multi-gene genetic programming (MGGP). The partial correlation input selection method was coupled with multiple linear regression to serve as a linear benchmark. Meteorological data from eight stations in northern Iran were used to build the $R_s$ prediction models. The type and number of variables selected at each station differed and depended on the IVS method. The models utilizing EA selected fewer variables than those using the GT method and had higher accuracy, while models using PA selected the fewest variables of all methods but were not able to adequately predict $R_s$. Predictive performance also varied substantially when the IVS methods were paired with different model types. For example, coupling MLP, the model with the best average performance, with EA instead of PA resulted in a ~27% improvement (decrease) in the normalized root mean square error (nRMSE). The results also indicated that MGGP produced the least accurate predictions, with the nRMSE increasing by up to 40% compared to MLP when the EA method was used for IVS. Finally, IVS hyper-parameter adjustment (which is routinely overlooked in the literature) profoundly affected the results and is recommended as an important step when developing data-driven models for solar radiation prediction.
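The percentage figures in the abstract are relative changes in nRMSE between model and IVS pairings. The sketch below illustrates that arithmetic only. It assumes nRMSE is the RMSE divided by the mean of the observations, which is one common convention; the abstract does not state which normalization the authors used. All values and the names `nrmse`, `percent_decrease`, `pred_a`, and `pred_b` are hypothetical and are not taken from the study.

```python
import math


def nrmse(observed, predicted):
    """RMSE divided by the mean of the observations.

    Assumption: mean normalization. Other conventions (range, standard
    deviation) exist, and the abstract does not say which one was used.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse) / (sum(observed) / len(observed))


def percent_decrease(baseline, candidate):
    """Relative reduction of `candidate` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - candidate) / baseline


# Hypothetical daily solar radiation values, for illustration only.
observed = [10.0, 20.0, 30.0]
pred_a = [12.0, 18.0, 32.0]  # stand-in for a model built on one input set
pred_b = [11.0, 19.0, 31.0]  # stand-in for the same model on another input set

nrmse_a = nrmse(observed, pred_a)
nrmse_b = nrmse(observed, pred_b)
print(f"nRMSE A = {nrmse_a:.3f}, nRMSE B = {nrmse_b:.3f}")
print(f"decrease = {percent_decrease(nrmse_a, nrmse_b):.1f}%")
```

A positive result from `percent_decrease` means the second pairing has the lower error, which is the sense in which the abstract reports an "improvement (decrease)".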
Pages: 225-249
Page count: 24