Forecasting water quality variable using deep learning and weighted averaging ensemble models

Cited by: 21
Authors
Zamani, Mohammad G. [1]
Nikoo, Mohammad Reza [2]
Jahanshahi, Sina [3]
Barzegar, Rahim [4]
Meydani, Amirreza [5]
Affiliations
[1] Amirkabir Univ Technol, Fac Civil Water & Environm Engn, Dept Water Resources Engn, Tehran, Iran
[2] Sultan Qaboos Univ, Dept Civil & Architectural Engn, Muscat, Oman
[3] Univ Tehran, Fac Civil Water & Environm Engn, Dept Water Resources Engn, Tehran, Iran
[4] Univ Quebec Abitibi Temiscamingue UQAT, Res Inst Mines & Environm RIME, Groundwater Res Grp GRES, Amos, PQ, Canada
[5] Univ Delaware, Dept Geog & Spatial Sci, Newark, DE USA
Keywords
Water quality forecasting; Ensemble model; Deep learning (DL); Single- and multi-objective optimization algorithms; Non-dominated genetic algorithm (NSGA-II); ARTIFICIAL NEURAL-NETWORK; GENETIC ALGORITHM; DISSOLVED-OXYGEN; HYBRID MODEL; PREDICTION; EUTROPHICATION; CLASSIFICATION; CONTAMINATION;
DOI
10.1007/s11356-023-30774-4
Chinese Library Classification (CLC) number
X [Environmental science, safety science];
Discipline classification code
08 ; 0830 ;
Abstract
Water quality variables such as chlorophyll-a (Chl-a) play a pivotal role in understanding and evaluating the condition of aquatic ecosystems. Chl-a, a pigment present in diverse aquatic organisms, notably algae and cyanobacteria, serves as a valuable indicator of water quality. The objectives of this study are: (1) to assess the predictive capabilities of four deep learning (DL) models - namely, recurrent neural network (RNN), long short-term memory (LSTM), gated recurrent unit (GRU), and temporal convolutional network (TCN) - in forecasting Chl-a concentrations; (2) to combine these DL models into ensemble models (EMs) using a genetic algorithm (GA) and the non-dominated sorting genetic algorithm (NSGA-II) to harness the strengths of each standalone model; and (3) to evaluate the efficacy of the developed EMs. Using data collected at 15-min intervals from Small Prespa Lake (SPL) in Greece, the models took hourly lagged Chl-a concentrations, with lags of up to 6 h, as inputs to forecast Chl-a (t+1). The models were trained on 70% of the dataset and validated on the remaining 30%. Among the standalone DL models, the GRU model performed best in Chl-a forecasting, surpassing the RNN, LSTM, and TCN models by 8%, 2%, and 2%, respectively. Furthermore, integrating the DL models through the single-objective GA and multi-objective NSGA-II optimization algorithms yielded hybrid models that forecast both low and high Chl-a concentrations effectively. The NSGA-II-based ensemble model outperformed the standalone DL models as well as the GA-based model across a range of evaluation indices. For instance, in terms of R-squared, the EM-NSGA-II improved on the DL and EM-GA models by 14% (RNN), 8% (LSTM), 6% (GRU), 8% (TCN), and 3% (EM-GA) during the testing phase.
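The pipeline outlined in the abstract (lagged inputs, a chronological 70/30 split, and a weighted averaging ensemble whose weights are tuned by a genetic algorithm) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the RMSE objective, the population size, the mutation scale, and the blend crossover are all assumptions, and the multi-objective NSGA-II variant is not shown.

```python
import numpy as np

def make_lagged(series, n_lags=6):
    """Build (X, y): each row of X holds n_lags consecutive past values,
    and y is the value one step after them (the t+1 target)."""
    series = np.asarray(series, dtype=float)
    n = len(series) - n_lags
    X = np.column_stack([series[i:i + n] for i in range(n_lags)])
    y = series[n_lags:]
    return X, y

def chrono_split(X, y, train_frac=0.7):
    """Chronological split: first train_frac of samples for training."""
    n_train = int(round(len(y) * train_frac))
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]

def rmse(y_true, y_pred):
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def ensemble_predict(preds, weights):
    """Weighted average of model forecasts.
    preds has shape (n_models, n_samples); weights are normalized to sum to 1."""
    w = np.asarray(weights, dtype=float)
    return (w / w.sum()) @ np.asarray(preds, dtype=float)

def fit_weights_ga(preds, y_true, pop_size=40, n_gen=60, seed=0):
    """Toy single-objective GA (assumed design) minimizing ensemble RMSE."""
    rng = np.random.default_rng(seed)
    preds = np.asarray(preds, dtype=float)
    n_models = preds.shape[0]
    pop = rng.random((pop_size, n_models)) + 1e-9
    # Seed with one-hot weights so each standalone model is a candidate.
    pop[:n_models] = np.eye(n_models)

    def fitness_of(population):
        return np.array([rmse(y_true, ensemble_predict(preds, w))
                         for w in population])

    for _ in range(n_gen):
        # Elitist selection: keep the better half unchanged.
        elite = pop[np.argsort(fitness_of(pop))[: pop_size // 2]]
        n_child = pop_size - len(elite)
        pa = elite[rng.integers(len(elite), size=n_child)]
        pb = elite[rng.integers(len(elite), size=n_child)]
        alpha = rng.random((n_child, 1))
        children = alpha * pa + (1 - alpha) * pb            # blend crossover
        children += rng.normal(0.0, 0.05, children.shape)   # Gaussian mutation
        children = np.clip(children, 1e-9, None)            # keep weights positive
        pop = np.vstack([elite, children])

    best = pop[np.argmin(fitness_of(pop))]
    return best / best.sum()
```

In use, `preds` would hold the forecasts of the trained RNN, LSTM, GRU, and TCN models on the same samples. Because the standalone models are seeded into the population and the best candidates are carried over unchanged, the fitted ensemble in this sketch cannot score worse than the best standalone model on the data used to fit the weights; performance on held-out data is not guaranteed.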
Pages: 124245-124262
Number of pages: 18