Two-stage meta-ensembling machine learning model for enhanced water quality forecasting

Cited by: 12
Authors
Heydari, Sepideh [1]
Nikoo, Mohammad Reza [2]
Mohammadi, Ali [3]
Barzegar, Rahim [4]
Affiliations
[1] Univ Tehran, Fac Environm Engn, Dept Environm Engn, Tehran, Iran
[2] Sultan Qaboos Univ, Dept Civil & Architectural Engn, Muscat, Oman
[3] Sharif Univ Technol, Dept Ind Engn, Tehran, Iran
[4] Univ Quebec Abitibi Temiscamingue UQAT, Res Inst Mines & Environm RIME, Groundwater Res Grp GRES, Amos, PQ, Canada
Keywords
Water quality forecasting; Machine learning; Multi-objective optimization; Genetic algorithm; Grey Wolf Optimizer; Chlorophyll-a and Dissolved oxygen; PREDICTION; IMPLEMENTATION; DECOMPOSITION; PERFORMANCE; STREAMFLOW; RESOURCES; SYSTEM
DOI
10.1016/j.jhydrol.2024.131767
Chinese Library Classification number
TU [Building Science]
Subject classification code
0813
Abstract
Accurate short-term forecasting of water quality variables (WQVs) such as dissolved oxygen (DO) and chlorophyll-a (Chl-a) is crucial for the effective management of aquatic resources. This study introduces a two-stage optimization-ensembling framework that integrates the Grey Wolf Optimizer (GWO) and the Nondominated Sorting Genetic Algorithm II (NSGA-II) to enhance the forecasting capabilities of machine learning (ML) models. Focusing on Small Prespa Lake, Greece, we implemented a diverse set of ML techniques, including eXtreme Gradient Boosting (XGB), Gradient Boosting Regressor (GBR), Light Gradient-Boosting Machine (LightGBM), and Multilayer Perceptron (MLP). These models were tuned with GWO to optimize their performance on critical WQVs predicted six hours in advance. Our methodology employed data preprocessing techniques, including lag-time feature engineering and principal component analysis (PCA), to handle the high dimensionality of the dataset. Lag times ranging from 6 to 24 hours were evaluated, with the 24-hour lag proving the most effective at using historical data to improve forecasting accuracy. Used for hyperparameter tuning, GWO improved the Kling-Gupta Efficiency (KGE) by 7.6% over conventional randomized search. Subsequently, NSGA-II was used for multi-objective optimization, constructing model ensembles that outperformed the individual GWO-optimized models by up to 7% in KGE. In comparison to a standard genetic algorithm-based ensemble, the NSGA-II ensemble was more effective at balancing solution quality. This approach establishes a new benchmark in water quality forecasting and contributes to proactive environmental monitoring and management strategies.
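The abstract reports its gains in terms of the Kling-Gupta Efficiency. As a minimal sketch, assuming the original 2009 formulation, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the linear correlation between observed and simulated values, alpha is the ratio of their standard deviations, and beta is the ratio of their means, the metric can be computed as follows. The function name `kling_gupta_efficiency` is hypothetical, and the record does not state which KGE variant the study used.

```python
import numpy as np


def kling_gupta_efficiency(obs, sim):
    """Kling-Gupta Efficiency, assuming the original 2009 formulation.

    A value of 1 indicates a perfect match between simulated and
    observed series; lower values indicate a poorer fit.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)

    # r: linear (Pearson) correlation between observations and simulations
    r = np.corrcoef(obs, sim)[0, 1]
    # alpha: variability ratio (simulated std / observed std)
    alpha = np.std(sim) / np.std(obs)
    # beta: bias ratio (simulated mean / observed mean)
    beta = np.mean(sim) / np.mean(obs)

    # Euclidean distance from the ideal point (r, alpha, beta) = (1, 1, 1)
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
```

Because the three components are combined as a distance from the ideal point (1, 1, 1), a forecast that is perfectly correlated with the observations but doubles both their mean and their spread still scores well below 1.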
Pages: 16