Two-stage meta-ensembling machine learning model for enhanced water quality forecasting

Cited by: 12
Authors
Heydari, Sepideh [1]
Nikoo, Mohammad Reza [2]
Mohammadi, Ali [3]
Barzegar, Rahim [4]
Affiliations
[1] Univ Tehran, Fac Environm Engn, Dept Environm Engn, Tehran, Iran
[2] Sultan Qaboos Univ, Dept Civil & Architectural Engn, Muscat, Oman
[3] Sharif Univ Technol, Dept Ind Engn, Tehran, Iran
[4] Univ Quebec Abitibi Temiscamingue UQAT, Res Inst Mines & Environm RIME, Groundwater Res Grp GRES, Amos, PQ, Canada
Keywords
Water quality forecasting; Machine learning; Multi-objective optimization; Genetic algorithm; Grey Wolf Optimizer; Chlorophyll-a and Dissolved oxygen; PREDICTION; IMPLEMENTATION; DECOMPOSITION; PERFORMANCE; STREAMFLOW; RESOURCES; SYSTEM
DOI
10.1016/j.jhydrol.2024.131767
Chinese Library Classification number
TU [Architectural Science]
Discipline classification code
0813
Abstract
Accurate short-term forecasting of water quality variables (WQVs) such as dissolved oxygen (DO) and chlorophyll-a (Chl-a) is crucial for the effective management of aquatic resources. This study introduces a robust two-stage optimization-ensembling framework that integrates the Grey Wolf Optimizer (GWO) and the Nondominated Sorting Genetic Algorithm II (NSGA-II) to enhance the forecasting capabilities of machine learning (ML) models. Focusing on Small Prespa Lake, Greece, we implemented an array of diverse ML techniques, including eXtreme Gradient Boosting (XGB), Gradient Boosting Regressor (GBR), Light Gradient-Boosting Machine (LightGBM), and Multilayer Perceptron (MLP). These models were fine-tuned using GWO to optimize their performance on critical WQVs predicted six hours in advance. Our methodology employed rigorous data preprocessing techniques, including lag time feature engineering and principal component analysis (PCA), to handle the high dimensionality of the dataset. Lag times ranging from 6 to 24 hours were evaluated, with the 24-hour lag proving to be the most effective in utilizing historical data to enhance forecasting accuracy. The GWO not only facilitated hyperparameter tuning but also demonstrated a notable improvement (7.6%) in the Kling-Gupta Efficiency (KGE) over conventional randomized search methods. Subsequently, the NSGA-II was utilized for multi-objective optimization, constructing powerful model ensembles that outperformed the individual GWO-optimized models by up to 7% in KGE. In comparison to a standard genetic algorithm-based ensemble, the NSGA-II ensemble demonstrated superior effectiveness in balancing solution quality. This innovative approach not only establishes a new benchmark in water quality forecasting but also contributes substantially to proactive environmental monitoring and management strategies.
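
The abstract names two building blocks that are easy to illustrate: lag time feature engineering and the Kling-Gupta Efficiency (KGE). The sketch below is not the authors' code. It assumes the commonly used 2009 form of KGE (correlation, variability ratio, and bias ratio combined as a Euclidean distance from the ideal point) and an evenly spaced series. The function names `kge` and `make_lagged` and the parameters `n_lags` and `horizon` are hypothetical, chosen only for illustration.

```python
import math
from statistics import mean, pstdev


def kge(observed, simulated):
    """Kling-Gupta Efficiency, assuming the 2009 formulation; 1.0 is a perfect score."""
    if len(observed) != len(simulated) or len(observed) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_obs, mean_sim = mean(observed), mean(simulated)
    std_obs, std_sim = pstdev(observed), pstdev(simulated)
    if std_obs == 0 or std_sim == 0 or mean_obs == 0:
        raise ValueError("KGE is undefined for constant series or zero observed mean")
    # Population covariance, matching the population standard deviations above.
    cov = sum((o - mean_obs) * (s - mean_sim)
              for o, s in zip(observed, simulated)) / len(observed)
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def make_lagged(series, n_lags, horizon):
    """Pair the last n_lags values at each step with the value horizon steps ahead."""
    rows = []
    for t in range(n_lags - 1, len(series) - horizon):
        features = list(series[t - n_lags + 1:t + 1])
        rows.append((features, series[t + horizon]))
    return rows
```

With hourly data (an assumption; the abstract does not state the sampling interval), `make_lagged(series, n_lags=24, horizon=6)` would correspond to the 24-hour lag window and six-hour lead time described above, and `kge` could then score any model's forecasts against the held-out targets.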
Pages: 16