Two-stage meta-ensembling machine learning model for enhanced water quality forecasting

Cited by: 3
Authors
Heydari, Sepideh [1]
Nikoo, Mohammad Reza [2]
Mohammadi, Ali [3]
Barzegar, Rahim [4]
Affiliations
[1] Univ Tehran, Fac Environm Engn, Dept Environm Engn, Tehran, Iran
[2] Sultan Qaboos Univ, Dept Civil & Architectural Engn, Muscat, Oman
[3] Sharif Univ Technol, Dept Ind Engn, Tehran, Iran
[4] Univ Quebec Abitibi Temiscamingue UQAT, Res Inst Mines & Environm RIME, Groundwater Res Grp GRES, Amos, PQ, Canada
Keywords
Water quality forecasting; Machine learning; Multi-objective optimization; Genetic algorithm; Grey Wolf Optimizer; Chlorophyll-a and Dissolved oxygen; PREDICTION; IMPLEMENTATION; DECOMPOSITION; PERFORMANCE; STREAMFLOW; RESOURCES; SYSTEM
DOI
10.1016/j.jhydrol.2024.131767
Chinese Library Classification number
TU [Architectural Science]
Discipline classification code
0813
Abstract
Accurate short-term forecasting of water quality variables (WQVs) such as dissolved oxygen (DO) and chlorophyll-a (Chl-a) is crucial for the effective management of aquatic resources. This study introduces a robust two-stage optimization-ensembling framework that integrates the Grey Wolf Optimizer (GWO) and the Nondominated Sorting Genetic Algorithm II (NSGA-II) to enhance the forecasting capabilities of machine learning (ML) models. Focusing on Small Prespa Lake, Greece, we implemented an array of diverse ML techniques, including eXtreme Gradient Boosting (XGB), Gradient Boosting Regressor (GBR), Light Gradient-Boosting Machine (LightGBM), and Multilayer Perceptron (MLP). These models were fine-tuned using GWO to optimize their performance on critical WQVs predicted six hours in advance. Our methodology employed rigorous data preprocessing techniques, including lag time feature engineering and principal component analysis (PCA), to handle the high dimensionality of the dataset. Lag times ranging from 6 to 24 hours were evaluated, with the 24-hour lag proving to be the most effective in utilizing historical data to enhance forecasting accuracy. The GWO not only facilitated hyperparameter tuning but also demonstrated a notable improvement (7.6%) in the Kling-Gupta Efficiency (KGE) over conventional randomized search methods. Subsequently, the NSGA-II was utilized for multi-objective optimization, constructing powerful model ensembles that outperformed the individual GWO-optimized models by up to 7% in KGE. In comparison to a standard genetic algorithm-based ensemble, the NSGA-II ensemble demonstrated superior effectiveness in balancing solution quality. This innovative approach not only establishes a new benchmark in water quality forecasting but also contributes substantially to proactive environmental monitoring and management strategies.
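To make two of the ingredients named in the abstract concrete, the following is a minimal sketch of lag-feature construction and of the Kling-Gupta Efficiency. It is not the authors' code. The function names `make_lag_features` and `kling_gupta_efficiency` are hypothetical, the KGE shown is the commonly used 2009 formulation (the abstract does not say which variant was used), and lags and horizon are counted in time steps because the sampling interval is not stated in the abstract.

```python
import numpy as np


def make_lag_features(series, n_lags, horizon):
    """Build a lagged design matrix X and a target y `horizon` steps ahead.

    Row i of X holds series[i : i + n_lags]; y[i] is the value `horizon`
    steps after the last lag in that row. Hypothetical helper for illustration.
    """
    series = np.asarray(series, dtype=float)
    rows = len(series) - n_lags - horizon + 1
    if rows <= 0:
        raise ValueError("series is too short for the requested lags and horizon")
    X = np.stack([series[i:i + n_lags] for i in range(rows)])
    first_target = n_lags + horizon - 1
    y = series[first_target:first_target + rows]
    return X, y


def kling_gupta_efficiency(obs, sim):
    """KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2).

    r     : linear correlation between observed and simulated values
    alpha : ratio of standard deviations (simulated / observed)
    beta  : ratio of means (simulated / observed)
    Assumes the 2009 formulation; a perfect forecast scores 1.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
```

In a pipeline like the one described, `X` would typically be passed through PCA before model fitting, and `kling_gupta_efficiency` would score each candidate hyperparameter set or ensemble weighting on held-out data; those steps are omitted here because the abstract does not give their settings.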
Pages: 16