Integrating machine learning models for optimizing ecosystem health assessments through prediction of nitrate–N concentrations in the lower stretch of Ganga River, India

Cited by: 0
Authors
Das, Basanta Kumar [1]
Paul, Sanatan [1]
Mandal, Biswajit [1]
Gogoi, Pranab [1]
Paul, Liton [1]
Saha, Ajoy [1]
Johnson, Canciyal [1]
Das, Akankshya [1]
Ray, Archisman [1]
Roy, Shreya [1]
Das Gupta, Shubhadeep [1]
Affiliations
[1] ICAR-Central Inland Fisheries Research Institute, Barrackpore, Kolkata, West Bengal, India
Keywords
GAMs; Ganga River; MAE; Nitrate–N; NSE; RF; RMSE; Variable importance; XGBoost
DOI
10.1007/s11356-025-35999-z
Abstract
Nitrate, a highly reactive form of inorganic nitrogen, is commonly found in aquatic environments. Understanding the dynamics of nitrate–N concentration in rivers and its interactions with other water-quality parameters is crucial for effective freshwater ecosystem management. This study uses advanced machine learning models to analyse water-quality parameters and predict nitrate–N concentrations in the lower stretch of the Ganga River from observations covering six annual periods (2017 to 2022). The parameters include water temperature, pH, specific conductivity (Sp_Con), dissolved oxygen (DO), nitrate–N, total phosphate (TP), turbidity, biochemical oxygen demand (BOD), silicate, total dissolved solids (TDS), and rainfall. The study evaluated the predictive performance of five models, namely Multiple Polynomial Regression (MPR), Generalized Additive Models (GAMs), Decision Tree Regression, Random Forest (RF), and XGBoost (Extreme Gradient Boosting), using the RMSE, MAE, MAPE, NSE, and R² metrics. XGBoost emerged as the top performer, with an RMSE of 0.024, MAE of 0.018, MAPE of 51.805, NSE of 0.855, and R² of 0.85, explaining 85% of the variance in nitrate–N concentrations. Random Forest also demonstrated strong predictive capability, with an RMSE of 0.028, MAE of 0.021, MAPE of 57.272, NSE of 0.804, and R² of 0.80. MPR effectively modelled non-linear relationships, explaining 75% of the variance, while Decision Tree Regression and GAMs were less effective, with R² values of 0.60 and 0.48, respectively. BOD, pH, rainfall, water temperature, and total phosphate were the best predictors of nitrate–N dynamics. Comparative analysis with previous studies confirmed the robustness of XGBoost and Random Forest in environmental data modelling. The findings highlight the importance of advanced machine learning models in accurately predicting water-quality parameters and facilitating proactive management strategies.
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
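The five evaluation metrics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, MAPE is expressed in percent, and R² is taken here as the squared Pearson correlation, which may differ from the definition used in the paper.

```python
import math


def regression_metrics(observed, predicted):
    """Return RMSE, MAE, MAPE (%), NSE and R^2 for paired values.

    Assumptions (not taken from the paper): observations are non-zero
    (needed for MAPE) and R^2 is the squared Pearson correlation.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    # Root mean squared error and mean absolute error
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # Mean absolute percentage error; undefined if any observation is zero
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observation is zero")
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n

    # Nash-Sutcliffe efficiency: 1 - residual variance / observed variance
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if ss_tot == 0 or ss_pred == 0:
        raise ValueError("NSE and R^2 need non-constant inputs")
    nse = 1.0 - ss_res / ss_tot

    # Squared Pearson correlation between observed and predicted values
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    r2 = cov * cov / (ss_tot * ss_pred)

    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "NSE": nse, "R2": r2}


if __name__ == "__main__":
    # Hypothetical values for illustration only, not data from the study
    obs = [1.0, 2.0, 3.0, 4.0]
    pred = [2.0, 3.0, 4.0, 5.0]
    print(regression_metrics(obs, pred))
```

With these definitions a prediction that carries a constant bias can have a perfect squared correlation while its NSE falls well below 1, which is one reason to report both statistics.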
Pages: 4670-4689
Page count: 19