Efficient Water Quality Prediction Using Supervised Machine Learning

Cited by: 226
Authors
Ahmed, Umair [1]
Mumtaz, Rafia [1]
Anwar, Hirra [1]
Shah, Asad A. [1]
Irfan, Rabia [1]
Garcia-Nieto, Jose [2]
Affiliations
[1] Natl Univ Sci & Technol NUST, SEECS, Islamabad 44000, Pakistan
[2] Univ Malaga, Dept Languages & Comp Sci, Ada Byron Res Bldg, Malaga 29016, Spain
Keywords
water quality prediction; supervised machine learning; smart city; gradient boosting; multi-layer perceptron; RIDGE-REGRESSION; F-SCORE; INDEX; SELECTION; MALAYSIA
DOI
10.3390/w11112210
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Water covers about 70% of the earth's surface and is one of the most important resources for sustaining life. Rapid urbanization and industrialization have degraded water quality at an alarming rate, resulting in harrowing diseases. Water quality has conventionally been estimated through expensive and time-consuming laboratory and statistical analyses, which make real-time monitoring impractical. The consequences of poor water quality call for an alternative method that is quicker and less expensive. With this motivation, this research explores a series of supervised machine learning algorithms to estimate the water quality index (WQI), a single index that describes the general quality of water, and the water quality class (WQC), a distinct class defined on the basis of the WQI. The proposed methodology employs four input parameters: temperature, turbidity, pH, and total dissolved solids. Of all the algorithms employed, gradient boosting with a learning rate of 0.1 and polynomial regression with a degree of 2 predict the WQI most efficiently, with mean absolute errors (MAE) of 1.9642 and 2.7273, respectively. The multi-layer perceptron (MLP) with a configuration of (3, 7) classifies the WQC most efficiently, with an accuracy of 0.8507. The proposed methodology achieves reasonable accuracy using a minimal number of parameters, supporting its possible use in real-time water quality detection systems.
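For readers unfamiliar with the two scores quoted in the abstract, the following is a minimal sketch of how mean absolute error (for WQI regression) and accuracy (for WQC classification) are commonly computed. It is not the authors' code: the function names, the sample WQI values, and the class labels are all hypothetical and chosen only for illustration.

```python
from typing import Hashable, Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of |true - predicted| over all samples (lower is better)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(labels_true: Sequence[Hashable], labels_pred: Sequence[Hashable]) -> float:
    """Fraction of samples whose predicted class equals the true class."""
    if len(labels_true) != len(labels_pred) or len(labels_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for t, p in zip(labels_true, labels_pred) if t == p)
    return matches / len(labels_true)


# Hypothetical WQI values, invented for this example (not data from the paper).
wqi_true = [80.0, 65.0, 50.0, 90.0]
wqi_pred = [78.0, 68.0, 51.0, 90.0]
print(mean_absolute_error(wqi_true, wqi_pred))  # absolute errors are 2, 3, 1, 0

# Hypothetical class labels; the abstract does not list the WQC categories.
wqc_true = ["good", "poor", "poor", "good"]
wqc_pred = ["good", "poor", "good", "good"]
print(accuracy(wqc_true, wqc_pred))  # 3 of 4 labels match
```

Under these definitions, the reported MAE of 1.9642 would mean the predicted WQI differs from the reference WQI by about 2 index points on average, and an accuracy of 0.8507 would mean roughly 85% of samples were assigned the correct class.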
Pages: 14