Efficient Water Quality Prediction Using Supervised Machine Learning

Cited by: 210
Authors
Ahmed, Umair [1]
Mumtaz, Rafia [1]
Anwar, Hirra [1]
Shah, Asad A. [1]
Irfan, Rabia [1]
Garcia-Nieto, Jose [2]
Affiliations
[1] Natl Univ Sci & Technol NUST, SEECS, Islamabad 44000, Pakistan
[2] Univ Malaga, Dept Languages & Comp Sci, Ada Byron Res Bldg, Malaga 29016, Spain
Keywords
water quality prediction; supervised machine learning; smart city; gradient boosting; multi-layer perceptron; RIDGE-REGRESSION; F-SCORE; INDEX; SELECTION; MALAYSIA;
DOI
10.3390/w11112210
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
Water covers about 70% of the earth's surface and is one of the most important resources for sustaining life. Rapid urbanization and industrialization have degraded water quality at an alarming rate, resulting in harrowing diseases. Water quality has conventionally been estimated through expensive and time-consuming laboratory and statistical analyses, which makes real-time monitoring impractical. The alarming consequences of poor water quality necessitate an alternative method that is quicker and less expensive. With this motivation, this research explores a series of supervised machine learning algorithms to estimate the water quality index (WQI), a single index that describes the general quality of water, and the water quality class (WQC), a distinct class defined on the basis of the WQI. The proposed methodology employs four input parameters: temperature, turbidity, pH and total dissolved solids. Of all the employed algorithms, gradient boosting with a learning rate of 0.1 and polynomial regression with a degree of 2 predict the WQI most efficiently, with mean absolute errors (MAE) of 1.9642 and 2.7273, respectively. The multi-layer perceptron (MLP), with a configuration of (3, 7), classifies the WQC most efficiently, with an accuracy of 0.8507. The proposed methodology achieves reasonable accuracy using a minimal number of parameters, supporting its possible use in real-time water quality detection systems.
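The two evaluation measures named in the abstract, mean absolute error for the WQI regression and accuracy for the WQC classification, can be illustrated with a minimal Python sketch. This is not the authors' code: the sample WQI values are invented for illustration, and the class cut-offs in `HYPOTHETICAL_WQC_BANDS` are assumptions, since the abstract states only that the WQC is defined on the basis of the WQI.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean of |true - predicted| over all samples."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(labels_true: Sequence[str], labels_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted class equals the true class."""
    if len(labels_true) == 0 or len(labels_true) != len(labels_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(labels_true, labels_pred))
    return hits / len(labels_true)


# ASSUMPTION: these lower bounds and class names are hypothetical and are
# not taken from the paper; they only show how a class can follow from a WQI.
HYPOTHETICAL_WQC_BANDS = [
    (90.0, "excellent"),
    (70.0, "good"),
    (50.0, "fair"),
    (0.0, "poor"),
]


def wqi_to_wqc(wqi: float) -> str:
    """Return the first band whose lower bound the WQI reaches."""
    for lower_bound, label in HYPOTHETICAL_WQC_BANDS:
        if wqi >= lower_bound:
            return label
    return HYPOTHETICAL_WQC_BANDS[-1][1]  # values below 0 fall into the lowest band


# Invented example values (not data from the paper).
true_wqi = [80.0, 62.5, 45.0, 91.0]
pred_wqi = [78.0, 65.5, 52.0, 90.0]

mae = mean_absolute_error(true_wqi, pred_wqi)
acc = accuracy(
    [wqi_to_wqc(v) for v in true_wqi],
    [wqi_to_wqc(v) for v in pred_wqi],
)
print(f"MAE = {mae:.4f}, accuracy = {acc:.4f}")
```

In this sketch the classes are derived from predicted WQI values only to keep the example short; the abstract describes the WQC as being predicted directly by a classifier such as the MLP.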
Pages: 14