Efficient Water Quality Prediction Using Supervised Machine Learning

Cited by: 210
Authors
Ahmed, Umair [1]
Mumtaz, Rafia [1]
Anwar, Hirra [1]
Shah, Asad A. [1]
Irfan, Rabia [1]
Garcia-Nieto, Jose [2]
Affiliations
[1] Natl Univ Sci & Technol NUST, SEECS, Islamabad 44000, Pakistan
[2] Univ Malaga, Dept Languages & Comp Sci, Ada Byron Res Bldg, Malaga 29016, Spain
Keywords
water quality prediction; supervised machine learning; smart city; gradient boosting; multi-layer perceptron; ridge regression; F-score; index; selection; Malaysia
DOI
10.3390/w11112210
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Water covers about 70% of the Earth's surface and is one of the resources most vital to sustaining life. Rapid urbanization and industrialization have degraded water quality at an alarming rate, resulting in harrowing diseases. Water quality has conventionally been estimated through expensive and time-consuming laboratory and statistical analyses, which makes real-time monitoring impractical. The alarming consequences of poor water quality call for an alternative method that is faster and less expensive. With this motivation, this research explores a series of supervised machine learning algorithms to estimate the water quality index (WQI), a single index that describes the general quality of water, and the water quality class (WQC), a distinct class defined on the basis of the WQI. The proposed methodology employs four input parameters: temperature, turbidity, pH, and total dissolved solids. Of all the employed algorithms, gradient boosting with a learning rate of 0.1 and polynomial regression with a degree of 2 predict the WQI most efficiently, with mean absolute errors (MAE) of 1.9642 and 2.7273, respectively. The multi-layer perceptron (MLP), with a configuration of (3, 7), classifies the WQC most efficiently, with an accuracy of 0.8507. The proposed methodology achieves reasonable accuracy using a minimal number of parameters, which supports the possibility of its use in real-time water quality detection systems.
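The abstract reports results as a mean absolute error (for WQI regression) and an accuracy (for WQC classification), and it mentions a degree-2 polynomial model over four inputs. The sketch below illustrates these three ideas in plain Python. It is not the authors' code: the function names, the sample readings, and the class labels are hypothetical and chosen only for illustration.

```python
from itertools import combinations_with_replacement
from math import prod


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted WQI values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(labels_true, labels_pred):
    """Fraction of samples whose predicted WQC equals the observed class."""
    if len(labels_true) != len(labels_pred) or not labels_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(labels_true, labels_pred)) / len(labels_true)


def polynomial_features(sample, degree=2):
    """All monomials of the inputs up to `degree`, including the constant term."""
    terms = []
    for d in range(degree + 1):
        # Each combination picks d inputs (with repetition) to multiply together;
        # the empty combination (d = 0) yields the constant term 1.
        for combo in combinations_with_replacement(sample, d):
            terms.append(prod(combo))
    return terms


# Hypothetical reading: temperature, turbidity, pH, total dissolved solids.
reading = [25.0, 4.0, 7.0, 300.0]
expanded = polynomial_features(reading, degree=2)
print(len(expanded))  # 1 constant + 4 linear + 10 quadratic terms = 15

# Hypothetical WQI values and WQC labels, not taken from the paper.
print(mean_absolute_error([80.0, 65.0, 90.0], [78.0, 68.0, 91.0]))
print(accuracy(["good", "poor", "good", "excellent"],
               ["good", "poor", "poor", "excellent"]))
```

With four inputs, a degree-2 expansion yields 15 terms, so a polynomial regression of this kind fits a linear model on 15 features rather than 4.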
Pages: 14