Enhancing data-driven modeling of fluoride concentration using new data mining algorithms

Cited by: 9
Authors
Gupta, Praveen Kumar [1]
Maiti, Saumen [1]
Affiliations
[1] IIT ISM, Dept Appl Geophys, Dhanbad 826004, Bihar, India
Keywords
Fluoride; Data transformation; Data mining algorithms; Gaussian process; SUPPORT VECTOR MACHINES; GROUNDWATER QUALITY; GAUSSIAN-PROCESSES; CONTAMINATION; PREDICTION; CLASSIFICATION; INTELLIGENCE; DISTANCE; REGION;
DOI
10.1007/s12665-022-10216-z
Chinese Library Classification number
X [Environmental science, safety science];
Discipline classification code
08; 0830;
Abstract
Groundwater is an essential source of drinking water in hard rock areas, and hence its contaminant sources require analysis. Fluoride contamination with large spatial variation has been reported in part of Sindhudurg district. The present study focuses on the development of data-driven models of fluoride concentration using on-site measurements of physicochemical parameters. Six machine learning (ML) architectures, here termed data mining algorithms, were explored, including the novel algorithms Gaussian process (GP) and long short-term memory (LSTM). The results were compared with support vector machine (SVM), random forest (RF), extreme learning machine (ELM), and multi-layer perceptron (MLP) as benchmarks to test the robustness of the modeling process. In total, 225 water samples were obtained from different dug-wells/bore-wells in the area (latitude: 15.37-16.40 degrees, longitude: 73.19-74.18 degrees) over the period 2009-2016. The data were divided into two subsets, with 80% used for training and 20% for testing. Nine physicochemical parameters (pH, EC, TDS, Ca2+, Mg2+, Na+, Cl-, HCO3-, SO42-) were used in the modeling of fluoride (F-). A logarithmic transformation of the raw data was employed to improve the correlation between input and target and thereby enhance the modeling accuracy. Different quantitative and qualitative (visual) measures were used to establish the predictive power of the models. Results revealed that GP outperformed all other models in fluoride prediction, followed by LSTM, SVM, MLP, RF, and ELM, respectively. Results also revealed that a model's performance depends on model structure and data accuracy.
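The two preprocessing steps named in the abstract, the logarithmic transformation and the 80/20 split, can be illustrated with a minimal Python sketch. It is not the authors' code. The values are synthetic and the helper names (`pearson`, `log_transform`, `train_test_split`) are hypothetical. The logarithm base (10), the shuffling, and the seed are assumptions, since the abstract does not specify them. The sketch only shows why a log transform can raise the linear correlation between an input and a target related by a power law.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log_transform(values):
    """Base-10 logarithm of strictly positive values (base is an assumption)."""
    return [math.log10(v) for v in values]


def train_test_split(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# Synthetic, illustrative values only (NOT the study's measurements):
# a hypothetical input spanning several orders of magnitude and a target
# that follows a power law of that input.
x = [2.0 ** i for i in range(10)]
y = [math.sqrt(v) for v in x]

r_raw = pearson(x, y)                                # correlation on raw values
r_log = pearson(log_transform(x), log_transform(y))  # correlation after log transform

# 225 samples split 80/20, matching the sample count stated in the abstract.
train_idx, test_idx = train_test_split(225)
print(f"raw r = {r_raw:.3f}, log r = {r_log:.3f}")
print(f"train = {len(train_idx)}, test = {len(test_idx)}")
```

On this synthetic power-law data the log-transformed correlation is essentially 1, while the raw correlation is lower because the relationship is nonlinear. Real hydrochemical data are noisy, so the gain there would be smaller and should be checked variable by variable.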
Pages: 13
Related papers
70 in total
  • [1] Development of an artificial neural network based multi-model ensemble to estimate the northeast monsoon rainfall over south peninsular India: an application of extreme learning machine
    Acharya, Nachiketa
    Shrivastava, Nitin Anand
    Panigrahi, B. K.
    Mohanty, U. C.
    [J]. CLIMATE DYNAMICS, 2014, 43 (5-6) : 1303 - 1310
  • [2] Occurrence, health risks, and geochemical mechanisms of fluoride and nitrate in groundwater of the rock-dominant semi-arid region, Telangana State, India
    Adimalla, Narsimha
    Li, Peiyue
    [J]. HUMAN AND ECOLOGICAL RISK ASSESSMENT, 2019, 25 (1-2) : 81 - 103
  • [3] Using of neural networks for the prediction of nitrate groundwater contamination in rural and agricultural areas
    Al-Mahallawi, Khamis
    Mania, Jacky
    Hani, Azzedine
    Shahrour, Isam
    [J]. ENVIRONMENTAL EARTH SCIENCES, 2012, 65 (03) : 917 - 928
  • [4] Modeling of nitrate concentration in groundwater using artificial intelligence approach-a case study of Gaza coastal aquifer
    Alagha, Jawad S.
    Said, Md Azlin Md
    Mogheir, Yunes
    [J]. ENVIRONMENTAL MONITORING AND ASSESSMENT, 2014, 186 (01) : 35 - 45
  • [5] Amini M, 2009, 18TH WORLD IMACS CONGRESS AND MODSIM09 INTERNATIONAL CONGRESS ON MODELLING AND SIMULATION, P4100
  • [6] [Anonymous], 1987, DRINK WAT STAND
  • [7] Significance of machine learning algorithms in professional blogger's classification
    Asim, Yousra
    Shahid, Ahmad Raza
    Malik, Ahmad Kamran
    Raza, Basit
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2018, 65 : 461 - 473
  • [8] Comparison of machine learning models for predicting fluoride contamination in groundwater
    Barzegar, Rahim
    Moghaddam, Asghar Asghari
    Adamowski, Jan
    Fijani, Elham
    [J]. STOCHASTIC ENVIRONMENTAL RESEARCH AND RISK ASSESSMENT, 2017, 31 (10) : 2705 - 2718
  • [9] Random forest in remote sensing: A review of applications and future directions
    Belgiu, Mariana
    Dragut, Lucian
    [J]. ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2016, 114 : 24 - 31
  • [10] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32