Integrating machine learning models with cross-validation and bootstrapping for evaluating groundwater quality in Kanchanaburi province, Thailand

Cited by: 9
Authors
Thanh, Nguyen Ngoc [1]
Chotpantarat, Srilert [2, 3]
Ngu, Nguyen Huu [1]
Thunyawatcharakul, Pongsathorn [4]
Kaewdum, Narongsak [5]
Affiliations
[1] Hue Univ, Univ Agr & Forestry, 102 Phung Hung Str, Hue City 53000, Thua Thien Hue, Vietnam
[2] Chulalongkorn Univ, Fac Sci, Dept Geol, Bangkok 10330, Thailand
[3] Chulalongkorn Univ, Environm Res Inst, Ctr Excellence Environm Innovat & Management Met E, Phayathai Rd, Bangkok 10330, Thailand
[4] Chulalongkorn Univ, Grad Sch, Int Postgrad Program Hazardous Subst & Environm Ma, Bangkok 10330, Thailand
[5] Mahidol Univ, Geosci Program, Kanchanaburi Campus, Kanchanaburi 71150, Thailand
Keywords
Groundwater quality; Random forest; Artificial neural network; Kanchanaburi province; Thailand; ARTIFICIAL NEURAL-NETWORK; RANDOM FOREST; LAND-USE; WATER; GIS; PREDICTION; MANAGEMENT; SURFACE; REGION; INDEX;
DOI
10.1016/j.envres.2024.118952
Chinese Library Classification number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Exploring the potential of new models for mapping groundwater quality is a major challenge in water resource management, particularly in Kanchanaburi Province, Thailand, where groundwater faces contamination risks. This study explored the applicability of random forest (RF) and artificial neural network (ANN) models for predicting groundwater quality. The two models were combined with cross-validation (CV) and bootstrapping (B) techniques to build four predictive models: RF-CV, RF-B, ANN-CV, and ANN-B. The entropy groundwater quality index (EWQI) was converted to a normalized EWQI, which was then classified into five levels from very poor to very good. Twelve physicochemical parameters from 180 groundwater wells (potassium, sodium, calcium, magnesium, chloride, sulfate, bicarbonate, nitrate, pH, electrical conductivity, total dissolved solids, and total hardness) were investigated to assess groundwater quality in the eastern part of Kanchanaburi Province. The results indicated that groundwater in the study area was primarily polluted by calcium, magnesium, and bicarbonate, and that the RF-CV model (RMSE = 0.06, R² = 0.87, MAE = 0.04) outperformed the RF-B (RMSE = 0.07, R² = 0.80, MAE = 0.04), ANN-CV (RMSE = 0.09, R² = 0.70, MAE = 0.06), and ANN-B (RMSE = 0.10, R² = 0.67, MAE = 0.06) models. These findings highlight the superiority of the RF models over the ANN models under both the CV and B techniques. In addition, the contribution of the groundwater parameters to the normalized EWQI was determined for each machine learning model. The groundwater quality map created by the RF-CV model can be used to guide groundwater use.
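The abstract compares models scored under cross-validation and under bootstrapping using RMSE, R², and MAE. The sketch below illustrates how those two evaluation schemes differ. It is not the authors' implementation: the abstract does not say how many folds or resamples were used or how bootstrap predictions were scored, so the five folds, the 200 resamples, and the out-of-bag scoring are assumptions. A one-variable least-squares line stands in for RF and ANN to keep the example dependency-free, all function names are hypothetical, and the data are synthetic.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares line; a deliberately simple stand-in for RF or ANN."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def metrics(y_true, y_pred):
    """Return (RMSE, R2, MAE) for paired observations and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return rmse, r2, mae


def k_fold_cv(xs, ys, k=5, seed=0):
    """Pool the out-of-fold predictions from k folds, then score them once."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal folds
    y_true, y_pred = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        y_true += [ys[i] for i in fold]
        y_pred += [model(xs[i]) for i in fold]
    return metrics(y_true, y_pred)


def bootstrap_oob(xs, ys, n_rounds=200, seed=0):
    """Train on resamples drawn with replacement; score out-of-bag points."""
    rng = random.Random(seed)
    n = len(xs)
    y_true, y_pred = [], []
    for _ in range(n_rounds):
        sample = [rng.randrange(n) for _ in range(n)]
        in_bag = set(sample)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:  # no held-out points in this resample; skip it
            continue
        model = fit_line([xs[i] for i in sample], [ys[i] for i in sample])
        y_true += [ys[i] for i in oob]
        y_pred += [model(xs[i]) for i in oob]
    return metrics(y_true, y_pred)


# Synthetic data only: 180 points mimic the number of wells, nothing more.
rng = random.Random(42)
x = [rng.uniform(0.0, 1.0) for _ in range(180)]
y = [0.2 + 0.6 * xi + rng.gauss(0.0, 0.05) for xi in x]

cv_rmse, cv_r2, cv_mae = k_fold_cv(x, y)
b_rmse, b_r2, b_mae = bootstrap_oob(x, y)
print(f"CV:        RMSE={cv_rmse:.3f} R2={cv_r2:.3f} MAE={cv_mae:.3f}")
print(f"Bootstrap: RMSE={b_rmse:.3f} R2={b_r2:.3f} MAE={b_mae:.3f}")
```

In both schemes every prediction is scored on a point the model did not see during fitting. Cross-validation holds out each point exactly once, whereas bootstrapping may hold out the same point in many resamples, so the two schemes can report different scores for the same model, as the RF-CV and RF-B figures in the abstract do.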
Pages: 13