An optimization framework with dimensionality reduction using Markov Chain Monte Carlo and genetic algorithms for groundwater potential assessment

Cited by: 0
Authors
Wang, Zitao [1, 2, 3]
Yue, Chao [1, 2, 3]
Wang, Jianping [1, 2]
Affiliations
[1] Chinese Acad Sci, Qinghai Inst Salt Lakes, Key Lab Comprehens & Highly Efficient Utilizat Sal, Xining 810008, Peoples R China
[2] Qinghai Prov Key Lab Geol & Environm Salt Lakes, Xining 810008, Peoples R China
[3] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Groundwater potential assessment; Dimensionality reduction; Genetic algorithm; MCMC; Automated machine learning; JIANGHAN PLAIN; LOGISTIC-REGRESSION; RANDOM FOREST; GIS; VARIABILITY; WEIGHTS; MACHINE; MODELS;
DOI
10.1016/j.asoc.2024.111991
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Limited samples and high-dimensional feature spaces often hinder the accuracy of machine learning (ML) models in regional groundwater potential assessment (GPA). This study proposes a novel framework, GPA with Dimensionality Optimization (GPADO), that optimizes feature dimension reduction to enhance prediction performance. Taking the Jianghan Basin as an example, data on nine continuous variables and five categorical variables influencing the region's groundwater potential were gathered; one-hot encoding of the categorical variables expanded the feature set to 37. Three scenarios were devised to assess prediction outcomes under different dimensionality reduction approaches. Comparative analysis revealed that a hybrid dimension reduction method, applied to both continuous and categorical variables, yielded the highest validation set accuracy. Consequently, a genetic algorithm and Markov Chain Monte Carlo (MCMC) methods were employed to determine the optimal solution and the uncertainties associated with four unknown parameters: the dimension reduction method chosen for the continuous variables, the method chosen for the categorical variables, and the number of dimensions retained for each. Results indicated that using singular value decomposition to reduce the categorical variables to three dimensions, coupled with principal component analysis to reduce the continuous variables to three dimensions, produced the highest model validation accuracy of 0.834 within the GPADO framework. This optimal configuration was then used for automated ML training, resulting in a final validation set accuracy of 0.851 and a test set accuracy of 0.836. The resulting model provided a more precise spatial distribution of groundwater potential and demonstrated the GPADO framework's effectiveness in improving GPA accuracy, particularly in data-scarce regions.
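The four-parameter search described in the abstract can be illustrated with a minimal genetic algorithm sketch. This is not the authors' implementation: the candidate method names in `METHODS`, the operators (tournament selection, uniform crossover, per-gene mutation, elitism), and all hyperparameters are assumptions made for illustration. In particular, `toy_fitness` is a placeholder that is shaped to peak at the configuration the abstract reports as optimal; in a real workflow it would be replaced by a function that trains a model on the reduced features and returns its validation accuracy.

```python
import random

# Hypothetical candidate reduction methods (illustrative names only).
METHODS = ("pca", "svd", "kernel_pca")
# 9 continuous variables; 37 total features minus 9 leaves 28 one-hot columns.
N_CONT, N_CAT = 9, 28


def random_individual(rng):
    """One candidate: (continuous method, dims, categorical method, dims)."""
    return (rng.choice(METHODS), rng.randint(1, N_CONT),
            rng.choice(METHODS), rng.randint(1, N_CAT))


def toy_fitness(ind):
    """Placeholder score, NOT a measured accuracy surface.

    A real fitness function would fit a model on the reduced features
    and return its validation accuracy.
    """
    cont_m, cont_k, cat_m, cat_k = ind
    score = 1.0 - 0.02 * abs(cont_k - 3) - 0.01 * abs(cat_k - 3)
    score -= 0.05 * (cont_m != "pca") + 0.05 * (cat_m != "svd")
    return score


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent."""
    return tuple(x if rng.random() < 0.5 else y for x, y in zip(a, b))


def mutate(ind, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    fresh = random_individual(rng)
    return tuple(f if rng.random() < rate else g for g, f in zip(ind, fresh))


def tournament(pop, fitness, rng, k=3):
    """Pick the fittest of k randomly drawn individuals."""
    return max(rng.sample(pop, k), key=fitness)


def genetic_search(fitness, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        elite = max(pop, key=fitness)
        history.append(fitness(elite))
        children = [elite]  # elitism: the best individual always survives
        while len(children) < pop_size:
            child = crossover(tournament(pop, fitness, rng),
                              tournament(pop, fitness, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best, history = genetic_search(toy_fitness)
    print("best configuration:", best)
    print("best fitness:", history[-1])
```

Because the elite individual is copied into every new generation, the best fitness in `history` never decreases. The search space here has only 3 × 9 × 3 × 28 = 2268 combinations, so exhaustive evaluation would also be feasible in this toy setting; a genetic algorithm becomes more attractive when each fitness evaluation involves an expensive model fit.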
Pages: 15