Siamese Network-Based Transfer Learning Model to Predict Geogenic Contaminated Groundwaters

Cited by: 15
Authors
Cao, Hailong [1, 2]
Xie, Xianjun [1, 2]
Shi, Jianbo [1, 2]
Jiang, Guibin [3]
Wang, Yanxin [1, 2]
Affiliations
[1] China Univ Geosci, Sch Environm Studies, Wuhan 430074, Peoples R China
[2] China Univ Geosci, State Key Lab Biogeol & Environm Geol, Wuhan 430074, Peoples R China
[3] Chinese Acad Sci, Res Ctr Ecoenvironm Sci, State Key Lab Environm Chem & Ecotoxicol, Beijing 100085, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
groundwater; Siamese network; transfer learning; class-imbalanced data; prediction; arsenic contamination; fluoride; iodine; China; wells
DOI
10.1021/acs.est.1c08682
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Exposure to geogenic contaminated groundwaters (GCGs) is a significant public health concern. Machine learning models are powerful tools for discovering potential GCGs. However, groundwater quality data are often insufficient, and GCGs are typically a minority class in the data, which hinders models from producing meaningful GCG predictions. To address this issue, a deep learning method, Siamese network-based transfer learning (SNTL), is used to estimate the probability that hazardous substances are present in groundwater above a threshold from limited and class-imbalanced data. SNTL greatly reduces the amount of training data required and eliminates the negative effects of class imbalance on prediction performance. Predictions for three typical GCGs (high-arsenic, high-fluoride, and high-iodine groundwater) show that the SNTL models provide higher (about 80%) and more balanced sensitivity and specificity than benchmark Random Forest models, indicating that SNTL models can predict both GCGs and non-GCGs. SNTL can therefore help protect populations from GCG exposure in areas where other prediction methods fail to provide risk information because groundwater quality data are poor.
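The abstract evaluates models by sensitivity and specificity rather than overall accuracy. A minimal sketch of why that matters for class-imbalanced data follows; the well counts and the two sets of predictions are hypothetical illustrations, not data or results from the paper, and the function name `sensitivity_specificity` is an assumption made for this example.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = GCG."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # GCG correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # GCG missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-GCG correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-GCG wrongly flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical imbalanced sample: 10 GCG wells and 90 non-GCG wells.
y_true = [1] * 10 + [0] * 90

# A degenerate model that always predicts the majority class (non-GCG).
always_negative = [0] * 100

# A hypothetical model that finds 8 of 10 GCG wells and clears 72 of 90 others.
balanced = [1] * 8 + [0] * 2 + [0] * 72 + [1] * 18

print(sensitivity_specificity(y_true, always_negative))
print(sensitivity_specificity(y_true, balanced))
```

In this illustration the majority-class model is 90% accurate yet has zero sensitivity, so it never identifies a contaminated well, while the second model trades some accuracy for sensitivity and specificity that are both 0.8. This is the sense in which balanced sensitivity and specificity indicate that a model can predict both classes.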
Pages: 11071-11079
Page count: 9