Performance Evaluation of Pipe Break Machine Learning Models Using Datasets from Multiple Utilities

Cited by: 15
Authors
Chen, Thomas Ying-Jeh [1]
Vladeanu, Greta [1]
Yazdekhasti, Sepideh [1]
Daly, Craig Michael [1]
Affiliations
[1] Xylem Inc, 8920 MD-108, Columbia, MD 21045 USA
Keywords
Drinking water systems; Risk analysis; Pipe breaks; Machine learning; Prediction models; Asset management; Water; Deterioration; Methodology; Failures
DOI
10.1061/(ASCE)IS.1943-555X.0000683
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
Water pipeline infrastructures are critical for the delivery of lifeline services; however, these aging systems are experiencing increasing breakage rates. To assist utilities in identifying the most vulnerable assets, sustained research efforts have been made in developing machine learning models to accurately predict future failures. The performance of these methods depends heavily on the quantity of reliable data, yet most utilities have only limited records of historical pipe breaks. To overcome this limitation of data availability, this article presents a case study exploring the performance of machine learning methods for predicting future failures when system information from multiple utilities is combined. Six utilities are considered, for which predictive models are trained and evaluated in three scenarios: (1) using data from only a single reference system, (2) using data from all systems combined, and (3) using a bootstrapped sample of multiple systems that matches the pipe material distribution of the reference system. Empirical results suggest that variance-controlling algorithms, such as random forests, are less sensitive to the availability of data, and that introducing information from third-party sources leads to only marginal changes in performance. Overall, the number of break records from the reference system itself has the largest influence on accuracy, suggesting that utilities must keep reliable historical break data to maximize the power of predictive modeling for their asset management programs.
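Scenario (3) in the abstract can be illustrated with a minimal sketch of material-stratified bootstrap resampling. This is not the authors' implementation; the abstract does not describe the sampling procedure in detail. The record layout (dictionaries with a "material" key), the function names, the material codes CI, DI, and PVC, and the toy counts are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def material_shares(pipes):
    """Return the fraction of pipe records belonging to each material."""
    counts = Counter(p["material"] for p in pipes)
    total = sum(counts.values())
    return {material: count / total for material, count in counts.items()}


def bootstrap_to_match(reference, pooled, n_samples, seed=0):
    """Resample pooled records, with replacement, so that the material mix
    of the sample approximates the material mix of the reference system.

    Assumption: each record is a dict with a "material" key (hypothetical layout).
    """
    rng = random.Random(seed)
    shares = material_shares(reference)

    # Group the pooled (multi-utility) records by material.
    by_material = defaultdict(list)
    for pipe in pooled:
        by_material[pipe["material"]].append(pipe)

    sample = []
    for material, share in shares.items():
        candidates = by_material.get(material)
        if not candidates:
            continue  # material absent from the pooled data; nothing to draw
        k = round(share * n_samples)
        # Drawing with replacement is what makes this a bootstrap sample.
        sample.extend(rng.choices(candidates, k=k))
    return sample


# Toy data: the reference system is 60% CI, 30% DI, 10% PVC (made-up values).
reference = (
    [{"material": "CI"}] * 6 + [{"material": "DI"}] * 3 + [{"material": "PVC"}] * 1
)
# The pooled data from other utilities has a different material mix.
pooled = (
    [{"material": "CI", "utility": "B"}] * 20
    + [{"material": "DI", "utility": "C"}] * 50
    + [{"material": "PVC", "utility": "D"}] * 30
)

sample = bootstrap_to_match(reference, pooled, n_samples=100)
sample_counts = Counter(p["material"] for p in sample)
print(sample_counts)
```

Because each material is drawn separately, the resampled set follows the reference proportions even when the pooled data is dominated by other materials. A model trained on such a sample could then be compared against one trained on the reference system alone.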
Pages: 13