A Cost-Effective CNN-LSTM-Based Solution for Predicting Faulty Remote Water Meter Reading Devices in AMI Systems

Cited by: 3
Authors
Lee, Jaeseung [1]
Choi, Woojin [1]
Kim, Jibum [1]
Affiliations
[1] Incheon Natl Univ, Dept Comp Sci & Engn, Incheon 22012, South Korea
Funding
National Research Foundation, Singapore
Keywords
machine learning; advanced meter infrastructure (AMI); CNN-LSTM; deep learning; water; fault detection;
DOI
10.3390/s21186229
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification codes
070302 ; 081704 ;
Abstract
Automatic meter infrastructure (AMI) systems using remote metering are being widely used to utilize water resources efficiently and minimize non-revenue water. We propose a convolutional neural network-long short-term memory network (CNN-LSTM)-based solution that can predict faulty remote water meter reading (RWMR) devices by analyzing approximately 2,850,000 AMI data collected from 2762 customers over 360 days in a small-sized city in South Korea. The AMI data used in this study is a challenging, highly unbalanced real-world dataset with limited features. First, we perform extensive preprocessing steps and extract meaningful features for handling this challenging dataset with limited features. Next, we select important features that have a higher influence on the classifier using a recursive feature elimination method. Finally, we apply the CNN-LSTM model for predicting faulty RWMR devices. We also propose an efficient training method for ML models to learn the unbalanced real-world AMI dataset. A cost-effective threshold for evaluating the performance of ML models is proposed by considering the mispredictions of ML models as well as the cost. Our experimental results show that an F-measure of 0.82 and MCC of 0.83 are obtained when the CNN-LSTM model is used for prediction.
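The abstract reports an F-measure and a Matthews correlation coefficient (MCC), both of which are commonly preferred over accuracy on unbalanced data. The following is a minimal sketch of the standard definitions of these two metrics computed from confusion-matrix counts. It is not the authors' code, and the counts in the example are hypothetical values chosen for illustration, not results from the paper.

```python
import math


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall (F1) from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for a highly unbalanced test set:
# 10 faulty devices (positives) among 1000 devices.
tp, tn, fp, fn = 8, 988, 2, 2

accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"accuracy  = {accuracy:.3f}")
print(f"F-measure = {f_measure(tp, fp, fn):.3f}")
print(f"MCC       = {mcc(tp, tn, fp, fn):.3f}")
```

With these illustrative counts, accuracy is close to 1 because the negative class dominates, while the F-measure and MCC stay near 0.8 and reflect the missed and falsely flagged devices more directly.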
Pages: 20