A Cost-Effective CNN-LSTM-Based Solution for Predicting Faulty Remote Water Meter Reading Devices in AMI Systems

Cited by: 3

Authors:

Lee, Jaeseung [1]

Choi, Woojin [1]

Kim, Jibum [1]

Affiliations:

[1] Incheon Natl Univ, Dept Comp Sci & Engn, Incheon 22012, South Korea

Source:

SENSORS | 2021 / Vol. 21 / Issue 18

Funding:

National Research Foundation, Singapore;

Keywords:

machine learning; advanced meter infrastructure (AMI); CNN-LSTM; deep learning; water; fault detection;

DOI:

10.3390/s21186229

Chinese Library Classification (CLC) number:

O65 [Analytical Chemistry];

Discipline classification codes:

070302 ; 081704 ;

Abstract:

Automatic meter infrastructure (AMI) systems using remote metering are widely used to utilize water resources efficiently and minimize non-revenue water. We propose a convolutional neural network-long short-term memory network (CNN-LSTM)-based solution that predicts faulty remote water meter reading (RWMR) devices by analyzing approximately 2,850,000 AMI data records collected from 2762 customers over 360 days in a small city in South Korea. The AMI data used in this study form a challenging, highly unbalanced real-world dataset with limited features. First, we perform extensive preprocessing and extract meaningful features to handle this dataset. Next, we use a recursive feature elimination method to select the features that have the greatest influence on the classifier. Finally, we apply the CNN-LSTM model to predict faulty RWMR devices. We also propose an efficient training method that lets ML models learn from the unbalanced real-world AMI dataset. A cost-effective threshold for evaluating the performance of ML models is proposed by considering both the mispredictions of the ML models and the cost. Our experimental results show that the CNN-LSTM model achieves an F-measure of 0.82 and a Matthews correlation coefficient (MCC) of 0.83.
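The abstract reports an F-measure and an MCC and mentions a threshold chosen with cost in mind, but it does not give the cost model. The following is only a minimal sketch of how such quantities are commonly computed for an unbalanced binary problem. The function names, the toy labels and scores, and the per-error costs `cost_fp` and `cost_fn` are all hypothetical illustrations, not the authors' method or data.

```python
import math


def confusion(y_true, scores, threshold):
    """Count (TP, FP, TN, FN) when scores >= threshold are flagged as faulty (1)."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def f_measure(tp, fp, tn, fn):
    """F1 score: harmonic mean of precision and recall (tn is unused)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0 when a marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def cheapest_threshold(y_true, scores, cost_fp, cost_fn, candidates):
    """Pick the candidate threshold with the lowest total misprediction cost.

    ASSUMPTION: a simple linear cost, cost_fp per false alarm plus cost_fn per
    missed faulty device. The paper's actual cost model is not in the abstract.
    """
    def total_cost(t):
        _, fp, _, fn = confusion(y_true, scores, t)
        return cost_fp * fp + cost_fn * fn

    return min(candidates, key=total_cost)


# Hypothetical toy data: 1 = faulty device, 0 = normal device.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.60, 0.70, 0.90]

counts = confusion(y_true, scores, 0.5)
print("F-measure:", round(f_measure(*counts), 3))
print("MCC:", round(mcc(*counts), 3))
print("Threshold when misses are costly:",
      cheapest_threshold(y_true, scores, cost_fp=1, cost_fn=5,
                         candidates=[0.3, 0.5, 0.7]))
```

In this sketch, raising the cost of a missed faulty device relative to a false alarm pushes the selected threshold lower, which is the general trade-off a cost-aware threshold is meant to capture.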

References

Pages: 20

20 entries in total

[1]

Abadi M, 2016, PROCEEDINGS OF OSDI'16: 12TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P265

[2]

[Anonymous], 2006, P 23 INT C MACH LEAR, DOI 10.1145/1143844.1143874

[3] Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric [J].

Boughorbel, Sabri ;

Jarray, Fethi ;

El-Anbari, Mohammed .

PLOS ONE, 2017, 12 (06)

[4] Random forests [J].

Breiman, L .

MACHINE LEARNING, 2001, 45 (01) :5-32

[5] An introduction to ROC analysis [J].

Fawcett, Tom .

PATTERN RECOGNITION LETTERS, 2006, 27 (08) :861-874

[6] Gene selection for cancer classification using support vector machines [J].

Guyon, I ;

Weston, J ;

Barnhill, S ;

Vapnik, V .

MACHINE LEARNING, 2002, 46 (1-3) :389-422

[7] Analytics-driven asset management [J].

Hampapur, A. ;

Cao, H. ;

Davenport, A. ;

Dong, W. S. ;

Fenhagen, D. ;

Feris, R. S. ;

Goldszmidt, G. ;

Jiang, Z. B. ;

Kalagnanam, J. ;

Kumar, T. ;

Li, H. ;

Liu, X. ;

Mahatma, S. ;

Pankanti, S. ;

Pelleg, D. ;

Sun, W. ;

Taylor, M. ;

Tian, C. H. ;

Wasserkrug, S. ;

Xie, L. ;

Lodhi, M. ;

Kiely, C. ;

Butturff, K. ;

Desjardins, L. .

IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 2011, 55 (1-2)

[8] Learning from Imbalanced Data [J].

He, Haibo ;

Garcia, Edwardo A. .

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2009, 21 (09) :1263-1284

[9]

Hochreiter S., 1997, Neural Computation, V9, P1735

[10] Agreement, the F-measure, and reliability in information retrieval [J].

Hripcsak, G ;

Rothschild, AS .

JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2005, 12 (03) :296-298
