A data mining-based framework for the identification of daily electricity usage patterns and anomaly detection in building electricity consumption data

Cited by: 103
Authors
Liu, Xue [1, 2]
Ding, Yong [1, 2]
Tang, Hao [1, 2]
Xiao, Feng [3]
Affiliations
[1] Chongqing Univ, Minist Educ, Joint Int Res Lab Green Bldg & Built Environm, Chongqing 400045, Peoples R China
[2] Chongqing Univ, Minist Sci & Technol, Natl Ctr Int Res Low Carbon & Green Bldg, Chongqing 400045, Peoples R China
[3] Southwestern Univ Finance & Econ, Sch Business Adm, Chengdu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Building energy management; Time series clustering; Decision tree; Knowledge discovery; Electricity usage pattern; Data mining; LOAD PROFILES; CLUSTER-ANALYSIS; MIXTURE MODEL; BENCHMARKING; PREDICTION; ANALYTICS; STRATEGY;
DOI
10.1016/j.enbuild.2020.110601
Chinese Library Classification
TU [Building Science];
Discipline classification code
0813;
Abstract
With the development of advanced information techniques, smart energy meters have made a considerable amount of real-time electricity consumption data available. These data provide a promising way to understand energy usage patterns and improve building energy management. However, previous studies have paid more attention to methodologies for the identification of energy usage patterns and are limited in the interpretability and applications of the patterns. In this context, this paper proposes a general data mining-based framework that can extract typical electricity load patterns (TELPs) and discover insightful information hidden in the patterns. The framework integrates multiple data mining techniques and mainly consists of three phases: data preparation, identification of TELPs and knowledge discovery in the patterns. A new clustering method with a two-step clustering analysis is proposed to identify the TELPs at the individual building level. Before clustering, five statistical features that represent the shapes of electricity load profiles are first defined to reduce the dimensions of daily electricity load profiles (DELPs). The first clustering step aims at detecting outlier DELPs by using the density-based spatial clustering of applications with noise (DBSCAN) algorithm, which addresses the data quality issues for electricity consumption data derived from energy consumption monitoring platforms (ECMPs). The second clustering step aims at grouping similar DELPs by means of the k-means algorithm to extract TELPs. The effectiveness of the proposed clustering method is demonstrated by a comparison with two single-step clustering techniques.
Furthermore, a classification and regression tree (CART) algorithm is employed to discover insightful knowledge on TELPs and improve the interpretability of clustering results, namely, to explain the relations between dynamic influencing factors related to electricity consumption and TELPs. The proposed framework is applied to analyze the time-series electricity consumption data of three practical office buildings in Chongqing, and its effectiveness has been confirmed. A potential application of discovered knowledge is presented: early fault detection of anomalous electricity load profiles. The proposed framework can provide building managers with an efficient way to understand the characteristics of building electricity usage patterns and detect anomalies therein. (C) 2020 Elsevier B.V. All rights reserved.
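The pipeline described in the abstract (shape features, DBSCAN outlier removal, k-means grouping, then a decision tree on influencing factors) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the load profiles are synthetic, the five features (mean, standard deviation, minimum, maximum, load factor) are stand-ins because the abstract does not name the paper's features, the `eps`, `min_samples` and `n_clusters` values are arbitrary, and the influencing factors (`is_workday`, `outdoor_temp`) are hypothetical. scikit-learn's `DecisionTreeClassifier`, which its documentation describes as an optimised version of CART, stands in for the CART step.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
hours = np.arange(24)

def synthetic_day(kind):
    """Hypothetical hourly load profile in kW (not data from the paper)."""
    load = 20 + rng.normal(0, 1, 24)              # base load with noise
    if kind == "workday":
        load[(hours >= 8) & (hours < 18)] += 60   # occupied-hours plateau
    elif kind == "faulty":
        load += 280                               # implausibly high readings
    return load

kinds = ["workday"] * 60 + ["weekend"] * 30 + ["faulty"] * 3
profiles = np.array([synthetic_day(k) for k in kinds])

def shape_features(p):
    """Five illustrative shape features per day (assumed, not the paper's)."""
    mean, peak = p.mean(axis=1), p.max(axis=1)
    return np.column_stack([mean, p.std(axis=1), p.min(axis=1), peak, mean / peak])

# Reduce each 24-value profile to five standardized features.
Z = StandardScaler().fit_transform(shape_features(profiles))

# Step 1: DBSCAN labels sparse points as noise (-1); treat them as outlier days.
db_labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(Z)
outlier_mask = db_labels == -1

# Step 2: k-means on the remaining days; cluster means serve as the TELPs.
clean_profiles = profiles[~outlier_mask]
telp_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Z[~outlier_mask])
telps = np.array([clean_profiles[telp_labels == c].mean(axis=0) for c in range(2)])

# Knowledge discovery: a shallow tree relating hypothetical factors to TELP labels.
is_workday = np.array([k == "workday" for k in kinds])[~outlier_mask].astype(int)
outdoor_temp = rng.uniform(5, 35, size=is_workday.size)   # uninformative by design
factors = np.column_stack([is_workday, outdoor_temp])
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(factors, telp_labels)

print("outlier days:", np.where(outlier_mask)[0])
print("TELP peak loads (kW):", telps.max(axis=1).round(1))
```

In this sketch, a new day whose observed cluster differs from the one the tree predicts from its influencing factors would be a candidate anomaly, which is one plausible reading of the fault-detection application mentioned in the abstract.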
Pages: 22