Comparison of different clustering approaches on different databases of smart meter data

Times cited: 0
Authors
Ferrando, Martina [1, 2]
Nozza, Debora [3 ]
Hong, Tianzhen [2 ]
Causone, Francesco [1 ]
Affiliations
[1] Politecn Milan, Milan, Italy
[2] Lawrence Berkeley Natl Lab, Berkeley, CA 94720 USA
[3] Univ Bocconi, Milan, Italy
Source
PROCEEDINGS OF BUILDING SIMULATION 2021: 17TH CONFERENCE OF IBPSA | 2022 / Vol. 17
Keywords
CLASSIFICATION;
DOI
10.26868/25222708.2021.30193
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Various clustering methods have been applied to identify representative groups of buildings based on their energy use patterns. We reviewed and selected the most commonly used clustering methods, namely k-means, k-medoids, a Self-Organizing Map (SOM) coupled with k-means, and hierarchical clustering, and compared their performance with that of our proposed deep clustering algorithm on smart meter datasets. After data preparation (cleaning, segmentation, and normalization), the clustering is run twice: first with the number of clusters left free to be chosen by the optimization process, and then with it forced to equal the number of primary building functions. Depending on the purpose of the clustering, e.g., identifying daily 24-hour load shapes or identifying the primary building use type (e.g., office, residential, school, retail), the optimal number of clusters can vary greatly. Thus, constraining the number of clusters according to the final aim is the most commonly followed and recommended approach for engineering purposes. The k-means, k-medoids, and hierarchical algorithms show the best results in all cases, whereas, given the nature of the databases, the additional step of coupling a SOM with k-means does not improve the evaluation metrics. The direct comparison of the different algorithms gives a clear overview of the main existing clustering approaches and of their performance in capturing typical use patterns in typical smart meter databases. The resulting cluster centroids could be used to better understand and characterize the energy use patterns of different buildings and building typologies, with the final aim of benchmarking or customer segmentation.
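The two-stage procedure described in the abstract (normalize the profiles, cluster with a free number of clusters, then cluster again with a forced number) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic 24-hour profiles, the min-max normalization, the farthest-point initialization, the use of the Davies-Bouldin index as the evaluation metric, and all function names are assumptions made only for this example.

```python
import math

def normalize(profile):
    """Min-max scale one daily load profile to [0, 1] (assumed normalization)."""
    lo, hi = min(profile), max(profile)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in profile]

def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous centroid.
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin index (lower is better); empty clusters are skipped."""
    used = sorted(set(labels))
    if len(used) < 2:
        return math.inf
    scatter = {}
    for j in used:
        members = [p for p, lab in zip(points, labels) if lab == j]
        scatter[j] = sum(math.sqrt(sq_dist(p, centroids[j])) for p in members) / len(members)
    total = 0.0
    for i in used:
        worst = 0.0
        for j in used:
            if i != j:
                gap = math.sqrt(sq_dist(centroids[i], centroids[j]))
                worst = max(worst, (scatter[i] + scatter[j]) / gap if gap else math.inf)
        total += worst
    return total / len(used)

# Hypothetical raw data: three office-like and three residential-like days.
def office(start):
    return [5.0 if start <= h < start + 10 else 1.0 for h in range(24)]

def home(start):
    return [3.0 if start <= h < start + 4 else 0.5 for h in range(24)]

raw = [office(s) for s in (7, 8, 9)] + [home(s) for s in (17, 18, 19)]
profiles = [normalize(p) for p in raw]

# Stage 1: number of clusters left free, chosen by the evaluation metric.
scores = {}
for k in range(2, 5):
    labels, centroids = kmeans(profiles, k)
    scores[k] = davies_bouldin(profiles, labels, centroids)
free_k = min(scores, key=scores.get)

# Stage 2: number of clusters forced to the number of building functions (2 here).
forced_labels, forced_centroids = kmeans(profiles, 2)
print("free k:", free_k, "| forced labels:", forced_labels)
```

On a toy dataset like this one, the metric-driven choice of k need not coincide with the number of building functions, which is the kind of discrepancy the abstract refers to when it recommends constraining the number of clusters according to the final aim.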
Pages: 1155-1162
Number of pages: 8