A case study to examine the imputation of missing data to improve clustering analysis of building electrical demand

Cited by: 10

Authors:

Inman, Daniel ^{[1]}

Elmore, Ryan ^{[2]}

Bush, Brian ^{[1]}

Affiliations:

[1] Natl Renewable Energy Lab, Strateg Energy Anal Ctr, Golden, CO 80401 USA

[2] Natl Renewable Energy Lab, Computat Sci Ctr, Golden, CO 80401 USA

Source:

BUILDING SERVICES ENGINEERING RESEARCH & TECHNOLOGY | 2015, Vol. 36, No. 5

Keywords:

Clustering; missing data; building electrical demand; FAULT-DETECTION; SYSTEMS

DOI:

10.1177/0143624415573215

Chinese Library Classification (CLC) number:

TU [Building Science]

Discipline classification code:

0813

Abstract:

Building performance data are widely used for daily operation, improving building efficiency, identifying and diagnosing performance problems, and commissioning. In this study, the authors explore the use of missing data imputation and clustering on an electrical demand dataset. The objective was to compare four approaches to data imputation and clustering analysis. The results suggest that using multiple imputation to fill in missing data before performing clustering analysis yields more informative clusters. Commonly used methods for filling in missing data lead to changes in cluster membership that do not reflect a change in the building's performance but are instead a result of the imputation method chosen.

Practical application: The authors demonstrate, through a case study, the application of a statistically sound method for filling in missing data in large building performance datasets. The methods used in this analysis are available through the open-source programming language R and are straightforward to implement. The approach demonstrated in this case study could aid building analysts with fault detection and continuous commissioning of large commercial buildings.
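Illustrative sketch (not the authors' code): the abstract describes filling in missing demand readings by multiple imputation before clustering. The Python snippet below is only a minimal illustration of that workflow under loud assumptions. The study itself used R, and the specific imputation model and clustering algorithm are not stated in this record. Here, a crude hot-deck draw stands in for a proper imputation model, a one-dimensional two-cluster k-means on daily mean demand stands in for the clustering step, and all function names and data values are hypothetical.

```python
import random
from statistics import mean


def impute_once(profiles, rng):
    """Return one completed copy of the data. Each missing (None) reading is
    replaced by a random draw from the observed readings at the same hour.
    ASSUMPTION: a simple hot-deck draw, used as a stand-in for a real
    multiple-imputation model."""
    n_hours = len(profiles[0])
    observed = [[p[h] for p in profiles if p[h] is not None]
                for h in range(n_hours)]
    return [[v if v is not None else rng.choice(observed[h])
             for h, v in enumerate(p)]
            for p in profiles]


def two_means(values, iters=20):
    """1-D k-means with k=2. Label 0 is always the lower-demand cluster,
    so labels are comparable across imputed datasets."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        low = [v for v, lab in zip(values, labels) if lab == 0]
        high = [v for v, lab in zip(values, labels) if lab == 1]
        lo = mean(low) if low else lo
        hi = mean(high) if high else hi
    return labels


def cluster_with_multiple_imputation(profiles, m=5, seed=0):
    """Create m completed datasets, cluster each one, and assign every day
    to the cluster it receives in the majority of the m runs."""
    rng = random.Random(seed)
    votes = [0] * len(profiles)
    for _ in range(m):
        completed = impute_once(profiles, rng)
        labels = two_means([mean(day) for day in completed])
        votes = [v + lab for v, lab in zip(votes, labels)]
    return [1 if 2 * v > m else 0 for v in votes]


# Hypothetical daily demand profiles (kW) with missing readings.
profiles = [
    [10, 12, None, 11],
    [11, None, 12, 10],
    [40, 42, 41, None],
    [None, 39, 43, 40],
]
labels = cluster_with_multiple_imputation(profiles)
print(labels)  # [0, 0, 1, 1]
```

Clustering each completed dataset separately and then pooling the assignments keeps the uncertainty about the missing values visible: a day whose cluster label flips between imputations is flagged by a split vote rather than hidden by a single filled-in value.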


Pages: 628-637

Page count: 10
