A case study to examine the imputation of missing data to improve clustering analysis of building electrical demand

Cited by: 10
Authors
Inman, Daniel [1]
Elmore, Ryan [2]
Bush, Brian [1]
Affiliations
[1] Natl Renewable Energy Lab, Strateg Energy Anal Ctr, Golden, CO 80401 USA
[2] Natl Renewable Energy Lab, Computat Sci Ctr, Golden, CO 80401 USA
Keywords
Clustering; missing data; building electrical demand; FAULT-DETECTION; SYSTEMS;
DOI
10.1177/0143624415573215
Chinese Library Classification (CLC) number
TU [Building Science];
Discipline classification code
0813;
Abstract
Building performance data are widely used for daily operation, improving building efficiency, identifying and diagnosing performance problems, and commissioning. In this study, the authors explore the use of missing data imputation and clustering on an electrical demand dataset. The objective was to compare four approaches to data imputation and clustering analysis. The results suggest that using multiple imputation to fill in missing data before clustering yields more informative clusters. Commonly used methods for filling in missing data lead to changes in cluster membership that do not reflect a change in the building's performance but are instead a result of the imputation method chosen.
Practical application: The authors demonstrate, through a case study, the application of a statistically sound method for filling in missing data in large building performance datasets. The methods used in this analysis are available through the open-source programming language R and are straightforward to implement. The approach demonstrated in this case study could aid building analysts with fault detection and continuous commissioning of large commercial buildings.
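The general idea of imputing several times before clustering can be illustrated with a minimal Python sketch. It is not the authors' R workflow: the toy demand profiles, the function names (`impute_once`, `kmeans2`, `cluster_with_multiple_imputation`), and the simple random draw from observed same-hour readings (a stand-in for a model-based multiple imputation) are all assumptions made for illustration. Each imputed dataset is clustered separately, and each day's final cluster is taken by majority vote across the imputations.

```python
import random
from collections import Counter

def impute_once(profiles, rng):
    """Fill each missing reading (None) with a random observed value from the same hour.
    Assumption: a crude stand-in for a model-based imputation draw."""
    n_hours = len(profiles[0])
    observed = [[p[h] for p in profiles if p[h] is not None] for h in range(n_hours)]
    return [[v if v is not None else rng.choice(observed[h]) for h, v in enumerate(p)]
            for p in profiles]

def kmeans2(rows, n_iter=20):
    """Two-cluster k-means seeded with the lowest- and highest-energy days,
    so label 0 is the low-demand cluster and label 1 the high-demand cluster."""
    centroids = [list(min(rows, key=sum)), list(max(rows, key=sum))]
    labels = [0] * len(rows)
    for _ in range(n_iter):
        # Assign each day to the nearest centroid (squared Euclidean distance).
        labels = [min((0, 1), key=lambda c: sum((a - b) ** 2 for a, b in zip(r, centroids[c])))
                  for r in rows]
        # Recompute each centroid as the mean of its members.
        for c in (0, 1):
            members = [r for r, l in zip(rows, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_with_multiple_imputation(profiles, m=25, seed=0):
    """Cluster m independently imputed copies and take a per-day majority vote."""
    rng = random.Random(seed)
    runs = [kmeans2(impute_once(profiles, rng)) for _ in range(m)]
    return [Counter(day_labels).most_common(1)[0][0] for day_labels in zip(*runs)]

# Hypothetical daily demand profiles (kW, four readings per day); None marks a gap.
profiles = [
    [10, 12, 11, 10],
    [11, None, 12, 10],
    [50, 55, 52, 51],
    [49, 54, None, 50],
    [10, 11, None, 11],
    [51, None, 53, 52],
]
labels = cluster_with_multiple_imputation(profiles)
print(labels)
```

Seeding k-means from the lowest- and highest-energy days keeps the cluster labels comparable across imputations, which is what makes a simple majority vote meaningful here; with more clusters, a label-matching or consensus-clustering step would be needed instead.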
Pages: 628-637
Number of pages: 10