Trimming outliers using trees: Winning solution of the Large-scale Energy Anomaly Detection (LEAD) competition

Cited by: 6
Authors
Fu, Chun [1]
Arjunan, Pandarasamy [2]
Miller, Clayton [1]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
[2] Berkeley Educ Alliance Res Singapore Ltd, Singapore, Singapore
Source
PROCEEDINGS OF THE 2022 THE 9TH ACM INTERNATIONAL CONFERENCE ON SYSTEMS FOR ENERGY-EFFICIENT BUILDINGS, CITIES, AND TRANSPORTATION, BUILDSYS 2022 | 2022
Funding
National Research Foundation, Singapore;
Keywords
Building energy; Smart meter; Anomaly detection; Supervised learning; Classification; FAULT-DETECTION; THEFT;
DOI
10.1145/3563357.3566147
Chinese Library Classification number
TU [Building Science];
Discipline classification code
0813;
Abstract
Prediction of building energy consumption using machine learning models has been a focal point of research for decades. However, some causes of forecast errors, particularly data quality, have not been adequately addressed, which may affect the accuracy of forecasting models and subsequent energy management. To address data quality, researchers have long pursued a classifier that can automatically detect time-series anomalies. Large-scale Energy Anomaly Detection (LEAD), a community competition hosted on the Kaggle platform, was created for this purpose and to provide a foundation for benchmarking solutions. In this competition, 200 energy time series from around the world with labeled anomalies were provided to train a classification model that predicts anomalies in another 206 unseen time series. The proposed winning solution is a tree-based supervised anomaly classifier that achieves a ROC-AUC score as high as 0.9866 on the private leaderboard. This article describes and analyzes in depth a variety of commonly employed techniques for improving the classification model. Among these strategies, feature engineering requires the most effort and dominates all other techniques; value-changing features that represent the level of time-series variation have a particularly positive impact. In addition, the classification accuracy of the competition solutions can serve as a benchmark for future research on supervised learning for energy anomaly detection.
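As a rough illustration of the "value-changing features" and the ROC-AUC metric mentioned in the abstract, the following minimal sketch computes a few per-reading change features and scores one of them against labels. It is not the authors' pipeline: the feature names (`diff_prev`, `diff_next`, `dev_from_median`), the window size, and the toy readings are all assumptions made for this example, and the tree-based classifier itself is omitted, with a single feature standing in as the anomaly score.

```python
from statistics import median


def value_change_features(readings, window=3):
    """Return per-timestep features describing how much the value changes.

    Feature names and the window size are illustrative assumptions.
    """
    n = len(readings)
    features = []
    for i, value in enumerate(readings):
        # Neighbouring readings; at the edges fall back to the value itself.
        prev_value = readings[i - 1] if i > 0 else value
        next_value = readings[i + 1] if i < n - 1 else value
        # Median of a centred window, clipped at the series boundaries.
        lo, hi = max(0, i - window), min(n, i + window + 1)
        local_median = median(readings[lo:hi])
        features.append({
            "diff_prev": abs(value - prev_value),
            "diff_next": abs(value - next_value),
            "dev_from_median": abs(value - local_median),
        })
    return features


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random anomaly outscores a
    random normal point, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical hourly meter readings (kWh) with one spike at index 5.
readings = [10.0, 10.5, 10.2, 10.4, 10.1, 55.0, 10.3, 10.6, 10.2, 10.4]
labels = [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

features = value_change_features(readings)
# Stand-in for a classifier's output: one feature used directly as the score.
scores = [f["dev_from_median"] for f in features]
auc = roc_auc(labels, scores)
print(f"ROC-AUC of the single-feature score: {auc:.3f}")
```

In a supervised setting such as the one the abstract describes, features like these would be fed to a tree-based classifier together with the anomaly labels, and ROC-AUC would be computed on the classifier's predicted probabilities rather than on a raw feature.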
Pages: 456 - 461
Page count: 6
Related papers
15 in total
  • [1] Fault detection analysis using data mining techniques for a cluster of smart office buildings
    Capozzoli, Alfonso
    Lauro, Fiorella
    Khan, Imran
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (09) : 4324 - 4338
  • [2] Using the fuzzy linear regression method to benchmark the energy efficiency of commercial buildings
    Chung, William
    [J]. APPLIED ENERGY, 2012, 95 : 45 - 49
  • [3] Gulati Manoj, 2022, arXiv, DOI 10.48550/ARXIV.2203.17256
  • [4] Rule-based classification of energy theft and anomalies in consumers load demand profile
    Jain, Sonal
    Choksi, Kushan A.
    Pindoriya, Naran M.
    [J]. IET SMART GRID, 2019, 2 (04) : 612 - 624
  • [5] Methods for fault detection, diagnostics, and prognostics for building systems - A review, part I
    Katipamula, S
    Brambley, MR
    [J]. HVAC&R RESEARCH, 2005, 11 (01) : 3 - 25
  • [6] Krishna Varun Badrinath, 2016, 2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). Proceedings, P407, DOI 10.1109/DSN.2016.44
  • [7] Fault detection and diagnosis for building cooling system with a tree-structured learning method
    Li, Dan
    Zhou, Yuxun
    Hu, Guoqiang
    Spanos, Costas J.
    [J]. ENERGY AND BUILDINGS, 2016, 127 : 540 - 551
  • [8] Mao Guojun, 2005, Data mining theory and algorithm
  • [9] Limitations of machine learning for building energy prediction: ASHRAE Great Energy Predictor III Kaggle competition error analysis
    Miller, Clayton
    Picchetti, Bianca
    Fu, Chun
    Pantelic, Jovan
    [J]. SCIENCE AND TECHNOLOGY FOR THE BUILT ENVIRONMENT, 2022, 28 (05) : 610 - 627
  • [10] The ASHRAE Great Energy Predictor III competition: Overview and results
    Miller, Clayton
    Arjunan, Pandarasamy
    Kathirgamanathan, Anjukan
    Fu, Chun
    Roth, Jonathan
    Park, June Young
    Balbach, Chris
    Gowri, Krishnan
    Nagy, Zoltan
    Fontanini, Anthony D.
    Haberl, Jeff
    [J]. SCIENCE AND TECHNOLOGY FOR THE BUILT ENVIRONMENT, 2020, 26 (10) : 1427 - 1447