Anomaly Detection with Machine Learning Algorithms and Big Data in Electricity Consumption

Cited by: 38
Authors
Oprea, Simona-Vasilica [1]
Bara, Adela [1]
Puican, Florina Camelia [1]
Radu, Ioan Cosmin [2]
Affiliations
[1] Bucharest Univ Econ Studies, Dept Econ Informat & Cybernet, Romana Sq 6, Bucharest 010374, Romania
[2] Univ Politehn Bucuresti, Dept Engn Foreign Languages, Splaiul Independent 313, Bucharest 060042, Romania
Keywords
anomaly detection; unsupervised and supervised machine learning; big data; smart grid; fraud detection; detection framework; theft detection; energy theft
DOI
10.3390/su131910963
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Analyzing smart metering data can reveal both reading errors and fraud. The purpose of this analysis is to alert utility companies to suspicious consumption behavior that can then be investigated further through on-site inspections or other methods. Applying Machine Learning (ML) algorithms to consumption readings can identify malfunctions, cyberattacks that interrupt measurements, or physical tampering with smart meters. Fraud detection is a classical example of anomaly detection, because consumption and transactional data are not easy to label. Furthermore, frauds differ in nature, and learning is not always possible. In this paper, we analyze large datasets of readings provided by smart meters installed in a trial study in Ireland by applying a hybrid approach. More precisely, we propose an unsupervised ML technique to detect anomalous values in the time series, establish a threshold for the percentage of anomalous readings out of the total readings, and then label each time series as suspicious or not. Initially, we propose two types of algorithms for anomaly detection on unlabeled data: Spectral Residual-Convolutional Neural Network (SR-CNN) and an anomaly trained model based on martingales for determining variations in time-series data streams. Then, the Two-Class Boosted Decision Tree and Fisher Linear Discriminant analysis are applied to the previously processed dataset. The trained model detects suspicious consumers with an accuracy of 90%, a precision score of 0.875, and an F1 score of 0.894.
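The labeling step described in the abstract (flag anomalous readings, compute their share of all readings, compare that share with a threshold) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration only: the record does not give the threshold value, and the median-based detector `flag_anomalies` is a simple stand-in, not the SR-CNN or martingale-based models used in the paper.

```python
from statistics import median


def flag_anomalies(readings, k=3.0):
    """Stand-in detector (NOT SR-CNN or martingales): flag readings whose
    distance from the median exceeds k times the median absolute deviation."""
    med = median(readings)
    mad = median(abs(x - med) for x in readings)
    if mad == 0:
        # Degenerate case: most readings are identical, so flag any that differ.
        return [x != med for x in readings]
    return [abs(x - med) / mad > k for x in readings]


def anomaly_share(flags):
    """Fraction of readings flagged as anomalous (0.0 for an empty series)."""
    return sum(flags) / len(flags) if flags else 0.0


def label_series(flags, threshold=0.05):
    """Label a time series by its share of anomalous readings.
    The default threshold of 0.05 is a hypothetical value."""
    return "suspicious" if anomaly_share(flags) > threshold else "normal"


if __name__ == "__main__":
    # Hypothetical half-hourly consumption values for one meter.
    readings = [1.0] * 18 + [9.0, 10.0]
    flags = flag_anomalies(readings)
    print(anomaly_share(flags), label_series(flags))
```

In a hybrid setup of the kind the abstract outlines, labels produced this way would then serve as the target for a supervised classifier; the threshold controls how many consumers are marked for follow-up inspection.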
Pages: 20
Related papers
50 in total
  • [41] Evaluation of Machine Learning-based Anomaly Detection Algorithms on an Industrial Modbus/TCP Data Set
    Anton, Simon Duque
    Kanoor, Suneetha
    Fraunholz, Daniel
    Schotten, Hans Dieter
    13TH INTERNATIONAL CONFERENCE ON AVAILABILITY, RELIABILITY AND SECURITY (ARES 2018), 2019
  • [42] Application of Machine Learning Algorithms in the Development and Consumption Trend of Green and Intelligent Vehicles under the Background of Big Data
    Liang, Benshuang
    Yang, Jing
    Guo, Yaxin
    Guo, Xin
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
  • [43] Anomaly Detection for Environmental Data Using Machine Learning Regression
    Yuan, Fuqing
    Lu, Jinmei
    6TH ANNUAL INTERNATIONAL CONFERENCE ON MATERIAL SCIENCE AND ENVIRONMENTAL ENGINEERING, 2019, 472
  • [44] Anomaly detection based on joint spatio-temporal learning for building electricity consumption
    Kong, Jun
    Jiang, Wen
    Tian, Qing
    Jiang, Min
    Liu, Tianshan
    APPLIED ENERGY, 2023, 334
  • [45] Anomaly Detection in Renewable Energy Big Data Using Deep Learning
    Katamoura, Suzan MohammadAli
    Aksoy, Mehmet Sabih
    INTERNATIONAL JOURNAL OF INTELLIGENT INFORMATION TECHNOLOGIES, 2023, 19 (01)
  • [46] An Algorithm Design of Big Data Anomaly Detection Based on Ensemble Learning
    Chen, Xiao
    PROCEEDINGS OF 2024 INTERNATIONAL CONFERENCE ON COMPUTER AND MULTIMEDIA TECHNOLOGY, ICCMT 2024, 2024: 319-323
  • [47] Predicting Student Success Using Big Data and Machine Learning Algorithms
    Ouatik, Farouk
    Erritali, Mohammed
    Ouatik, Fahd
    Jourhmane, Mostafa
    INTERNATIONAL JOURNAL OF EMERGING TECHNOLOGIES IN LEARNING, 2022, 17 (12): 236-251
  • [48] Air Quality Forecasting Using Big Data and Machine Learning Algorithms
    Koo, Youn-Seo
    Choi, Yunsoo
    Ho, Chang-Hoi
    ASIA-PACIFIC JOURNAL OF ATMOSPHERIC SCIENCES, 2023, 59 (05): 529-530
  • [49] Big data and machine learning algorithms for health-care delivery
    Ngiam, Kee Yuan
    Khor, Ing Wei
    LANCET ONCOLOGY, 2019, 20 (05): E262-E273
  • [50] Privacy-preserving Machine Learning Algorithms for Big Data Systems
    Xu, Kaihe
    Yue, Hao
    Guo, Linke
    Guo, Yuanxiong
    Fang, Yuguang
    2015 IEEE 35TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS, 2015: 318-327