Big data analytics for identifying electricity theft using machine learning approaches in microgrids for smart communities

Cited by: 20
Authors
Arif, Arooj [1]
Javaid, Nadeem [1]
Aldegheishem, Abdulaziz [2]
Alrajeh, Nabil [3]
Affiliations
[1] COMSATS Univ Islamabad, Dept Comp Sci, Islamabad 44000, Pakistan
[2] King Saud Univ KSU, Coll Architecture & Planning, Urban Planning Dept, Riyadh, Saudi Arabia
[3] King Saud Univ KSU, Biomed Technol Dept, Coll Appl Med Sci, Riyadh, Saudi Arabia
Keywords
big data; electricity theft detection; hyperactive optimization toolkit; machine learning; smart grids; urban planning; IMBALANCED DATA; OPTIMIZATION; SYSTEMS;
DOI
10.1002/cpe.6316
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Electricity theft (ET) causes major revenue loss for power utilities. It reduces the quality of supply, raises production costs, forces legitimate consumers to pay higher prices, and harms the economy as a whole. In this article, we use the State Grid Corporation of China (SGCC) dataset, which contains 1035 days of electricity consumption data for two classes: normal and fraudulent. We propose an ET detection model that consists of four steps: interpolation, data balancing, feature extraction, and classification. First, missing values in the dataset are recovered by interpolation. Second, the data are resampled: ET consumers make up only 9% of the SGCC dataset, and this imbalance prevents a model from classifying both classes (normal and theft) correctly. To address it, a hybrid resampling technique is proposed, named synthetic minority oversampling technique with near miss. Third, a residual network extracts latent features from the SGCC dataset. Fourth, three tree-based classifiers, decision tree (DT), random forest (RF), and adaptive boosting (AdaBoost), are trained on the encoded feature vectors for classification. Because searching for good hyperparameters is a challenging task that is usually done manually and takes considerable time, a Bayesian optimizer is used to simplify the tuning of DT, RF, and AdaBoost. The results indicate that RF outperforms DT and AdaBoost.
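The first two steps of the abstract (interpolation of missing readings, then hybrid resampling) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names (`interpolate_missing`, `smote_like`, `near_miss_like`, `hybrid_resample`), the use of `None` for a missing reading, the choice of linear interpolation, the NearMiss-1 style selection rule, and the `target` class size are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def interpolate_missing(series):
    """Fill None gaps linearly; gaps at the edges copy the nearest known value."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled


def smote_like(minority, n_new, k=3, seed=0):
    """SMOTE-style oversampling: each synthetic point lies on the segment
    between a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


def near_miss_like(majority, minority, n_keep, k=3):
    """NearMiss-1 style undersampling: keep the majority samples whose mean
    distance to their k nearest minority samples is smallest."""
    def score(p):
        nearest = sorted(math.dist(p, m) for m in minority)[:k]
        return sum(nearest) / len(nearest)
    return sorted(majority, key=score)[:n_keep]


def hybrid_resample(majority, minority, target, k=3, seed=0):
    """Oversample the minority class up to `target` and undersample the
    majority class down to `target` (an assumed way of combining the two)."""
    n_new = max(0, target - len(minority))
    new_minority = minority + smote_like(minority, n_new, k, seed)
    new_majority = near_miss_like(majority, minority, min(target, len(majority)), k)
    return new_majority, new_minority


# Toy usage with made-up numbers (not SGCC data).
readings = interpolate_missing([4.2, None, None, 5.1, None])
normal = [[float(i), float(i % 3)] for i in range(20)]
theft = [[0.0, 0.0], [1.0, 0.5], [0.5, 1.0]]
balanced_normal, balanced_theft = hybrid_resample(normal, theft, target=8, k=2)
print(len(readings), len(balanced_normal), len(balanced_theft))
```

In practice, library implementations of SMOTE and NearMiss would normally replace these hand-written helpers, and the balanced samples would then feed the feature extractor and the tree-based classifiers described above.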
Pages: 21