The Effect of Dataset Imbalance on the Performance of SCADA Intrusion Detection Systems

Cited by: 22
Authors
Balla, Asaad [1]
Habaebi, Mohamed Hadi [1]
Elsheikh, Elfatih A. A. [2]
Islam, Md. Rafiqul [1]
Suliman, F. M. [2]
Affiliations
[1] Int Islamic Univ Malaysia, Dept Elect & Comp Engn, Kuala Lumpur 53100, Malaysia
[2] King Khalid Univ, Coll Engn, Dept Elect Engn, Abha 61421, Saudi Arabia
Keywords
IDS; ICS; SCADA; imbalanced datasets; cyber security
DOI
10.3390/s23020758
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
Integrating IoT devices into SCADA systems has made data collection and transmission more efficient. This enhancement comes with significant security challenges, because it exposes traditionally isolated systems to the public internet. Effective and highly reliable security devices, such as intrusion detection systems (IDSs) and intrusion prevention systems (IPSs), are critical. Many studies have used deep learning algorithms to design an efficient IDS; however, the fundamental issue of imbalanced datasets has not been fully addressed. In our research, we examined the impact of data imbalance on developing an effective SCADA-based IDS. To investigate the impact of various data balancing techniques, including random sampling, one-sided selection (OSS), near-miss, SMOTE, and ADASYN, we chose two imbalanced datasets: the Morris power dataset and the CICIDS2017 dataset. For binary classification, convolutional neural networks were coupled with long short-term memory (CNN-LSTM). The system's effectiveness was evaluated using metrics derived from the confusion matrix, such as accuracy, precision, detection rate, and F1-score. Four experiments on the two datasets demonstrate the impact of the data imbalance. This research aims to help security researchers understand imbalanced datasets and their impact on deep learning (DL) SCADA-IDSs.
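The abstract's central point, that imbalance can hide poor attack detection behind a high accuracy figure, can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy labels, the function names (`confusion_counts`, `metrics`, `random_oversample`), and the reading of "detection rate" as recall (TP / (TP + FN)) are assumptions made here for illustration, and the oversampler is plain random duplication rather than SMOTE or ADASYN.

```python
import random
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary task; `positive` marks the attack class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, detection rate (assumed = recall), and F1."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + detection_rate
    f1 = 2 * precision * detection_rate / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "detection_rate": detection_rate,
        "f1": f1,
    }


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority samples until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        indices = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - n):
            i = rng.choice(indices)
            X_out.append(X[i])
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: 95 normal records (0) and 5 attack records (1).
y_true = [0] * 95 + [1] * 5
X = [[i] for i in range(len(y_true))]

# A degenerate "classifier" that always predicts the majority class.
y_pred_majority = [0] * len(y_true)
scores = metrics(*confusion_counts(y_true, y_pred_majority))

# Balance the classes by random oversampling of the minority class.
X_bal, y_bal = random_oversample(X, y_true)
balanced_counts = Counter(y_bal)

print(scores)
print(dict(balanced_counts))
```

On this toy split the always-normal predictor reaches high accuracy while its detection rate and F1-score are zero, which is why accuracy alone is a weak indicator on imbalanced intrusion data. Oversampling of this kind would normally be applied to the training split only, so that the test set keeps its original class distribution.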
Pages: 12