Three-stage data generation algorithm for multiclass network intrusion detection with highly imbalanced dataset

Cited by: 0
Authors
Chui K.T. [1]
Gupta B.B. [2, 3, 9, 10, 11]
Chaurasia P. [4]
Arya V. [5, 12]
Almomani A. [6, 7]
Alhalabi W. [8]
Affiliations
[1] Department of Electronic Engineering and Computer Science, School of Science and Technology, Hong Kong Metropolitan University
[2] International Center for AI and Cyber Security Research and Innovations (CCRI)
[3] Department of Computer Science and Information Engineering, Asia University
[4] School of Computing, Engineering and Intelligent Systems, Ulster University
[5] Department of Business Administration, Asia University
[6] School of Information Technology, Skyline University College, Sharjah
[7] Immersive Virtual Reality Research Group, King Abdulaziz University, Jeddah
[8] Symbiosis Centre for Information Technology (SCIT), Symbiosis International University, Pune
[9] Lebanese American University, Beirut
[10] Birkbeck, University of London
[11] Chandigarh University, Chandigarh
Source
International Journal of Intelligent Networks | 2023 / Vol. 4
Keywords
Convolutional neural network; Data generation; Generative adversarial network; Kernel function; Multiclass classification; Network intrusion detection; Support vector machine; Synthetic minority over-sampling technique
DOI
10.1016/j.ijin.2023.08.001
Abstract
The Internet plays a crucial role in our daily routines, and ensuring cybersecurity for Internet users provides a safe online environment. Automatic network intrusion detection (NID) using machine learning algorithms has recently received increased attention. Because datasets are highly imbalanced across different types of attacks, an NID model is prone to bias towards the classes with more training samples. The main challenge in generating additional training data for minority classes is that the generated data are insufficient. This study addresses that challenge and extends the data generation ability by proposing a three-stage data generation algorithm that uses the synthetic minority over-sampling technique (SMOTE), a generative adversarial network (GAN), and a variational autoencoder. A convolutional neural network is employed to extract representative features from the data, which are then fed into a support vector machine with a customised kernel function. An ablation study evaluated the effectiveness of the three-stage data generation, the feature extraction, and the customised kernel. This was followed by a performance comparison between this study and existing studies. The findings revealed that the proposed NID model achieved an accuracy of 91.9%–96.2% on the four benchmark datasets. In addition, it outperformed existing methods, such as GAN-based deep neural networks, a conditional Wasserstein GAN-based stacked autoencoder, a SMOTE-based random forest, and a variational autoencoder-based deep neural network, by 1.51%–28.4%. © 2023 The Authors
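The first of the three generation stages named in the abstract, SMOTE, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote_oversample`, the neighbour count `k`, the fixed `seed`, and the toy data are all assumptions made for illustration, and the GAN and variational autoencoder stages are not shown. The sketch only demonstrates the core SMOTE idea of interpolating between a minority-class sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority
    samples and their k nearest minority-class neighbours (hypothetical
    helper illustrating the basic SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours and interpolate towards it.
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class: four 2-D feature vectors (illustrative data only).
minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_oversample(minority_class, n_new=6)
```

Because every synthetic point lies on a line segment between two existing minority samples, the new data stay inside the region the minority class already occupies. This limited diversity is one plausible reason for combining SMOTE with generative models, as the abstract proposes.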
Pages: 202-210
Number of pages: 8