SMMO-CoFS: Synthetic Multi-minority Oversampling with Collaborative Feature Selection for Network Intrusion Detection System

Cited by: 0
Authors
Yeshalem Gezahegn Damtew
Hongmei Chen
Affiliations
[1] Southwest Jiaotong University, School of Computing and Artificial Intelligence
[2] Debre Berhan University, College of Computing Science
Source
International Journal of Computational Intelligence Systems, Vol. 16
Keywords
Multi-class balancing; Multi-minority over-sampling; Feature selection; Machine learning; Network intrusion detection system;
DOI
Not available
CLC classification number
Subject classification number
Abstract
Many studies aim to improve the performance of network intrusion detection systems. However, false alarm rates remain high and intrusions are still missed because of class imbalance in multi-class datasets. This imbalanced class distribution results in low detection accuracy for the minority classes. This paper proposes a Synthetic Multi-minority Oversampling (SMMO) framework integrated with a collaborative feature selection (CoFS) approach for network intrusion detection systems. Our framework aims to increase the detection accuracy of the extreme minority classes (i.e., user-to-root and remote-to-local attacks) by improving the dataset’s class distribution and selecting relevant features. In our framework, SMMO generates synthetic data and iteratively over-samples the multiple minority classes, while correlation-based feature selection, in collaboration with an evolutionary algorithm, selects the essential features. We evaluate our framework with random forest, J48, BayesNet, and AdaBoostM1 classifiers. On the multi-class NSL-KDD dataset, the experimental results show that the proposed framework significantly improves the detection accuracy of the extreme minority classes compared with other approaches.
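The abstract does not spell out the SMMO procedure, so the following is only a minimal sketch of the general idea of oversampling several minority classes with synthetic data. It assumes SMOTE-style interpolation between a sample and one of its nearest same-class neighbours, and it assumes every minority class is grown to the size of the largest class. The function name `oversample_minorities`, the parameter `k`, and the toy data are hypothetical and are not taken from the paper. The feature selection step (CoFS) is not covered.

```python
import random
from collections import Counter, defaultdict
from math import dist


def oversample_minorities(X, y, k=3, seed=0):
    """SMOTE-style oversampling of every non-majority class (illustrative only)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    # Assumption: each class is grown to the size of the largest class.
    target = max(len(rows) for rows in by_class.values())

    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        if len(rows) < 2:
            continue  # interpolation needs at least two samples of the class
        for _ in range(target - len(rows)):
            base = rng.choice(rows)
            # The k nearest same-class neighbours of the base sample.
            neighbours = sorted(
                (r for r in rows if r is not base),
                key=lambda r: dist(base, r),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position on the segment base -> other
            X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


# Toy data: one majority class and two minority classes (labels are made up).
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
     [5.0, 5.0], [5.2, 5.1],
     [9.0, 1.0], [9.1, 1.2]]
y = ["normal"] * 4 + ["u2r"] * 2 + ["r2l"] * 2

X_bal, y_bal = oversample_minorities(X, y)
class_counts = Counter(y_bal)
```

In this toy run each class ends up with four samples, and every synthetic point lies on a line segment between two original samples of the same class. A real pipeline would apply this to the training split only, so that synthetic points never leak into the evaluation data.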