Performance Analysis of Intrusion Detection Systems Using a Feature Selection Method on the UNSW-NB15 Dataset

Cited by: 251
Authors
Kasongo, Sydney M. [1]
Sun, Yanxia [1]
Affiliations
[1] Univ Johannesburg, Dept Elect & Elect Engn Sci, Kingsway Ave, ZA-2006 Johannesburg, South Africa
Keywords
Machine learning; Feature engineering; Computer networks; Intrusion detection; DATA SET; ALGORITHM; IDS;
DOI
10.1186/s40537-020-00379-6
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Computer network intrusion detection systems (IDSs) and intrusion prevention systems (IPSs) are critical to the success of an organization. Over the past years, IDSs and IPSs using different approaches have been developed and implemented to ensure that computer networks within enterprises are secure, reliable and available. In this paper, we focus on IDSs that are built using machine learning (ML) techniques. IDSs based on ML methods are effective and accurate in detecting network attacks. However, the performance of these systems decreases for high-dimensional data spaces. Therefore, it is crucial to implement an appropriate feature extraction method that prunes the features that have little impact on the classification process. Moreover, many ML-based IDSs suffer from an increased false positive rate and low detection accuracy when the models are trained on highly imbalanced datasets. In this paper, we present an analysis of the UNSW-NB15 intrusion detection dataset, which is used for training and testing our models. Moreover, we apply a filter-based feature reduction technique using the XGBoost algorithm. We then implement the following ML approaches using the reduced feature space: Support Vector Machine (SVM), k-Nearest-Neighbour (kNN), Logistic Regression (LR), Artificial Neural Network (ANN) and Decision Tree (DT). In our experiments, we considered both the binary and multiclass classification configurations. The results demonstrated that the XGBoost-based feature selection method allows methods such as the DT to increase their test accuracy from 88.13% to 90.85% for the binary classification scheme.
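The filter step described in the abstract can be illustrated with a minimal sketch: rank features by an importance score and keep only those at or above a cutoff, then reduce each record to the kept columns before training a classifier. The abstract does not state the scores, the cutoff, or the number of features retained, so everything numeric below is a placeholder. The function names `select_features` and `project` are hypothetical. In practice the scores would come from a fitted XGBoost model (for example the `feature_importances_` attribute of `xgboost.XGBClassifier`); here they are hard-coded so the example runs with the standard library only.

```python
from typing import Dict, List, Sequence


def select_features(importances: Dict[str, float], threshold: float) -> List[str]:
    """Return the names of features scoring at or above the threshold, highest first."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, score in ranked if score >= threshold]


def project(rows: Sequence[Dict[str, float]], keep: Sequence[str]) -> List[List[float]]:
    """Reduce each record to the selected columns, in ranked order."""
    return [[row[name] for name in keep] for row in rows]


# Placeholder importance scores (NOT taken from the paper); the keys are
# a handful of UNSW-NB15 column names used purely for illustration.
scores = {
    "sttl": 0.41,
    "ct_state_ttl": 0.27,
    "sbytes": 0.18,
    "dur": 0.09,
    "sloss": 0.05,
}

# Assumed cutoff; the abstract does not specify one.
selected = select_features(scores, threshold=0.10)

# One made-up record, reduced to the selected feature space.
records = [
    {"sttl": 254.0, "ct_state_ttl": 2.0, "sbytes": 496.0, "dur": 0.12, "sloss": 0.0},
]
reduced = project(records, selected)

print(selected)
print(reduced)
```

The reduced rows would then be passed to any of the classifiers listed in the abstract (SVM, kNN, LR, ANN, DT). Because this is a filter approach, the ranking is computed once, independently of the downstream classifier.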
Pages: 20