Combining Oversampling with Recurrent Neural Networks for Intrusion Detection

Cited by: 3
Authors
Wang, Jenq-Haur [1]
Septian, Tri Wanda [2]
Affiliations
[1] Natl Taipei Univ Technol, Taipei, Taiwan
[2] Sriwijaya Univ, Palembang, Indonesia
Source
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS: DASFAA 2021 INTERNATIONAL WORKSHOPS | 2021 / Vol. 12680
Keywords
Class imbalance; Oversampling; Feature selection; Long short-term memory; Gated recurrent unit;
DOI
10.1007/978-3-030-73216-5_21
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Previous studies on intrusion detection focus on analyzing features from existing datasets. Because attack types are varied and change quickly, detection must adapt to new features to remain effective. Since real network traffic is highly imbalanced, it is essential to train classifiers that can handle rare cases. In this paper, we propose combining oversampling techniques with deep learning methods for intrusion detection in imbalanced network traffic. First, after preprocessing with data cleaning and normalization, we use feature importance weights generated from ensemble decision trees to select important features. Then, the Synthetic Minority Oversampling Technique (SMOTE) is used to create synthetic samples from the minority class. Finally, we use Recurrent Neural Networks (RNNs), including Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), for classification. In our experimental results, oversampling improves intrusion detection performance for both machine learning and deep learning methods. The best performance is obtained on the CIC-IDS2017 dataset using an LSTM classifier, with an F1-score of 98.9%, and on the CSE-CIC-IDS2018 dataset using a GRU, with an F1-score of 98.8%. This shows the potential of the proposed approach for detecting new types of intrusion in imbalanced real network traffic.
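
The oversampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is a random point on the line segment between a minority sample and one of its k nearest minority neighbours. This is not the authors' implementation. The function name `smote`, its parameters, the random choice of base sample, and the toy two-feature data are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a base minority sample at random.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Interpolate: a random point on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data: 3 "attack" flows versus 8 "benign" flows, two features each.
attacks = [[0.9, 0.8], [1.0, 0.7], [0.8, 0.9]]
benign_count = 8
new_attacks = smote(attacks, benign_count - len(attacks), k=2)
balanced_attacks = attacks + new_attacks
```

Because every synthetic point lies between two existing minority samples, the minority class grows to match the majority without duplicating rows. In practice, oversampling is typically applied to the training split only, so that evaluation data contains no synthetic samples.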
Pages: 305-320
Page count: 16