Comparison of Machine Learning and Deep Learning Models for Network Intrusion Detection Systems

Cited by: 48
Authors
Thapa, Niraj [1]
Liu, Zhipeng [2]
Kc, Dukka B. [3]
Gokaraju, Balakrishna [1]
Roy, Kaushik [2]
Affiliations
[1] North Carolina A&T State Univ, Dept Computat Data Sci & Engn, Greensboro, NC 27411 USA
[2] North Carolina A&T State Univ, Dept Comp Sci, Greensboro, NC 27411 USA
[3] Wichita State Univ, Elect Engn & Comp Sci Dept, Wichita, KS 67260 USA
Keywords
network intrusion detection; CIDDS; machine learning; deep learning; KNN; CART; XGBoost; CNN; LSTM; ensemble
DOI
10.3390/fi12100167
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The development of robust anomaly-based network intrusion detection systems, which are preferred over static signal-based network intrusion detection, is vital for cybersecurity. A flexible and dynamic security system is required to tackle new attacks. Current intrusion detection systems (IDSs) struggle to attain both a high detection rate and a low false alarm rate. To address this issue, we propose an IDS built on different machine learning (ML) and deep learning (DL) models, and we present a comparative analysis of these models on the Coburg Intrusion Detection Data Sets (CIDDS). First, we compare different ML- and DL-based models on the CIDDS dataset. Second, we propose an ensemble model that combines the best ML and DL models to achieve high performance metrics. Finally, we benchmark our best models on the CIC-IDS2017 dataset and compare them with state-of-the-art models. While popular IDS datasets such as KDD99 and NSL-KDD fail to represent recent attacks and suffer from network biases, CIDDS, used in this research, comprises labeled flow-based data from a simulated office environment with both updated attacks and normal usage. Furthermore, both accuracy and interpretability must be considered when implementing AI models. Both ML and DL models achieved an accuracy of 99% on the CIDDS dataset with a high detection rate, a low false alarm rate, and relatively low training costs. Feature importance was also studied using the classification and regression tree (CART) model. Our models performed well in 10-fold cross-validation and independent testing. CART and a convolutional neural network (CNN) with embedding achieved slightly better performance on the CIC-IDS2017 dataset than previous models. Together, these results suggest that ML and DL methods are robust and complementary techniques for building an effective network intrusion detection system.
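The two metrics the abstract weighs against each other, detection rate and false alarm rate, can be made concrete with a short sketch. The Python example below is illustrative only and is not taken from the paper. It assumes the conventional definitions (detection rate = TP / (TP + FN), false alarm rate = FP / (FP + TN)) and binary labels where 1 marks an attack flow and 0 a normal flow. The function name `detection_and_false_alarm_rates` and the toy labels are hypothetical.

```python
def detection_and_false_alarm_rates(y_true, y_pred):
    """Return (detection_rate, false_alarm_rate) for binary labels.

    Assumption: 1 marks an attack flow, 0 marks a normal flow.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, left alone

    # Guard against empty classes to avoid division by zero.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_alarm_rate


# Toy labels invented for illustration; they are not results from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
dr, far = detection_and_false_alarm_rates(y_true, y_pred)
print(f"detection rate = {dr:.2f}, false alarm rate = {far:.2f}")
```

An IDS that flags every flow as an attack would reach a detection rate of 1.0 but also a false alarm rate of 1.0, which is why the two figures are reported together.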
Pages: 1-16
Number of pages: 16