Enhancing Security Attacks Analysis using Regularized Machine Learning Techniques

Cited by: 11
Authors
Hagos, Desta Haileselassie [1, 2, 3]
Yazidi, Anis [3]
Kure, Oivind [2, 4]
Engelstad, Paal E. [1, 2, 3]
Affiliations
[1] Univ Oslo, Dept Informat, Oslo, Norway
[2] Univ Grad Ctr UNIK, Kjeller, Norway
[3] Oslo & Akershus Univ Coll, Dept Comp Sci, Oslo, Norway
[4] Norwegian Univ Sci & Technol, Dept Telemat, Trondheim, Norway
Source
2017 IEEE 31ST INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS (AINA) | 2017
Keywords
Machine Learning; Network Intrusion Detection; SVMs; LASSO; Feature Selection; Bayesian; Classification; SELECTION; FEATURES; CLASSIFICATION; REGRESSION;
DOI
10.1109/AINA.2017.19
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
With the growing threat of security attacks, machine learning (ML) has become a popular technique for detecting them. However, most ML approaches are black-box methods whose inner workings are difficult for humans to understand. In network security, understanding the dynamics behind the classification model is crucial to creating safe and human-friendly systems. In this article, we investigate which features are most important for identifying well-known security attacks, using Support Vector Machines (SVMs) and an ℓ1-regularized method, the Least Absolute Shrinkage and Selection Operator (LASSO), for robust regression in both binary and multiclass attack classification. SVMs are a standard ML classification technique that gives reasonably good performance but has drawbacks in terms of interpretability. LASSO, on the other hand, is a regularized regression method that often performs comparably well and has the additional advantage of being easily interpretable. LASSO provides coefficients that indicate how individual features affect the probability that specific security attack classes occur. Hence, we use LASSO for multiclass classification in particular, to better understand which features shared by attacks in a network are the most important. For our analysis, we use the recent NSL-KDD public intrusion detection dataset, in which records are labeled as either anomalous (denial-of-service (DoS), remote-to-local (R2L), user-to-root (U2R), and probe attack classes) or normal. We also present and discuss empirical results and a comparison of the computational performance of the competing methods. We believe that the methodology presented in this paper may strengthen future research in network intrusion detection.
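The way an ℓ1 penalty yields interpretable, sparse coefficients can be illustrated with a small sketch. This is not the authors' implementation: it assumes a binary task, swaps in ℓ1-regularized logistic regression fitted by proximal gradient descent, and uses synthetic data in place of NSL-KDD. All names (`soft_threshold`, `fit_l1_logistic`, `make_toy_data`) and parameter values are hypothetical.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the l1 penalty: shrink a value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, lam=0.1, step=0.5, iters=300):
    """Fit l1-regularized logistic regression by proximal gradient descent.

    lam is the (assumed) penalty strength; the bias term is not penalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # Gradient step on the smooth loss, then shrink each weight.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


def make_toy_data(n=200, seed=0):
    """Synthetic stand-in for labeled traffic: feature 0 is informative,
    features 1 to 4 are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = "attack", 0 = "normal" (toy labels)
        informative = rng.gauss(2.0 * label - 1.0, 1.0)
        noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
        X.append([informative] + noise)
        y.append(label)
    return X, y


X, y = make_toy_data()
weights, bias = fit_l1_logistic(X, y)
# Features whose weight is exactly zero have been dropped by the penalty.
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("weights:", [round(wj, 3) for wj in weights])
print("selected feature indices:", selected)
```

In this sketch the informative feature keeps a large positive weight, while the penalty tends to shrink the noise weights to exactly zero, which is what makes the fitted coefficients readable as a feature ranking. A multiclass variant would fit one such coefficient vector per attack class.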
Pages: 909-918
Number of pages: 10