A machine learning approach for feature selection traffic classification using security analysis

Cited by: 2
Authors
Muhammad Shafiq
Xiangzhan Yu
Ali Kashif Bashir
Hassan Nazeer Chaudhry
Dawei Wang
Affiliations
[1] Harbin Institute of Technology, School of Computer Science and Technology
[2] University of Faroe Islands, Faculty of Science and Technology
[3] Politecnico di Milano, Department of Electronics, Information and Bioengineering
[4] National Computer Network Emergency Response Technical Team/Coordination Center
Source
The Journal of Supercomputing | 2018 / Vol. 74
Keywords
Network traffic classification; Class imbalance; Feature selection; Machine learning; Security
DOI
Not available
Abstract
Class imbalance has become a major problem that leads to inaccurate traffic classification. Accurate classification of traffic flows helps in security monitoring, IP management, intrusion detection, etc. Machine learning (ML) approaches are widely used in the literature to address the traffic classification problem. In this paper, we therefore propose an ML-based hybrid feature selection algorithm named WMI_AUC that makes use of two metrics: the weighted mutual information (WMI) metric and the area under the ROC curve (AUC). These metrics select effective features from a traffic flow. To then select robust features from among the selected features, we propose a robust feature selection algorithm. The proposed approach increases the accuracy of ML classifiers and helps in detecting malicious traffic. We evaluate our work using 11 well-known ML classifiers on datasets of traces from different network environments. Experimental results show that our algorithms achieve more than 95% flow accuracy.
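The abstract names two per-feature metrics, mutual information with the class label and the area under the ROC curve, but does not give the WMI weighting or the rule that combines them. The following is therefore only a minimal illustrative sketch and not the authors' WMI_AUC algorithm: it uses plain (unweighted) mutual information on discrete feature values, a rank-based AUC for a binary label, and a hypothetical linear combination controlled by an assumed parameter `alpha`. The function names and the combination rule are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def auc(scores, labels):
    """AUC of one feature used directly as a score for the positive class (label 1).

    Computed as the probability that a random positive example scores
    higher than a random negative one; ties count as one half.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def rank_features(columns, labels, alpha=0.5):
    """Rank features by a hypothetical blend of normalized MI and AUC separation.

    NOTE: this blend is an assumption for illustration, not the paper's rule.
    """
    # MI of the labels with themselves equals their entropy; used to scale MI into [0, 1].
    h_y = mutual_information(labels, labels)
    scores = {}
    for name, col in columns.items():
        mi = mutual_information(col, labels) / h_y if h_y > 0 else 0.0
        separation = 2 * abs(auc(col, labels) - 0.5)  # 0 = uninformative, 1 = perfect
        scores[name] = alpha * mi + (1 - alpha) * separation
    return sorted(scores, key=scores.get, reverse=True)
```

Calling `rank_features` with a dictionary that maps feature names to columns of discretized flow statistics, plus a list of 0/1 labels, returns the feature names ordered from most to least informative under this assumed score. Continuous features would need to be binned before the mutual information step.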
Pages: 4867-4892
Number of pages: 25