Evaluation of feature selection on network traffic classification

Cited by: 0
Authors
Wang, Yun [1]
Wang, Pan [1]
Wang, ZiXuan [1]
Wu, KaiLin [1]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Sch Modern Posts, Nanjing, Peoples R China
Source
2021 IEEE INTL CONF ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, INTL CONF ON PERVASIVE INTELLIGENCE AND COMPUTING, INTL CONF ON CLOUD AND BIG DATA COMPUTING, INTL CONF ON CYBER SCIENCE AND TECHNOLOGY CONGRESS DASC/PICOM/CBDCOM/CYBERSCITECH 2021 | 2021
Keywords
malicious traffic classification; feature selection; deep learning; convolutional neural network; random forest; Information Gain; RFE
DOI
10.1109/DASC-PICom-CBDCom-CyberSciTech52372.2021.00135
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Malicious traffic classification has become a challenge in modern communications, and a trained model must be able to distinguish malicious traffic reliably. As machine learning and deep learning have been applied to traffic classification, classification accuracy has become high. Feature selection can make models lighter and improve classification performance by selecting an optimal feature subset, so choosing effective features is an important issue for malicious traffic classification. In this article, we propose applying two feature selection methods, Information Gain and recursive feature elimination (RFE), to malicious traffic classification. The essence is to select an effective, optimal feature subset from a large number of features to characterize network traffic. We then evaluate the approach with a deep learning method, the convolutional neural network (CNN), and a machine learning method, random forest (RF), on three real network traffic datasets: CICIDS2017, NSL-KDD, and UNSW-NB15. The experiments show that the combination of CNN and Information Gain performs best, and repeated experiments show that traffic classification performance improves greatly after feature selection.
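The Information Gain step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`entropy`, `information_gain`, `select_top_k`), the three feature names, and the eight toy flow records are all assumptions made up for illustration, and the features are treated as discrete. Each feature X is scored by IG(Y; X) = H(Y) - H(Y | X) against the class label Y, and the k highest-scoring features are kept.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature X."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    # H(Y | X): entropy of the labels within each feature value, weighted by frequency
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(rows, labels, feature_names, k):
    """Score every feature by information gain and keep the k highest."""
    scores = {
        name: information_gain([row[i] for row in rows], labels)
        for i, name in enumerate(feature_names)
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy flow records (not taken from CICIDS2017, NSL-KDD, or UNSW-NB15)
FEATURES = ["protocol", "port_class", "size_bucket"]
ROWS = [
    ("tcp", "high", "small"),
    ("tcp", "high", "large"),
    ("udp", "high", "small"),
    ("udp", "high", "large"),
    ("tcp", "low", "small"),
    ("tcp", "low", "large"),
    ("udp", "low", "small"),
    ("udp", "low", "large"),
]
LABELS = ["malicious"] * 4 + ["benign"] * 4

selected, scores = select_top_k(ROWS, LABELS, FEATURES, k=1)
print(selected, scores)
```

In this toy data only `port_class` separates the two classes, so it is the single feature kept. Continuous traffic features would need to be discretized (or scored with a mutual-information estimator) before a ranking like this applies, and the reduced feature set would then be passed to a classifier such as the CNN or RF models the abstract mentions.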
Pages: 813-818
Page count: 6