Feature selection for optimizing traffic classification

Cited by: 96
Authors
Zhang, Hongli [1]
Lu, Gang [1]
Qassrawi, Mahmoud T. [1]
Zhang, Yu [1]
Yu, Xiangzhan [1]
Affiliation
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature selection; Traffic classification; Class imbalance; Robust features; IDENTIFICATION;
DOI
10.1016/j.comcom.2012.04.012
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Machine learning (ML) algorithms have been widely applied to traffic classification in recent years. However, because the number of flows per traffic type is imbalanced, ML-based classifiers are prone to misclassifying flows as the traffic type that accounts for the majority of flows on the Internet. To address this problem, we propose a novel feature selection metric named Weighted Symmetrical Uncertainty (WSU). We design a hybrid feature selection algorithm named WSU_AUC, which prefilters most features with the WSU metric and then uses a wrapper method to select features for a specific classifier with the Area Under the ROC Curve (AUC) metric. Additionally, to overcome the impact of dynamic traffic flows on feature selection, we propose an algorithm named SRSF that Selects the Robust and Stable Features from the results achieved by WSU_AUC. We evaluate our approaches using three classifiers on traces captured from entirely different networks. Experimental results obtained by our algorithms are promising in terms of true positive rate (TPR) and false positive rate (FPR). Moreover, our algorithms achieve >94% flow accuracy and >80% byte accuracy on average. (c) 2012 Elsevier B.V. All rights reserved.
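The filter stage described in the abstract can be illustrated with a minimal sketch. The abstract does not give the class weighting used by WSU, so the code below computes only the standard, unweighted symmetrical uncertainty, SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), for discrete features. The function names, the `prefilter` helper, and its `threshold` parameter are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(feature, labels):
    """Standard (unweighted) SU between a discrete feature and class labels.

    Assumption: this is plain SU, not the paper's WSU, whose weighting
    scheme is not described in the abstract.
    """
    h_x = entropy(feature)
    h_y = entropy(labels)
    if h_x + h_y == 0:
        # Both variables are constant, so there is no information to share.
        return 0.0
    h_xy = entropy(list(zip(feature, labels)))  # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy              # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def prefilter(features, labels, threshold=0.1):
    """Keep names of features whose SU with the labels reaches the threshold.

    `features` maps a feature name to its list of discrete values.
    The threshold value is a hypothetical default chosen for illustration.
    """
    return [
        name
        for name, column in features.items()
        if symmetrical_uncertainty(column, labels) >= threshold
    ]
```

In a hybrid scheme of the kind the abstract outlines, the features that survive such a filter would then be passed to a wrapper that scores candidate subsets by the AUC of a specific classifier; that stage is not sketched here.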
Pages: 1457-1471
Number of pages: 15