Feature selection for optimizing traffic classification

Cited by: 96
Authors
Zhang, Hongli [1]
Lu, Gang [1]
Qassrawi, Mahmoud T. [1]
Zhang, Yu [1]
Yu, Xiangzhan [1]
Affiliations
[1] Harbin Institute of Technology, School of Computer Science and Technology, Harbin 150001, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Feature selection; Traffic classification; Class imbalance; Robust features; Identification
DOI
10.1016/j.comcom.2012.04.012
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Machine learning (ML) algorithms have been widely applied in recent traffic classification work. However, because the number of flows is imbalanced across traffic types, ML-based classifiers are prone to misclassifying flows as the traffic type that accounts for the majority of flows on the Internet. To address this problem, a novel feature selection metric named Weighted Symmetrical Uncertainty (WSU) is proposed. We design a hybrid feature selection algorithm named WSU_AUC, which prefilters most features with the WSU metric and then uses a wrapper method to select features for a specific classifier with the Area Under the ROC Curve (AUC) metric. Additionally, to overcome the impact of dynamic traffic flows on feature selection, we propose an algorithm named SRSF that Selects the Robust and Stable Features from the results achieved by WSU_AUC. We evaluate our approaches using three classifiers on traces captured from entirely different networks. Experimental results obtained by our algorithms are promising in terms of true positive rate (TPR) and false positive rate (FPR). Moreover, our algorithms achieve >94% flow accuracy and >80% byte accuracy on average. © 2012 Elsevier B.V. All rights reserved.
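The abstract names the WSU metric but does not define it. As a rough illustration of the filter stage only, the sketch below computes the standard, unweighted symmetrical uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), between a discrete feature and the class label, and keeps the features that score above a threshold. The class-imbalance weighting of WSU, the AUC-based wrapper stage, and SRSF are not reproduced here. The function names, the toy flows, and the threshold value are assumptions made for this example and do not come from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(feature, labels):
    """Standard (unweighted) SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y))."""
    h_x = entropy(feature)
    h_y = entropy(labels)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant; no information to share
    h_xy = entropy(list(zip(feature, labels)))  # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy
    return 2.0 * mutual_info / (h_x + h_y)


def prefilter(features, labels, threshold):
    """Hypothetical filter step: keep feature names whose SU exceeds threshold."""
    return [name for name, column in features.items()
            if symmetrical_uncertainty(column, labels) > threshold]


# Toy, imbalanced flow set (invented for illustration): 6 web flows, 2 p2p flows.
labels = ["web"] * 6 + ["p2p"] * 2
features = {
    "port_bucket": [0] * 6 + [1] * 2,       # perfectly aligned with the class
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],      # statistically independent of the class
}

for name, column in features.items():
    print(name, round(symmetrical_uncertainty(column, labels), 3))
print(prefilter(features, labels, threshold=0.1))
```

In a hybrid scheme of the kind the abstract describes, the features that survive this filter would then be passed to a wrapper search scored by a classifier's AUC; that second stage is omitted here.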
Pages: 1457-1471
Number of pages: 15
Related papers
50 in total
  • [41] Feature Selection in Text Classification
    Sahin, Durmus Ozkan
    Ates, Nurullah
    Kilic, Erdal
    2016 24TH SIGNAL PROCESSING AND COMMUNICATION APPLICATION CONFERENCE (SIU), 2016, : 1777 - 1780
  • [42] Sequential Feature Selection for Classification
    Rueckstiess, Thomas
    Osendorfer, Christian
    van der Smagt, Patrick
    AI 2011: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2011, 7106 : 132 - +
  • [43] A new feature selection approach for optimizing prediction models, applied to breast cancer subtype classification
    Pham Quang Huy
    Ngom, Alioune
    Rueda, Luis
    2016 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2016, : 1535 - 1541
  • [44] Optimizing cancer classification: a hybrid RDO-XGBoost approach for feature selection and predictive insights
    Yaqoob, Abrar
    Verma, Navneet Kumar
    Aziz, Rabia Musheer
    Shah, Mohd Asif
    CANCER IMMUNOLOGY IMMUNOTHERAPY, 2024, 73 (12)
  • [45] Waterfall Traffic Identification: Optimizing Classification Cascades
    Foremski, Pawel
    Callegari, Christian
    Pagano, Michele
    COMPUTER NETWORKS, CN 2015, 2015, 522 : 1 - 10
  • [46] Optimizing feature selection to improve medical diagnosis
    Fan, Ya-Ju
    Chaovalitwongse, Wanpracha Art
    ANNALS OF OPERATIONS RESEARCH, 2010, 174 (01) : 169 - 183
  • [47] Optimizing feature selection to improve medical diagnosis
    Ya-Ju Fan
    Wanpracha Art Chaovalitwongse
    Annals of Operations Research, 2010, 174 : 169 - 183
  • [48] An optimal and stable feature selection approach for traffic classification based on multi-criterion fusion
    Fahad, Adil
    Tari, Zahir
    Khalil, Ibrahim
    Almalawi, Abdulmohsen
    Zomaya, Albert Y.
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2014, 36 : 156 - 169
  • [49] FEATURE SELECTION BASED ON COMPLEMENTARITY OF FEATURE CLASSIFICATION CAPABILITY
    Gao, Fei
    Yu, Tian
    Wei, Yang
    Jin, Han
    Wei, Jin-Mao
    PROCEEDINGS OF 2013 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOLS 1-4, 2013, : 130 - 135
  • [50] Unveiling traffic paths: Explainable path signature feature-based encrypted traffic classification
    Xu, Shi-Jie
    Kong, Kai-Chuan
    Jin, Xiao-Bo
    Geng, Guang-Gang
    COMPUTERS & SECURITY, 2025, 150