Ensemble network traffic classification: Algorithm comparison and novel ensemble scheme proposal

Cited by: 41
Authors
Egea Gomez, Santiago [1]
Carro Martinez, Belen [1]
Sanchez-Esguevillas, Antonio J. [1]
Hernandez Callejo, Luis [1]
Affiliations
[1] Univ Valladolid, Escuela Tecn Super Ingn Telecomunicac, Campus Miguel Delibes, E-47011 Valladolid, Spain
Keywords
MACHINE LEARNING ALGORITHMS; NEURAL-NETWORKS
DOI
10.1016/j.comnet.2017.07.018
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Network Traffic Classification (NTC) is a key component of network monitoring, Quality-of-Service management and network security. Machine Learning algorithms have drawn the attention of many researchers in recent years as a promising solution for network traffic classification. In Machine Learning, ensemble algorithms are classifiers formed by a set of base estimators that cooperate to build more complex models according to given training and classification strategies. The resulting models normally exhibit significant accuracy improvements over single estimators, but they also incur extra time cost, which may obstruct the application of these methods to online NTC. This paper studies and compares the performance of seven popular ensemble algorithms based on Decision Trees, focusing on model accuracy, byte accuracy, and latency, to determine whether ensemble learning can be properly applied to this modeling task. We show that some of the studied algorithms outperform a single Decision Tree in terms of model accuracy and byte accuracy. However, the notable latency increase hinders the application of these methods in real-time contexts. Additionally, we introduce a novel ensemble classifier that exploits the imbalanced populations present in network traffic datasets to achieve faster classifications. The experimental results show that our scheme retains the accuracy improvements of ensemble methods with only a low latency penalty, enhancing the prospects of ensemble methods for online network traffic classification. © 2017 Elsevier B.V. All rights reserved.
Pages: 68-80
Page count: 13
Related papers
40 in total
  • [1] [Anonymous], 2006, INT WORKSH FRONT HAN
  • [2] Bayesian neural networks for Internet traffic classification
    Auld, Tom
    Moore, Andrew W.
    Gull, Stephen F.
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 2007, 18 (01) : 223 - 239
  • [3] A comparison of decision tree ensemble creation techniques
    Banfield, Robert E.
    Hall, Lawrence O.
    Bowyer, Kevin W.
    Kegelmeyer, W. P.
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2007, 29 (01) : 173 - 180
  • [4] Strategies for learning in class imbalance problems
    Barandela, R
    Sánchez, JS
    García, V
    Rangel, E
    [J]. PATTERN RECOGNITION, 2003, 36 (03) : 849 - 851
  • [5] An empirical comparison of voting classification algorithms: Bagging, boosting, and variants
    Bauer, E
    Kohavi, R
    [J]. MACHINE LEARNING, 1999, 36 (1-2) : 105 - 139
  • [6] Bernaille L., 2006, C FUTURE NETWORKING, P6
  • [7] Traffic classification on the fly
    Bernaille, Laurent
    Teixeira, Renata
    Akodkenou, Ismael
    Soule, Augustin
    Salamatian, Kave
    [J]. ACM SIGCOMM COMPUTER COMMUNICATION REVIEW, 2006, 36 (02) : 23 - 26
  • [8] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
  • [9] Independent comparison of popular DPI tools for traffic classification
    Bujlow, Tomasz
    Carela-Espanol, Valentin
    Barlet-Ros, Pere
    [J]. COMPUTER NETWORKS, 2015, 76 : 75 - 89
  • [10] Better network traffic identification through the independent combination of techniques
    Callado, Arthur
    Kelner, Judith
    Sadok, Djamel
    Kamienski, Carlos Alberto
    Fernandes, Stenio
    [J]. JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2010, 33 (04) : 433 - 446