Ensemble network traffic classification: Algorithm comparison and novel ensemble scheme proposal

Cited by: 41

Authors:

Egea Gomez, Santiago [1]

Carro Martinez, Belen [1]

Sanchez-Esguevillas, Antonio J. [1]

Hernandez Callejo, Luis [1]

Affiliations:

[1] Univ Valladolid, Escuela Tecn Super Ingn Telecomunicac, Campus Miguel Delibes, E-47011 Valladolid, Spain

Source:

COMPUTER NETWORKS | 2017, Vol. 127

Keywords:

MACHINE LEARNING ALGORITHMS; NEURAL-NETWORKS;

DOI:

10.1016/j.comnet.2017.07.018

Chinese Library Classification (CLC):

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Network Traffic Classification (NTC) is a key component of network monitoring, Quality-of-Service management and network security. Machine Learning algorithms have drawn the attention of many researchers during the last few years as a promising solution for network traffic classification. In Machine Learning, ensemble algorithms are classifiers formed by a set of base estimators that cooperate to build more complex models according to given training and classification strategies. The resulting models normally exhibit significant accuracy improvements compared to single estimators, but also incur extra time cost, which may obstruct the application of these methods to online NTC. This paper studies and compares the performance of seven popular ensemble algorithms based on Decision Trees, focusing on model accuracy, byte accuracy, and latency, to determine whether ensemble learning can be properly applied to this modeling task. We show that some of the studied algorithms outperform a single Decision Tree in terms of model accuracy and byte accuracy. However, the notable latency increase hinders the application of these methods in real-time contexts. Additionally, we introduce a novel ensemble classifier that exploits the imbalanced class populations present in network traffic datasets to achieve faster classification. The experimental results show that our scheme retains the accuracy improvements of ensemble methods with only a small latency penalty, enhancing the prospects of ensemble methods for online network traffic classification. © 2017 Elsevier B.V. All rights reserved.
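The abstract names model accuracy and byte accuracy as evaluation metrics but does not define them. The sketch below assumes the definitions commonly used in traffic classification: model (flow) accuracy is the fraction of flows labeled correctly, and byte accuracy is the fraction of bytes carried by correctly labeled flows. It also shows a plain plurality vote over base estimators as one simple ensemble strategy. This is not the scheme proposed in the paper; the function names, labels, and toy data are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-estimator label lists into one label per flow.

    predictions: list of lists, one inner list per base estimator.
    Ties go to the label seen first (Counter keeps insertion order).
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def flow_accuracy(y_true, y_pred):
    """Fraction of flows whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def byte_accuracy(y_true, y_pred, flow_bytes):
    """Fraction of total bytes that belong to correctly labeled flows."""
    correct_bytes = sum(
        b for t, p, b in zip(y_true, y_pred, flow_bytes) if t == p
    )
    return correct_bytes / sum(flow_bytes)


# Hypothetical toy data: five flows with very unequal sizes.
y_true = ["web", "web", "p2p", "video", "web"]
flow_bytes = [1_000, 2_000, 50_000, 40_000, 7_000]

# Hypothetical outputs of three base estimators (e.g. decision trees).
base_predictions = [
    ["web", "web", "web", "video", "web"],  # estimator A misses the large p2p flow
    ["web", "p2p", "p2p", "video", "web"],  # estimator B misses a small web flow
    ["web", "web", "p2p", "web", "web"],    # estimator C misses the video flow
]

ensemble_pred = majority_vote(base_predictions)

single_flow_acc = flow_accuracy(y_true, base_predictions[0])
single_byte_acc = byte_accuracy(y_true, base_predictions[0], flow_bytes)
ensemble_flow_acc = flow_accuracy(y_true, ensemble_pred)
ensemble_byte_acc = byte_accuracy(y_true, ensemble_pred, flow_bytes)
```

In this toy data estimator A labels four of five flows correctly, yet the one flow it misses carries half of all bytes, so the two metrics can diverge sharply on imbalanced traffic. Evaluating every base estimator for every flow is also where an ensemble's extra latency would come from.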

Pages: 68-80

Number of pages: 13
