Active Learning Framework For Long-term Network Traffic Classification

Cited by: 1
Authors
Pesek, Jaroslav [1]
Soukup, Dominik [1]
Cejka, Tomas [2]
Affiliations
[1] Czech Tech Univ, Thakurova 9, Prague, Czech Republic
[2] CESNET, Zikova 4, Prague, Czech Republic
Source
2023 IEEE 13TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE, CCWC | 2023
Keywords
Active Learning; Dataset Quality; Network traffic analysis
DOI
10.1109/CCWC57344.2023.10099065
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent network traffic classification methods benefit from machine learning (ML). However, the use of ML brings many challenges, such as a lack of high-quality annotated datasets, data drift and other effects that cause datasets and ML models to age, and high volumes of network traffic. This paper presents the benefits of augmenting traditional ML training and deployment workflows by adapting the Active Learning (AL) concept to network traffic analysis. The paper proposes a novel Active Learning Framework (ALF) to address this topic. ALF provides prepared software components that can be used to deploy an AL loop and maintain an ALF instance that automatically and continuously evolves a dataset and ML model. Moreover, ALF includes monitoring, dataset quality evaluation, and optimization capabilities that enhance the current state of the art in the AL domain. The resulting solution is deployable for IP flow-based analysis of high-speed (100 Gb/s) networks, where it was evaluated for more than eight months. Additional use cases were evaluated on publicly available datasets.
Pages: 893-899
Number of pages: 7