A new platform for machine-learning-based network traffic classification

Cited by: 11
Authors
Bozkir, Ramazan [1]
Cicioglu, Murtaza [2]
Calhan, Ali [3]
Togay, Cengiz [4]
Affiliations
[1] AKSA Informat Technol, IT Dept, Bursa, Turkiye
[2] Bursa Uludag Univ, Comp Engn Dept, Bursa, Turkiye
[3] Duzce Univ, Comp Engn Dept, Duzce, Turkiye
[4] Andasis Elect Ind & Trade Inc, IT Dept, Istanbul, Turkiye
Keywords
Network traffic classification; Machine learning; Feature extraction; INTERNET; ENGINE; DEEP
DOI
10.1016/j.comcom.2023.05.010
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This study provides a new platform for classifying encrypted network traffic based on machine learning (ML) techniques. The platform's architecture is designed for real-world network traffic classification problems and is built on performance-oriented, practical, and up-to-date software technologies. In addition, this study introduces a new feature extraction method to the literature. The proposed platform applies ML techniques to flow-based statistical features of encrypted network traffic together with the new feature extraction method. It takes network traffic packets as input and passes them through feature extraction, data preparation, and ML stages. In the feature extraction stage, the NFStream tool extracts network flows from the traffic data and computes their features. In the data preparation stage, the Apache Spark framework transforms the dataset into a form that the ML algorithms can process. This stage also includes feature selection. The ML stage runs the GBTree, LightGBM, and XGBoost algorithms. Moreover, we use the MLflow framework for process management, to observe the ML lifecycle, including experimentation, reproducibility, and deployment. The experimental results show that the XGBoost algorithm achieves the best result, with an F1 score above 99%.
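As a rough illustration of what "flow-based statistical features" means, the sketch below groups packets into bidirectional flows and computes a few per-flow statistics in plain Python. It is not the paper's feature extraction method and does not use the NFStream API; the packet tuples, the `flow_key` and `extract_flow_features` helpers, and the four chosen statistics are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical packet records (assumed layout, not a real capture):
# (timestamp_s, src_ip, dst_ip, src_port, dst_port, protocol, size_bytes)
PACKETS = [
    (0.00, "10.0.0.1", "10.0.0.2", 50000, 443, "TCP", 100),
    (0.10, "10.0.0.2", "10.0.0.1", 443, 50000, "TCP", 1400),
    (0.25, "10.0.0.1", "10.0.0.2", 50000, 443, "TCP", 60),
    (0.05, "10.0.0.3", "10.0.0.4", 40000, 53, "UDP", 80),
]


def flow_key(src_ip, dst_ip, src_port, dst_port, protocol):
    """Build a direction-independent key so both directions share one flow."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    return (min(a, b), max(a, b), protocol)


def extract_flow_features(packets):
    """Group packets by flow and compute simple per-flow statistics."""
    flows = defaultdict(list)
    for ts, src_ip, dst_ip, src_port, dst_port, proto, size in packets:
        key = flow_key(src_ip, dst_ip, src_port, dst_port, proto)
        flows[key].append((ts, size))

    features = {}
    for key, pkts in flows.items():
        times = [t for t, _ in pkts]
        sizes = [s for _, s in pkts]
        features[key] = {
            "packets": len(pkts),                    # packet count
            "bytes": sum(sizes),                     # total bytes
            "mean_size": mean(sizes),                # mean packet size
            "duration_s": max(times) - min(times),   # flow duration
        }
    return features


if __name__ == "__main__":
    for key, stats in extract_flow_features(PACKETS).items():
        print(key, stats)
```

Statistics of this kind depend only on packet sizes and timing, not on payload contents, which is why they remain available when the traffic is encrypted. Each resulting row could then serve as one input sample for a classifier.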
Pages: 1-14
Number of pages: 14