A Real-Time Network Traffic Classifier for Online Applications Using Machine Learning

Cited by: 14
Authors
Ahmed, Ahmed Abdelmoamen [1]
Agunsoye, Gbenga [1]
Affiliations
[1] Prairie View A&M Univ, Dept Comp Sci, Prairie View, TX 77446 USA
Funding
National Science Foundation (USA);
Keywords
real-time; traffic classifier; network flow; machine learning; KNN; RF; ANN;
DOI
10.3390/a14080250
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The growing volume of network traffic and the deployment of new online applications have increased the complexity of traffic analysis. Traditionally, network administrators have classified the traffic flowing through their networks by recognizing well-known static ports. However, modern network traffic uses dynamic ports and is transported over secure application-layer protocols (e.g., HTTPS, SSL, and SSH), which makes it difficult for administrators to identify online applications with traditional port-based approaches. One way to classify modern network traffic is to use machine learning (ML) to distinguish between traffic attributes such as packet count and size, packet inter-arrival time, and packet send-receive ratio. This paper presents the design and implementation of NetScrapper, a flow-based network traffic classifier for online applications. NetScrapper uses three ML models, namely K-Nearest Neighbors (KNN), Random Forest (RF), and Artificial Neural Network (ANN), to classify 53 of the most popular online applications, including Amazon, YouTube, Google, Twitter, and many others. We collected a network traffic dataset containing 3,577,296 packet flows with 87 different features for training, validating, and testing the ML models. A user-friendly web-based interface enables users either to upload a snapshot of their network traffic to NetScrapper or to sniff the traffic directly from the network interface card in real time. Additionally, we created a middleware pipeline for interfacing the three models with the Flask GUI. Finally, we evaluated NetScrapper using performance metrics such as classification accuracy and prediction time. Most notably, we found that our ANN model achieves an overall classification accuracy of 99.86% in recognizing the online applications in our dataset.
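The flow-based idea in the abstract can be illustrated with a minimal sketch. It is not NetScrapper's implementation: the four features are only the ones the abstract names (packet count, packet size, inter-arrival time, send-receive ratio), the function names `flow_features` and `knn_predict` are hypothetical, the training rows are synthetic toy values, and the labels "streaming" and "messaging" are placeholders rather than any of the 53 applications. Features are left unscaled for brevity; a real KNN pipeline would normally normalize them first.

```python
from collections import Counter
from math import dist
from statistics import mean


def flow_features(packets):
    """Summarize one flow as a 4-feature vector.

    packets: list of (timestamp_seconds, size_bytes, direction),
    where direction is "out" (sent) or "in" (received).
    """
    times = [t for t, _, _ in packets]
    sizes = [s for _, s, _ in packets]
    gaps = [b - a for a, b in zip(times, times[1:])]
    sent = sum(1 for _, _, d in packets if d == "out")
    received = len(packets) - sent
    return (
        float(len(packets)),           # packet count
        mean(sizes),                   # mean packet size in bytes
        mean(gaps) if gaps else 0.0,   # mean inter-arrival time in seconds
        sent / max(received, 1),       # send-receive ratio
    )


def knn_predict(train, query, k=3):
    """Majority vote among the k training flows closest to query (Euclidean)."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Synthetic toy training set (assumed values, not taken from the paper's dataset).
TRAIN = [
    ((100.0, 1200.0, 0.010, 0.20), "streaming"),
    ((120.0, 1300.0, 0.012, 0.25), "streaming"),
    ((110.0, 1250.0, 0.011, 0.22), "streaming"),
    ((10.0, 150.0, 0.50, 1.00), "messaging"),
    ((12.0, 160.0, 0.60, 0.90), "messaging"),
    ((8.0, 140.0, 0.40, 1.10), "messaging"),
]

if __name__ == "__main__":
    flow = [(0.0, 100, "out"), (0.5, 200, "in"), (1.0, 150, "out"), (1.5, 150, "in")]
    print(knn_predict(TRAIN, flow_features(flow)))
```

Because no port number appears among the features, a classifier of this kind is unaffected by dynamic port assignment, which is the limitation of port-based approaches that the abstract describes.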
Pages: 20