A Big Data-Enabled Hierarchical Framework for Traffic Classification

Cited by: 23
Authors
Bovenzi, Giampaolo [1]
Aceto, Giuseppe [2]
Ciuonzo, Domenico [2]
Persico, Valerio [1]
Pescape, Antonio [3]
Affiliations
[1] Univ Napoli Federico II, DIETI, I-80125 Naples, Italy
[2] Univ Napoli Federico II, I-80125 Naples, Italy
[3] Univ Napoli Federico II, Comp Engn, I-80125 Naples, Italy
Source
IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING | 2020 / Vol. 7 / Issue 04
Keywords
Parallel processing; Data models; Internet; Support vector machines; Tools; Task analysis; Computer crime; Big data; dark web; encrypted traffic; hierarchical classification; traffic classification;
DOI
10.1109/TNSE.2020.3009832
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
According to the critical requirements of the Internet, a wide range of privacy-preserving technologies is available, e.g., proxy sites, virtual private networks, and anonymity tools. Such mechanisms are challenged by traffic-classification endeavors, which are crucial for network-management tasks and have recently become a milestone in assessing the degree of privacy these mechanisms provide, from both attacker and designer standpoints. Further, the new Internet era is characterized by the capillary distribution of smart devices leveraging high-capacity communication infrastructures: this results in a huge amount of heterogeneous network traffic, i.e., big data. Hence, herein we present BDeH, a novel hierarchical framework for traffic classification of anonymity tools. BDeH is enabled by the big-data paradigm and capitalizes on the machine-learning workhorse to operate on encrypted traffic. In detail, our proposal allows for seamless integration of the data parallelism provided by big-data technologies with the model parallelism enabled by hierarchical approaches. Results show that the resulting double parallelism has no negative impact on traffic-classification effectiveness at any granularity level and achieves non-negligible performance enhancements with respect to non-hierarchical architectures (+4.5% F-measure). Also, it gains significantly over either pure data or pure model parallelism (resp. centralized) approaches by reducing both training completion time, by up to 78% (resp. 90%), and cloud-deployment cost, by up to 31% (resp. 10%).
Pages: 2608-2619
Page count: 12
Related papers
31 in total
[1]  
Aceto G., 2019, P IEEE ACM NETW TRAF
[2]  
Aceto G, 2019, 2019 4TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATIONS AND SECURITY (ICCCS)
[3]   Internet Censorship detection: A survey [J].
Aceto, Giuseppe ;
Pescape, Antonio .
COMPUTER NETWORKS, 2015, 83 :381-421
[4]   DDoS Detection System: Using a Set of Classification Algorithms Controlled by Fuzzy Logic System in Apache Spark [J].
Alsirhani, Amjad ;
Sampalli, Srinivas ;
Bodorik, Peter .
IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, 2019, 16 (03) :936-949
[5]  
[Anonymous], 2010, KSII T INTERNET INF
[6]   The Dagstuhl Beginners Guide to Reproducibility for Experimental Networking Research [J].
Bajpai, Vaibhav ;
Brunstrom, Anna ;
Feldmann, Anja ;
Kellerer, Wolfgang ;
Pras, Aiko ;
Schulzrinne, Henning ;
Smaragdakis, Georgios ;
Waehlisch, Matthias ;
Wehrle, Klaus .
ACM SIGCOMM COMPUTER COMMUNICATION REVIEW, 2019, 49 (01) :24-30
[7]   Worm traffic analysis and characterization [J].
Dainotti, Alberto ;
Pescape, Antonio ;
Ventre, Giorgio .
2007 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, VOLS 1-14, 2007, :1435-1442
[8]   Issues and Future Directions in Traffic Classification [J].
Dainotti, Alberto ;
Pescape, Antonio ;
Claffy, Kimberly C. .
IEEE NETWORK, 2012, 26 (01) :35-40
[9]   Scalable Network Traffic Classification Using Distributed Support Vector Machines [J].
Do Le Quoc ;
D'Alessandro, Valerio ;
Park, Byungchul ;
Romano, Luigi ;
Fetzer, Christof .
2015 IEEE 8TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, 2015, :1008-1012
[10]   Novel feature selection and classification of Internet video traffic based on a hierarchical scheme [J].
Dong, Yu-ning ;
Zhao, Jia-jie ;
Jin, Jiong .
COMPUTER NETWORKS, 2017, 119 :102-111