A Big Data-Enabled Hierarchical Framework for Traffic Classification

Cited by: 21

Authors:

Bovenzi, Giampaolo [1]

Aceto, Giuseppe [2]

Ciuonzo, Domenico [2]

Persico, Valerio [1]

Pescape, Antonio [3]

Affiliations:

[1] Univ Napoli Federico II, DIETI, I-80125 Naples, Italy

[2] Univ Napoli Federico II, I-80125 Naples, Italy

[3] Univ Napoli Federico II, Comp Engn, I-80125 Naples, Italy

Source:

IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING | 2020 / Vol. 7 / Issue 04

Keywords:

Parallel processing; Data models; Internet; Support vector machines; Tools; Task analysis; Computer crime; Big data; dark web; encrypted traffic; hierarchical classification; traffic classification;

DOI:

10.1109/TNSE.2020.3009832

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Discipline classification code:

08;

Abstract:

In line with the critical requirements of the Internet, a wide range of privacy-preserving technologies is available, e.g., proxy sites, virtual private networks, and anonymity tools. Such mechanisms are challenged by traffic-classification endeavors, which are crucial for network-management tasks and have recently become a milestone in assessing the degree of privacy these mechanisms provide, from both attacker and designer standpoints. Further, the new Internet era is characterized by the capillary distribution of smart devices leveraging high-capacity communication infrastructures: this results in a huge amount of heterogeneous network traffic, i.e., big data. Hence, herein we present BDeH, a novel hierarchical framework for traffic classification of anonymity tools. BDeH is enabled by the big-data paradigm and capitalizes on the machine-learning workhorse for operating with encrypted traffic. In detail, our proposal allows for seamless integration of the data parallelism provided by big-data technologies with the model parallelism enabled by hierarchical approaches. Results show that the double parallelism so achieved has no negative impact on traffic-classification effectiveness at any granularity level and achieves non-negligible performance enhancements with respect to non-hierarchical architectures (+4.5% F-measure). Also, it gains significantly over either pure data or pure model parallelism (resp. centralized) approaches by reducing both training completion time, by up to 78% (resp. 90%), and cloud-deployment cost, by up to 31% (resp. 10%).
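The model parallelism that the abstract attributes to hierarchical approaches can be illustrated with a minimal sketch. This is not the authors' BDeH implementation: the nearest-centroid classifier is a stand-in for whatever learners the paper uses, the two-level hierarchy (tool, then application), the labels `toolA`/`appA1` and the toy feature vectors are all hypothetical, and the data-parallel part (big-data technologies) is not shown. The only point is that each node of the hierarchy owns an independent classifier, so the nodes can be trained concurrently.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class NearestCentroid:
    """Toy stand-in classifier: predicts the label of the closest class centroid."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[label] += 1
        self.centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
        return self

    def predict_one(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=sq_dist)


def train_hierarchy(X, coarse, fine):
    """Fit one root model on coarse labels and one child model per coarse class."""
    child_jobs = {}
    for c in set(coarse):
        idx = [i for i, lab in enumerate(coarse) if lab == c]
        child_jobs[c] = ([X[i] for i in idx], [fine[i] for i in idx])
    # The node classifiers do not depend on each other, so they are submitted
    # together and fitted concurrently (model parallelism).
    with ThreadPoolExecutor() as pool:
        root_future = pool.submit(NearestCentroid().fit, X, coarse)
        child_futures = {c: pool.submit(NearestCentroid().fit, Xc, yc)
                         for c, (Xc, yc) in child_jobs.items()}
        root = root_future.result()
        children = {c: f.result() for c, f in child_futures.items()}
    return root, children


def predict(root, children, x):
    """Top-down inference: coarse decision first, then the matching child model."""
    c = root.predict_one(x)
    return c, children[c].predict_one(x)


# Hypothetical toy data: two features per flow, two tools, two apps per tool.
X = [[0.0, 0.0], [0.2, 0.1], [0.0, 2.0], [0.1, 2.2],
     [10.0, 10.0], [10.2, 10.1], [10.0, 12.0], [10.1, 12.2]]
coarse = ["toolA"] * 4 + ["toolB"] * 4
fine = ["appA1", "appA1", "appA2", "appA2",
        "appB1", "appB1", "appB2", "appB2"]

root, children = train_hierarchy(X, coarse, fine)
print(predict(root, children, [0.1, 1.9]))
print(predict(root, children, [10.0, 10.2]))
```

Threads are used here only to keep the sketch short; in CPython they do not speed up CPU-bound training, so a realistic setup would place each node's training on a separate process or cluster worker.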

Citation:

Pages: 2608-2619

Page count: 12

References: 31 in total

