SETA++: Real-Time Scalable Encrypted Traffic Analytics in Multi-Gbps Networks

Cited by: 12
Authors
Kattadige, Chamara [1]
Choi, Kwon Nung [1]
Wijesinghe, Achintha [2]
Nama, Arpit [1]
Thilakarathna, Kanchana [1]
Seneviratne, Suranga [1]
Jourjon, Guillaume [3]
Affiliations
[1] Univ Sydney, Sch Comp Sci, Sydney, NSW 2006, Australia
[2] Univ Moratuwa, Sch Elect & Telecommun, Moratuwa 10400, Sri Lanka
[3] CSIRO, Data61, Marsfield, NSW 2154, Australia
Source
IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT | 2021, Vol. 18, No. 3
Keywords
Streaming media; Cryptography; Real-time systems; Privacy; Encryption; Quality of experience; Estimation; Encrypted traffic; flow sampling; flow sketching; network measurements; side-channel attacks;
DOI
10.1109/TNSM.2021.3085097
Chinese Library Classification
TP [Automation and computer technology]
Subject classification code
0812
Abstract
The security and privacy of the end-users are a few of the most important components of a communication network. Though end-to-end encryption (e.g., TLS/SSL) fulfils this requirement, it makes inspecting network traffic with legacy solutions such as Deep Packet Inspection difficult. Recent Machine Learning techniques have shown outstanding performance in encrypted traffic classification. Nevertheless, such approaches require efficient flow sampling at real enterprise-scale networks due to the sheer volume of transferred data. Through this paper, we propose a holistic architecture to extract flow information of encrypted data at multi-Gbps line rate using sampling and sketching mechanisms, enabling network operators to estimate flow size distribution accurately and understand the behavior of VPN-obfuscated traffic. Using over 6000 video traffic traces, under three main evaluation scenarios based on trace duration and starting time point, we show that it is possible to achieve 99% accuracy for service provider classification and over 90% accuracy for content classification for a given service provider in the best case. We also deploy our solution at an operational enterprise-scale network leveraging kernel bypassing to demonstrate its capability to efficiently sample live traffic for analytics.
Pages: 3244-3259
Page count: 16