Clustering and evolutionary approach for longitudinal web traffic analysis

Cited by: 1
Authors
Morichetta, Andrea [1]
Mellia, Marco [1]
Affiliations
[1] Politecn Torino, Turin, Italy
Keywords
Big data; Clustering; Edit distance; Machine learning; Security; Traffic monitoring; NETWORK;
DOI
10.1016/j.peva.2019.102033
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812
Abstract
In recent years, data-driven approaches have attracted the interest of the research community. In network monitoring, unsupervised machine learning solutions such as clustering are particularly appealing because they let network analysts observe patterns and track the evolution of traffic over time. In this paper, we present a novel unsupervised methodology to automatically process and analyze batches of HTTP traffic, looking only at the URL structure. First, we describe IDBSCAN (Iterative-DBSCAN), which we design to obtain well-shaped clusters and to simplify the choice of parameters, often a cumbersome step for the network analyst. Second, we present LENTA (Longitudinal Exploration for Network Traffic Analysis), which automatically tracks the evolution of traffic over time, naturally highlighting trends and pinpointing anomalies. We first evaluate IDBSCAN and LENTA on synthetic data to compare their performance against well-known algorithms. We then apply them to a real case: the analysis of hundreds of thousands of URLs collected from a live network. Results show both the quality of the clusters produced by IDBSCAN and LENTA's ability to highlight changes in traffic, facilitating the analyst's job. (C) 2019 Elsevier B.V. All rights reserved.
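The abstract does not spell out how IDBSCAN works, so the following is only an illustrative sketch of the general idea it names: clustering URL strings by edit distance with a density-based algorithm that is re-run on the leftover points. Everything specific here is an assumption made for illustration and is not taken from the paper: the length-normalized Levenshtein distance, the `eps_steps` schedule, the `min_pts` value, and the rule of re-clustering only the noise points with a looser radius.

```python
from typing import List, Optional, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string (an assumption)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def dbscan(dist: List[List[float]], eps: float, min_pts: int) -> List[int]:
    """Plain DBSCAN on a precomputed distance matrix; -1 marks noise."""
    n = len(dist)
    labels: List[Optional[int]] = [None] * n
    cluster = -1
    for p in range(n):
        if labels[p] is not None:
            continue
        neighbors = [q for q in range(n) if dist[p][q] <= eps]
        if len(neighbors) < min_pts:
            labels[p] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[p] = cluster
        queue = [q for q in neighbors if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == -1:
                labels[q] = cluster  # noise reached from a core point
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbors = [r for r in range(n) if dist[q][r] <= eps]
            if len(q_neighbors) >= min_pts:
                queue.extend(q_neighbors)  # q is a core point: keep expanding
    return [label if label is not None else -1 for label in labels]


def iterative_dbscan(
    urls: Sequence[str],
    eps_steps: Sequence[float] = (0.2, 0.4),  # hypothetical radius schedule
    min_pts: int = 2,                          # hypothetical density threshold
) -> List[int]:
    """Run DBSCAN repeatedly, each time only on points still labeled noise."""
    labels = [-1] * len(urls)
    next_id = 0
    remaining = list(range(len(urls)))
    for eps in eps_steps:
        if not remaining:
            break
        dist = [
            [normalized_distance(urls[i], urls[j]) for j in remaining]
            for i in remaining
        ]
        sub = dbscan(dist, eps, min_pts)
        for local, idx in enumerate(remaining):
            if sub[local] != -1:
                labels[idx] = next_id + sub[local]
        next_id += max(sub, default=-1) + 1
        remaining = [idx for local, idx in enumerate(remaining) if sub[local] == -1]
    return labels


if __name__ == "__main__":
    sample_urls = [
        "/img/a1.png",
        "/img/a2.png",
        "/img/a3.png",
        "/api/v1/user?id=10",
        "/api/v1/user?id=11",
        "/zzzzzzzzzzzzzzzzzzzzzzzz",
    ]
    for url, label in zip(sample_urls, iterative_dbscan(sample_urls)):
        print(label, url)
```

In this toy run the three image paths and the two API paths fall into separate clusters, while the unrelated path stays labeled as noise. The pairwise distance matrix costs quadratic time in the number of URLs, so a sketch like this would need sampling or indexing before it could approach the hundreds of thousands of URLs mentioned in the abstract.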
Pages: 17