DBStream: A holistic approach to large-scale network traffic monitoring and analysis

Cited by: 18
Authors
Baer, Arian [1]
Casas, Pedro [2]
D'Alconzo, Alessandro [2]
Fiadino, Pierdomenico [3]
Golab, Lukasz [4]
Mellia, Marco [5]
Schikuta, Erich [6]
Affiliations
[1] FTW Forschungszentrum Telekommunikat Wien, Donau City St 1, A-1220 Vienna, Austria
[2] Austrian Inst Technol GmbH, AIT, Vienna, Austria
[3] EURECAT Technol Ctr Catalonia, Ave Diagonal 177,Planta 9, Barcelona 08018, Spain
[4] Univ Waterloo, 200 Univ Ave West, Waterloo, ON, Canada
[5] Politecn Torino, Corso Duca Abruzzi 24, I-10129 Turin, Italy
[6] Univ Vienna, Waehringerstr 29, A-1090 Vienna, Austria
Keywords
Network monitoring; Data stream warehouse; Machine-to-machine traffic; On-line traffic classification; Machine learning; Cellular networks; DEGRADATION; MAPREDUCE; YOUTUBE;
DOI
10.1016/j.comnet.2016.04.020
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In the last decade, many systems for the extraction of operational statistics from computer network interconnects have been designed and implemented. Those systems generate huge amounts of data of various formats and in various granularities, from packet level to statistics about whole flows. In addition, the complexity of Internet services has increased drastically with the introduction of cloud infrastructures, Content Delivery Networks (CDNs) and mobile Internet usage, and complexity will continue to increase in the future with the rise of Machine-to-Machine communication and ubiquitous wearable devices. Therefore, current and future network monitoring frameworks cannot rely only on information gathered at a single network interconnect, but must consolidate information from various vantage points distributed across the network. In this paper, we present DBStream, a holistic approach to large-scale network monitoring and analysis applications. After a precise system introduction, we show how its Continuous Execution Language (CEL) can be used to automate several data processing and analysis tasks typical for monitoring operational ISP networks. We discuss the performance of DBStream as compared to MapReduce processing engines and show how intelligent job scheduling can increase its performance even further. Furthermore, we show the versatility of DBStream by explaining how it has been integrated to import and process data from two passive network monitoring systems, namely METAWIN and Tstat. Finally, multiple examples of network monitoring applications are given, ranging from simple statistical analysis to more complex traffic classification tasks applying machine learning techniques using the Weka toolkit. (C) 2016 Elsevier B.V. All rights reserved.
Pages: 5-19
Number of pages: 15