Flowzilla: A Methodology for Detecting Data Transfer Anomalies in Research Networks

Cited by: 6
Authors
Giannakou, Anna [1]
Gunter, Daniel [1]
Peisert, Sean [1]
Affiliations
[1] Lawrence Berkeley Natl Lab, Data Sci & Technol Dept, Berkeley, CA 94720 USA
Source
PROCEEDINGS OF INDIS 2018: IEEE/ACM INNOVATING THE NETWORK FOR DATA-INTENSIVE SCIENCE (INDIS) | 2018
Keywords
network security; network performance measurement; anomaly detection
DOI
10.1109/INDIS.2018.00004
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Research networks are designed to support high-volume scientific data transfers that span multiple network links. Like any other network, research networks experience anomalies, which are deviations from profiles of normality in a research network's traffic levels. Diagnosing anomalies is critical for both network operators and users (e.g., scientists). In this paper we present Flowzilla, a general framework for detecting and quantifying anomalies in scientific data transfers of arbitrary size. Flowzilla incorporates Random Forest Regression (RFR) to predict the size of data transfers and uses an adaptive threshold mechanism to detect outliers. Our results demonstrate that our framework achieves up to 92.5% detection accuracy. Furthermore, we are able to predict data transfer sizes up to 10 weeks after training with accuracy above 90%.
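The abstract does not say how the adaptive threshold is computed, so the following is only a minimal sketch of one common way to flag outliers from prediction errors, not the authors' method. It assumes the predicted transfer sizes come from some regressor (such as a random forest) trained elsewhere, and it adapts the threshold as the mean plus `k` standard deviations of the absolute errors of the most recent unflagged transfers. The function name `detect_anomalies` and the parameters `window` and `k` are hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(actual, predicted, window=5, k=3.0):
    """Flag transfers whose prediction error exceeds an adaptive threshold.

    Assumption (not from the paper): the threshold for each transfer is
    mean + k * stdev of the absolute errors of the last `window`
    transfers that were not flagged as anomalous.
    """
    recent = deque(maxlen=window)  # errors of recent normal transfers
    flags = []
    for observed, expected in zip(actual, predicted):
        err = abs(observed - expected)
        if len(recent) < window:
            # Warm-up: too little history to form a threshold yet.
            flags.append(False)
            recent.append(err)
            continue
        threshold = mean(recent) + k * stdev(recent)
        is_anomaly = err > threshold
        flags.append(is_anomaly)
        if not is_anomaly:
            # Only normal errors update the baseline, so a burst of
            # anomalies does not inflate the threshold.
            recent.append(err)
    return flags


# Hypothetical transfer sizes (arbitrary units) against a flat prediction.
predicted_sizes = [100.0] * 8
actual_sizes = [101, 99, 102, 98, 100, 101, 160, 99]
flags = detect_anomalies(actual_sizes, predicted_sizes)
```

In this sketch only the seventh transfer, whose size deviates sharply from its prediction, is flagged. Excluding flagged errors from the baseline is a design choice that keeps the threshold tied to normal behaviour.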
Pages: 1-9
Page count: 9