Hunting attacks in the dark: clustering and correlation analysis for unsupervised anomaly detection

Cited by: 19
Authors
Mazel, Johan [1]
Casas, Pedro [2]
Fontugne, Romain [1]
Fukuda, Kensuke [1]
Owezarski, Philippe [3, 4]
Affiliations
[1] NII, Tokyo, Japan
[2] Telecommun Res Ctr Vienna FTW, Vienna, Austria
[3] CNRS, LAAS, F-31077 Toulouse 4, France
[4] Univ Toulouse, UPS, INSA, ISAE, UTI, UTM, LAAS, INP, F-31077 Toulouse 4, France
Keywords
unsupervised anomaly detection & characterization; clustering; outliers detection; anomaly correlation; filtering rules; MAWILab; BACKBONE NETWORKS; DIAGNOSIS; PCA
DOI
10.1002/nem.1903
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Network anomalies and attacks represent a serious challenge to ISPs, who need to cope with an increasing number of unknown events that put their networks' integrity at risk. Most of the network anomaly detection systems proposed so far employ a supervised strategy to accomplish their task, using either signature-based detection methods or supervised-learning techniques. The former fail to detect unknown anomalies, exposing the network to severe consequences; the latter require labeled traffic, which is difficult and expensive to produce. In this paper, we introduce a powerful unsupervised approach to detect and characterize network anomalies in the dark, that is, without relying on signatures or labeled traffic. Unsupervised detection is accomplished by means of robust clustering techniques, combining subspace clustering with correlation analysis to blindly identify anomalies. To alleviate network operators' post-processing tasks and to speed up the deployment of effective countermeasures, anomaly ranking and characterization are automatically performed on the detected events. The system is extensively tested with real traffic from the Widely Integrated Distributed Environment backbone network, spanning 6 years of flows captured from a trans-Pacific link between Japan and the USA, using the MAWILab framework for ground-truth generation. We additionally evaluate the proposed approach with synthetic data, consisting of traffic from an operational network with synthetic attacks. Finally, we compare the performance of the unsupervised detection against different previously used unsupervised detection techniques, as well as against multiple anomaly detectors used in MAWILab. Copyright (c) 2015 John Wiley & Sons, Ltd.
Pages: 283-305
Page count: 23