An Efficient Technique for Network Traffic Summarization using Multiview Clustering and Statistical Sampling

Cited by: 5
Authors
Ahmed, Mohiuddin [1 ]
Mahmood, Abdun Naser [1 ]
Maher, Michael J. [1 ]
Affiliations
[1] UNSW, Sch Engn & Informat Technol, Canberra, ACT, Australia
Keywords
Scalable Data Mining; Network Traffic Summarization; Multiview Clustering
DOI
10.4108/sis.2.5.e4
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
There is significant interest in the data mining and network management communities in efficiently analysing the huge amounts of network traffic generated even in small networks. Summarization is a primary data mining task that generates a concise yet informative summary of the given data, and creating such a summary from network traffic data is a research challenge. Existing clustering-based summarization techniques lack the ability to create a summary suitable for further data mining tasks such as anomaly detection, and they require the summary size as an external input. Additionally, for complex and high-dimensional network traffic datasets, there is often no single clustering solution that explains the structure of the given data. In this paper, we investigate the use of multiview clustering to efficiently create a meaningful summary from original data instances in network traffic data. We develop a mathematically sound approach to select the summary size using a sampling technique. We compare our proposed approach with regular clustering-based summarization that incorporates the summary size calculation method, and with a random approach. We validate our proposed approach using a benchmark network traffic dataset and state-of-the-art summary evaluation metrics.
Pages: 1 / 9
Page count: 9