Efficient classification using parallel and scalable compressed model and its application on intrusion detection

Cited by: 27
Authors
Chen, Tieming [1]
Zhang, Xu [1]
Jin, Shichao [2]
Kim, Okhee [2]
Affiliations
[1] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Zhejiang, Peoples R China
[2] Peking Univ, Sch Software Microelect, Beijing 100190, Peoples R China
Keywords
Compressed model; MapReduce; Parallelization; Classification; Intrusion detection; SUPPORT VECTOR MACHINES; DETECTION SYSTEM; MAPREDUCE
DOI
10.1016/j.eswa.2014.04.009
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To achieve highly efficient classification in intrusion detection, this paper proposes a compressed model that combines horizontal compression with vertical compression. OneR is used as the horizontal compression step for attribute reduction, and affinity propagation is used as the vertical compression step to select a small set of representative exemplars from large training data. To compress larger volumes of training data scalably, a MapReduce-based parallelization approach is implemented and evaluated for each step of the compression process, after which common but efficient classification methods can be used directly. Experiments on two publicly available intrusion detection datasets, KDD99 and CMDC2012, demonstrate that classification using the proposed compressed model can speed up detection by up to 184 times, at the cost of an accuracy difference of less than 1% on average. (C) 2014 Elsevier Ltd. All rights reserved.
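The horizontal compression step can be illustrated with a minimal single-machine sketch of OneR-style attribute ranking. This is not the authors' implementation: the abstract does not say how attributes are scored or how many are kept, so the function names (`oner_accuracy`, `rank_attributes`, `compress_horizontally`), the `keep` parameter, and the toy records are all assumptions made for illustration. The sketch handles categorical attributes only (continuous ones would need discretizing first) and covers neither the MapReduce parallelization nor the affinity propagation step.

```python
from collections import Counter, defaultdict


def oner_accuracy(values, labels):
    """Training accuracy of a one-attribute rule that predicts,
    for each attribute value, the majority class seen with it."""
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1
    correct = sum(c.most_common(1)[0][1] for c in counts.values())
    return correct / len(labels)


def rank_attributes(rows, labels):
    """Return (accuracy, attribute_index) pairs, best attribute first.
    Ties are broken by the lower attribute index."""
    n_attrs = len(rows[0])
    scores = [
        (oner_accuracy([row[j] for row in rows], labels), j)
        for j in range(n_attrs)
    ]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


def compress_horizontally(rows, labels, keep):
    """Keep only the `keep` best-ranked attributes (hypothetical cutoff)."""
    kept = sorted(j for _, j in rank_attributes(rows, labels)[:keep])
    return kept, [[row[j] for j in kept] for row in rows]


# Toy records (made up): protocol, flag, service.
rows = [
    ("tcp", "SF", "http"),
    ("tcp", "S0", "http"),
    ("udp", "SF", "dns"),
    ("tcp", "S0", "ftp"),
    ("udp", "SF", "dns"),
    ("tcp", "SF", "ftp"),
]
labels = ["normal", "attack", "normal", "attack", "normal", "normal"]

ranking = rank_attributes(rows, labels)
kept, reduced = compress_horizontally(rows, labels, keep=1)
```

On this toy data the flag attribute (index 1) separates the two classes perfectly, so it ranks first and is the only column kept when `keep=1`.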
Pages: 5972-5983
Page count: 12