Feature Selection Algorithm for Intrusions Detection System using Sequential Forward Search and Random Forest Classifier

Cited by: 16
Authors
Lee, Jinlee [1]
Park, Dooho [2]
Lee, Changhoon [1]
Affiliations
[1] Konkuk Univ, Div Comp Sci & Engn, Seoul, South Korea
[2] XIIlab Co Ltd, Intelligent Serv Dev, Seoul, South Korea
Keywords
Feature selection; SFFS; Random forest; IDS
DOI
10.3837/tiis.2017.10.024
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cyber attacks are evolving in step with recent developments in information security technology. Intrusion detection systems collect various types of data from computers and networks to detect security threats and analyze attack information. The large volume of data examined leads to heavy computation and low detection rates. Feature selection is expected to improve classification performance and provide faster, more cost-effective results. Despite the many feature selection studies conducted for intrusion detection systems, feature selection remains difficult to automate because it relies on the knowledge of security experts. This paper proposes a feature selection technique to overcome the performance problems of intrusion detection systems. The first phase of the proposed system constructs a feature subset using a sequential forward floating search (SFFS) to reduce the dimensionality of the feature space. The second phase builds a classification model from the selected feature subset using a random forest classifier (RFC) and evaluates its classification accuracy. Experiments were conducted on the NSL-KDD dataset using SFFS-RF, and the results indicated that feature selection is a necessary preprocessing step for improving overall performance in systems that handle large datasets. They also verified that SFFS-RF could be used for data classification. In conclusion, SFFS-RF could be the key to improving classification model performance in machine learning.
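The two-phase procedure described in the abstract can be illustrated with a minimal sketch of a sequential forward floating search. This is not the authors' implementation: the function names `sffs` and `toy_score`, the `WEIGHTS` table, and the redundancy penalty are all hypothetical, and the scoring function is a toy stand-in for what the paper describes as the classification accuracy of a random forest trained on the candidate subset.

```python
from typing import Callable, Dict, FrozenSet, Sequence, Tuple

# Hypothetical per-feature usefulness values, invented for illustration only.
# The names are NSL-KDD column names; the numbers are not measured results.
WEIGHTS = {"duration": 0.1, "src_bytes": 0.4, "dst_bytes": 0.3, "flag": 0.2}


def toy_score(subset: FrozenSet[str]) -> float:
    """Stand-in for classifier accuracy (assumption, not the paper's metric)."""
    total = sum(WEIGHTS[f] for f in sorted(subset))
    # Penalise a pair of features that this toy treats as redundant.
    if {"src_bytes", "dst_bytes"} <= subset:
        total -= 0.35
    return total


def sffs(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    k: int,
) -> Tuple[float, FrozenSet[str]]:
    """Return (score, subset) for the best subset of size k found by SFFS."""
    selected: FrozenSet[str] = frozenset()
    best: Dict[int, Tuple[float, FrozenSet[str]]] = {}
    while len(selected) < k:
        # Forward step: add the single feature that raises the score most.
        candidates = [f for f in features if f not in selected]
        to_add = max(candidates, key=lambda f: score(selected | {f}))
        selected = selected | {to_add}
        current = score(selected)
        if len(selected) not in best or current > best[len(selected)][0]:
            best[len(selected)] = (current, selected)
        # Floating step: drop features while doing so strictly beats the
        # best subset already recorded at the smaller size.
        while len(selected) > 2:
            to_drop = max(sorted(selected), key=lambda f: score(selected - {f}))
            reduced = selected - {to_drop}
            reduced_score = score(reduced)
            if reduced_score > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (reduced_score, reduced)
            else:
                break
    return best[k]


if __name__ == "__main__":
    print(sffs(list(WEIGHTS), toy_score, 2))
```

To approximate the second phase, `toy_score` could be replaced by a function that trains a random forest on the candidate columns and returns a cross-validated accuracy. That substitution is an assumption about how the pieces fit together and is not something the abstract specifies.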
Pages: 5112-5128
Page count: 17
Related papers
19 in total
[1]   A proposed HTTP service based IDS [J].
Abd-Eldayem, Mohamed M. .
EGYPTIAN INFORMATICS JOURNAL, 2014, 15 (01) :13-24
[2]  
Baumann F, 2013, LECT NOTES COMPUT SC, V7944, P131
[3]   Random forests [J].
Breiman, L .
MACHINE LEARNING, 2001, 45 (01) :5-32
[4]   A survey on feature selection methods [J].
Chandrashekar, Girish ;
Sahin, Ferat .
COMPUTERS & ELECTRICAL ENGINEERING, 2014, 40 (01) :16-28
[5]  
de la Hoz E, 2013, LECT NOTES COMPUT SC, V8073, P103, DOI 10.1007/978-3-642-40846-5_11
[6]  
Eid HF, 2013, COMM COM INF SC, V381, P240
[7]  
Eid HF, 2011, COMM COM INF SC, V259, P195
[8]  
Frank A., 2010, UCI MACHINE LEARNING
[9]  
Hall M. A., 1999, Proceedings of the Twelfth International Florida AI Research Society Conference, P235
[10]  
Huan Liu, 1996, Machine Learning. Proceedings of the Thirteenth International Conference (ICML '96), P319