Building an efficient intrusion detection system based on feature selection and ensemble classifier

Cited by: 340
Authors
Zhou, Yuyang [1, 2, 3]
Cheng, Guang [1, 2, 3]
Jiang, Shanqing [1, 4]
Dai, Mian [1, 2, 3]
Affiliations
[1] Southeast Univ, Sch Cyber Sci & Engn, Nanjing, Peoples R China
[2] Minist Educ, Key Lab Comp Network & Informat Integrat, Nanjing, Peoples R China
[3] Southeast Univ, Jiangsu Prov Key Lab Comp Network Technol, Nanjing, Peoples R China
[4] Natl Key Lab Sci & Technol Informat Syst Secur, Beijing, Peoples R China
Keywords
Cyber security; Intrusion detection system; Data mining; Feature selection; Ensemble classifier; ALGORITHM; FOREST; MODEL; ATTACKS; IDS
DOI
10.1016/j.comnet.2020.107247
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
An intrusion detection system (IDS) is one of the extensively used techniques in a network topology for safeguarding the integrity and availability of sensitive assets in the protected systems. Although many supervised and unsupervised machine learning approaches have been used to increase the efficacy of IDSs, existing intrusion detection algorithms still struggle to achieve good performance. First, the large amount of redundant and irrelevant data in high-dimensional datasets interferes with the classification process of an IDS. Second, an individual classifier may not perform well in detecting every type of attack. Third, many models are built for stale datasets, making them less adaptable to novel attacks. We therefore propose a new intrusion detection framework based on feature selection and ensemble learning techniques. In the first step, a heuristic algorithm called CFS-BA is proposed for dimensionality reduction, which selects the optimal subset based on the correlation between features. Then, we introduce an ensemble approach that combines the C4.5, Random Forest (RF), and Forest by Penalizing Attributes (Forest PA) algorithms. Finally, a voting technique is used to combine the probability distributions of the base learners for attack recognition. Experimental results on the NSL-KDD, AWID, and CIC-IDS2017 datasets show that the proposed CFS-BA-Ensemble method performs better than other related and state-of-the-art approaches under several metrics.
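The two ideas named in the abstract, correlation-based subset scoring and probability voting, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard correlation-based feature selection (CFS) merit heuristic and a plain average-of-probabilities voting rule, which the abstract does not spell out. The bat-algorithm search over subsets is omitted, and the function names `cfs_merit` and `average_of_probabilities` are hypothetical.

```python
from math import sqrt


def cfs_merit(feature_class_corr, feature_feature_corr):
    """Score a feature subset with the standard CFS merit heuristic.

    feature_class_corr: absolute feature-to-class correlations, one per feature.
    feature_feature_corr: absolute correlations, one per distinct feature pair.
    Assumption: the correlations are precomputed by the caller.
    """
    k = len(feature_class_corr)
    if k == 0:
        return 0.0
    mean_fc = sum(feature_class_corr) / k
    # A single-feature subset has no pairs, so redundancy is zero.
    mean_ff = (
        sum(feature_feature_corr) / len(feature_feature_corr)
        if feature_feature_corr
        else 0.0
    )
    # High relevance raises the merit; high redundancy lowers it.
    return k * mean_fc / sqrt(k + k * (k - 1) * mean_ff)


def average_of_probabilities(distributions):
    """Combine per-classifier class distributions by averaging them.

    distributions: list of dicts mapping class label -> probability,
    one dict per base learner (for example C4.5, RF, Forest PA).
    Returns the winning label and the averaged distribution.
    """
    n = len(distributions)
    labels = distributions[0].keys()
    averaged = {c: sum(d[c] for d in distributions) / n for c in labels}
    winner = max(averaged, key=averaged.get)
    return winner, averaged


# Illustrative, made-up probabilities from three hypothetical base learners.
votes = [
    {"normal": 0.6, "attack": 0.4},
    {"normal": 0.3, "attack": 0.7},
    {"normal": 0.2, "attack": 0.8},
]
label, averaged = average_of_probabilities(votes)
print(label, averaged)
```

In this sketch a subset of two perfectly redundant features scores no higher than either feature alone, which is the behaviour a correlation-based selector relies on to discard redundant attributes.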
Pages: 17