Supervised feature selection techniques in network intrusion detection: A critical review

Cited by: 112
Authors
Di Mauro, M. [1]
Galatro, G. [2]
Fortino, G. [3]
Liotta, A. [4]
Affiliations
[1] Univ Salerno, Dept Informat & Elect Engn & Appl Math DIEM, I-84084 Fisciano, Italy
[2] Amazon AWS, Belgard Retail Pk, Dublin, Ireland
[3] Univ Calabria, Dept Informat Modeling Elect & Syst, Calabria, Italy
[4] Free Univ Bozen Bolzano, Fac Comp Sci, Bolzano, Italy
Keywords
Feature selection; Machine learning; Network intrusion detection; Network performance; IMAGE RECOGNITION; ANOMALY DETECTION; SCATTER SEARCH; ROUGH SET; CLASSIFICATION; ONLINE; FILTER; OPTIMIZATION; ALGORITHMS; ATTACKS;
DOI
10.1016/j.engappai.2021.104216
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Machine Learning (ML) techniques are becoming invaluable for network intrusion detection, especially for revealing anomalous flows, which often hide cyber-threats. Typically, ML algorithms are used to classify or recognize data traffic on the basis of statistical features such as inter-arrival times, packet length distribution, and mean number of flows. Dealing with the vast diversity and number of features that typically characterize data traffic is a hard problem, and it raises two issues: (i) a large number of features leads to lengthy training (particularly when features are highly correlated), while prediction accuracy does not improve proportionally; (ii) some features may introduce bias during classification, particularly those only weakly related to the traffic being classified. By reducing the feature space and retaining only the most significant features, Feature Selection (FS) therefore becomes a crucial pre-processing step in network management and, specifically, in network intrusion detection. In this review paper, we complement other surveys in multiple ways: (i) evaluating more recent datasets (rather than the obsolete KDD 99) by means of a Python-based procedure designed from scratch; (ii) providing a synopsis of the most credited FS approaches in the field of intrusion detection, including Multi-Objective Evolutionary techniques; (iii) reporting experimental analyses of feature correlation, time complexity, and performance. Our comparisons offer useful guidelines to network and security managers who are considering incorporating ML concepts into network intrusion detection, where trade-offs between performance and resource consumption are crucial.
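To make the idea of reducing the feature space concrete, the following is a minimal sketch of one simple supervised filter: rank features by their absolute Pearson correlation with the class label, and skip any feature that is highly correlated with one already kept. It is an illustration under stated assumptions, not the procedure used in the paper. The feature names (`mean_iat`, `mean_iat_copy`, `pkt_len`, `noise`), the synthetic data, and the 0.9 redundancy threshold are all hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / (dev_x * dev_y)


def select_features(columns, labels, k, redundancy_threshold=0.9):
    """Greedy filter: keep up to k features, most label-relevant first,
    skipping features that are redundant with an already kept one."""
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], labels)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        if len(kept) == k:
            break
        redundant = any(
            abs(pearson(columns[name], columns[other])) >= redundancy_threshold
            for other in kept
        )
        if not redundant:
            kept.append(name)
    return kept


# Synthetic toy data (hypothetical): label 1 = anomalous flow, 0 = benign.
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(500)]
mean_iat = [y + rng.gauss(0, 0.3) for y in labels]   # strongly informative
columns = {
    "mean_iat": mean_iat,
    "mean_iat_copy": [2 * v + 1 for v in mean_iat],  # redundant duplicate
    "pkt_len": [y + rng.gauss(0, 1.0) for y in labels],  # weakly informative
    "noise": [rng.gauss(0, 1.0) for _ in labels],        # unrelated to label
}

selected = select_features(columns, labels, k=2)
print(selected)
```

In this toy setting the filter keeps only one of the two perfectly correlated columns, which mirrors the point above that highly correlated features lengthen training without a proportional gain in accuracy.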
Pages: 15