Automatic Dataset Labelling and Feature Selection for Intrusion Detection Systems

Cited by: 17
Authors
Aparicio-Navarro, Francisco J. [1]
Kyriakopoulos, Konstantinos G. [1]
Parish, David J. [1]
Affiliations
[1] Univ Loughborough, Sch Elect Elect & Syst Engn, Loughborough LE11 3TU, Leics, England
Source
2014 IEEE MILITARY COMMUNICATIONS CONFERENCE: AFFORDABLE MISSION SUCCESS: MEETING THE CHALLENGE (MILCOM 2014) | 2014
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Automatic Labelling; Network Traffic Labelling; Unsupervised Anomaly IDS; Feature Selection; Genetic Algorithm
DOI
10.1109/MILCOM.2014.17
Chinese Library Classification
TN [Electronic technology, communication technology]
Discipline classification code
0809
Abstract
Correctly labelled datasets are commonly required. Three scenarios highlight this need. First, supervised Intrusion Detection Systems (IDSs) need labelled datasets for training. Second, the real nature of the analysed datasets must be known when evaluating the efficiency of an IDS at detecting intrusions. A third scenario is the use of feature selection techniques that work only if the processed datasets are labelled. Under normal conditions, collecting labelled datasets from real networks is impossible. Currently, datasets are mainly labelled through off-line forensic analysis, which is impractical because it does not allow real-time implementation. We have developed a novel approach that automatically generates labelled network traffic datasets using an unsupervised anomaly-based IDS. The resulting labelled datasets are subsets of the original unlabelled datasets. The labelled dataset is then processed by a Genetic Algorithm (GA) based approach that performs feature selection. The GA has been implemented to automatically provide the set of metrics that generates the most appropriate intrusion detection results.
Pages: 46-51
Number of pages: 6