Data preprocessing for anomaly based network intrusion detection: A review

Cited by: 173
Authors
Davis, Jonathan J. [1]
Clark, Andrew J. [2]
Affiliations
[1] DSTO, Div C3I, Edinburgh, SA 5111, Australia
[2] Queensland Univ Technol, Informat Secur Inst, Brisbane, Qld 4001, Australia
Keywords
Data preprocessing; Network intrusion; Anomaly detection; Data mining; Feature construction; Feature selection; SYSTEM;
DOI
10.1016/j.cose.2011.05.008
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Data preprocessing is widely recognized as an important stage in anomaly detection. This paper reviews the data preprocessing techniques used by anomaly-based network intrusion detection systems (NIDS), concentrating on which aspects of the network traffic are analyzed, and what feature construction and selection methods have been used. Motivation for the paper comes from the large impact data preprocessing has on the accuracy and capability of anomaly-based NIDS. The review finds that many NIDS limit their view of network traffic to the TCP/IP packet headers. Time-based statistics can be derived from these headers to detect network scans, network worm behavior, and denial of service attacks. A number of other NIDS perform deeper inspection of request packets to detect attacks against network services and network applications. More recent approaches analyze full service responses to detect attacks targeting clients. The review covers a wide range of NIDS, highlighting which classes of attack are detectable by each of these approaches. Data preprocessing is found to predominantly rely on expert domain knowledge for identifying the most relevant parts of network traffic and for constructing the initial candidate set of traffic features. On the other hand, automated methods have been widely used for feature extraction to reduce data dimensionality, and feature selection to find the most relevant subset of features from this candidate set. The review shows a trend toward deeper packet inspection to construct more relevant features through targeted content parsing. These context sensitive features are required to detect current attacks. Crown Copyright (C) 2011 Published by Elsevier Ltd. All rights reserved.
Pages: 353-375
Number of pages: 23