Handling nominal features in anomaly intrusion detection problems

Cited by: 43
Authors
Shyu, ML [1]
Sarinnapakorn, K [1]
Kuruppu-Appuhamilage, I [1]
Chen, SC [1]
Chang, LW [1]
Goldring, T [1]
Affiliations
[1] Univ Miami, Dept Elect & Comp Engn, Coral Gables, FL 33124 USA
Source
15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications, Proceedings | 2005
Keywords
anomaly detection; intrusion detection; indicator variables; multiple correspondence analysis; nominal features; principal component classifier; novelty detection
DOI
10.1109/RIDE.2005.10
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Computer network data streams used in intrusion detection usually involve many data types. A common data type is the symbolic, or nominal, feature. Whether or not they are coded as numerical values, nominal features need to be treated differently from numeric features. This paper studies the effectiveness of two approaches to handling nominal features: a simple coding scheme that uses indicator variables, and a scaling method based on multiple correspondence analysis (MCA). In particular, we apply these techniques with two anomaly detection methods: the principal component classifier (PCC) and the Canberra metric. Experiments with the KDD 1999 data demonstrate that MCA works better than the indicator variable approach for both detection methods, with the PCC performing far better than the Canberra metric.
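To make the indicator variable approach and the Canberra metric named in the abstract concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' implementation: the function names `indicator_encode` and `canberra`, the sample `protocols` values, and the choice to skip 0/0 terms in the distance are all assumptions, and the abstract does not specify how the paper applies either technique.

```python
def indicator_encode(values):
    """Code a nominal feature as 0/1 indicator columns, one per category.

    Hypothetical helper: categories are ordered alphabetically here,
    which is an arbitrary choice made for reproducibility.
    """
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, rows


def canberra(x, y):
    """Canberra distance: sum of |x_i - y_i| / (|x_i| + |y_i|).

    Terms where both coordinates are zero are skipped (treated as 0),
    a common convention that is assumed here.
    """
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


# Illustrative nominal feature (made-up values, not taken from the paper).
protocols = ["tcp", "udp", "tcp", "icmp"]
categories, encoded = indicator_encode(protocols)

# Distance between two records with the same category, then with different ones.
same = canberra(encoded[0], encoded[2])
different = canberra(encoded[0], encoded[1])
```

With indicator coding, any two records whose categories differ are the same Canberra distance apart, because exactly two indicator positions disagree. The coding therefore carries no information about how similar two categories are, which is one plausible reason to consider a scaling method such as MCA instead.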
Pages: 55-62
Number of pages: 8