Feature selection and classification in multiple class datasets: An application to KDD Cup 99 dataset

Cited by: 124
Authors
Bolon-Canedo, V. [1 ]
Sanchez-Marono, N. [1 ]
Alonso-Betanzos, A. [1 ]
Affiliations
[1] Univ A Coruna, Dept Comp Sci, Lab Res & Dev Artificial Intelligence LIDIA, La Coruna 15071, Spain
Keywords
Feature selection; Filters; Classification; Discretization; KDD Cup 99 dataset; INTRUSION DETECTION; DISCRETIZATION; MULTICLASS; ENSEMBLE;
DOI
10.1016/j.eswa.2010.11.028
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, a new method consisting of a combination of discretizers, filters and classifiers is presented. Its aim is to improve the performance of classifiers while using a significantly reduced set of features. The method has been applied to a binary and to a multiple class classification problem. Specifically, the KDD Cup 99 benchmark was used to test its effectiveness. A comparative study with other methods and the KDD winner was carried out. The results showed the adequacy of the proposed method, which achieved better performance in most cases while reducing the number of features by more than 80%. (C) 2010 Elsevier Ltd. All rights reserved.
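The abstract describes a three-stage chain: discretize the features, filter them, then classify on the reduced set. The sketch below illustrates that general shape only. The abstract does not name the specific discretizers, filters, or classifiers, so equal-width binning, a mutual-information ranking, and a categorical naive Bayes classifier are stand-ins chosen for brevity. The function names and the toy data are hypothetical and are not taken from the paper or from the KDD Cup 99 dataset.

```python
import math
from collections import Counter, defaultdict


def discretize(column, n_bins=3):
    """Equal-width discretization of one numeric column into bin indices."""
    lo, hi = min(column), max(column)
    width = (hi - lo) / n_bins or 1.0  # constant column: avoid dividing by zero
    return [min(int((v - lo) / width), n_bins - 1) for v in column]


def mutual_information(xs, ys):
    """Mutual information (in nats) between a discrete feature and the labels."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log((c * n) / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def select_features(columns, labels, keep):
    """Filter step: keep the indices of the `keep` highest-scoring columns."""
    scores = [mutual_information(col, labels) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


def train_naive_bayes(rows, labels, n_bins):
    """Count class frequencies and per-class bin frequencies for each feature."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> bin counts
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[(j, y)][v] += 1
    return class_counts, value_counts, n_bins


def predict(model, row):
    """Return the class with the highest Laplace-smoothed log posterior."""
    class_counts, value_counts, n_bins = model
    total = sum(class_counts.values())

    def log_posterior(y):
        score = math.log(class_counts[y] / total)
        for j, v in enumerate(row):
            score += math.log((value_counts[(j, y)][v] + 1)
                              / (class_counts[y] + n_bins))
        return score

    return max(class_counts, key=log_posterior)


# Invented toy data: two informative columns followed by two noise columns.
raw = [
    [0.1, 10.0, 5.0, 1.0],
    [0.2, 12.0, 1.0, 9.0],
    [0.3, 11.0, 9.0, 5.0],
    [0.2, 13.0, 5.0, 9.0],
    [5.0, 90.0, 1.0, 1.0],
    [5.5, 95.0, 9.0, 5.0],
    [6.0, 88.0, 5.0, 5.0],
    [5.8, 92.0, 1.0, 9.0],
]
labels = ["normal"] * 4 + ["attack"] * 4
N_BINS = 3

# Stage 1: discretize every column. Stage 2: filter. Stage 3: classify.
columns = [discretize(list(col), N_BINS) for col in zip(*raw)]
selected = select_features(columns, labels, keep=2)
reduced_rows = [[columns[j][i] for j in selected] for i in range(len(raw))]
model = train_naive_bayes(reduced_rows, labels, N_BINS)

# Resubstitution check on the training rows only, not a real evaluation.
predictions = [predict(model, row) for row in reduced_rows]
```

On this toy input the filter keeps half of the columns. A real experiment would fit the bin boundaries on training data only and score the classifier on a held-out test set.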
Pages: 5947-5957
Page count: 11