GP ensembles for large-scale data classification

Cited by: 40
Authors
Folino, Gianluigi [1 ]
Pizzuti, Clara [1 ]
Spezzano, Giandomenico [1 ]
Affiliations
[1] CNR, ICAR, I-87036 Cosenza, Italy
Keywords
bagging; boosting; classification; data mining; genetic programming (GP)
DOI
10.1109/TEVC.2005.863627
CLC number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
An extension of cellular genetic programming for data classification (CGPC) to induce an ensemble of predictors is presented. Two algorithms implementing the bagging and boosting techniques are described and compared with CGPC. The approach is able to deal with large data sets that do not fit in main memory, since each classifier is trained on a subset of the overall training data. The predictors are then combined to classify new tuples. Experiments on several data sets show that, by using a training set of reduced size, better classification accuracy can be obtained at a much lower computational cost.
Pages: 604-616
Page count: 13
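
The abstract's core mechanism (an ensemble whose members are each trained on a small subset of the data, with their predictions combined by vote to label new tuples) can be sketched in a few lines of Python. This is a minimal sketch, not the authors' CGPC/BagCGPC implementation: scikit-learn decision trees stand in for the evolved GP classifiers, and the names train_bagged_ensemble, n_members, and subset_frac are illustrative assumptions, not identifiers from the paper.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def train_bagged_ensemble(X, y, n_members=10, subset_frac=0.2):
    # Each member is fit on its own bootstrap subset, so no single
    # learner ever needs the full training set in memory.
    members = []
    size = max(1, int(subset_frac * len(X)))
    for _ in range(n_members):
        idx = rng.choice(len(X), size=size, replace=True)
        members.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))
    return members

def predict_majority(members, X):
    # Combine the predictors: each member votes, the majority label wins.
    votes = np.stack([m.predict(X) for m in members])  # shape (n_members, n_tuples)
    return np.array([np.bincount(col).argmax() for col in votes.T])

# Toy usage: two well-separated Gaussian classes.
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(4, 1, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
ensemble = train_bagged_ensemble(X, y)
print(predict_majority(ensemble, np.array([[0.0, 0.0], [4.0, 4.0]])))  # -> [0 1]

A boosting variant would instead reweight the sampling toward tuples the current ensemble misclassifies; the subset sampling above is what lets each member's training data fit in main memory.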