Applying Ant Colony Optimization to configuring stacking ensembles for data mining

Cited by: 56
Authors
Chen, Yijun [1]
Wong, Man-Leung [1]
Li, Haibing [1]
Affiliations
[1] Lingnan Univ, Dept Comp & Decis Sci, Tuen Mun, Hong Kong, Peoples R China
Keywords
ACO; Ensemble; Stacking; Metaheuristics; Data mining; Direct marketing; Classification; Selection
DOI
10.1016/j.eswa.2013.10.063
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An ensemble is a collective decision-making system that applies a strategy to combine the predictions of learned classifiers into a single prediction for new instances. Early research has shown, both empirically and theoretically, that ensemble classifiers can in most cases be more accurate than any single component classifier. Although many ensemble approaches have been proposed, finding a suitable ensemble configuration for a specific dataset remains difficult. In some early works, the ensemble was selected manually according to the experience of specialists. Metaheuristic methods offer an alternative way to find configurations, and Ant Colony Optimization (ACO) is one popular metaheuristic. In this work, we propose a new ensemble construction method that applies ACO to the stacking ensemble construction process to generate domain-specific configurations. A number of experiments are performed to compare the proposed approach with some well-known ensemble methods on 18 benchmark data mining datasets. The approach is also applied to learning ensembles for a real-world cost-sensitive data mining problem. The experimental results show that the new approach can generate better stacking ensembles. (C) 2013 Elsevier Ltd. All rights reserved.
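The general idea in the abstract, letting an ant colony search over ensemble configurations, can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given in this record. It is a generic binary ACO in which each ant decides, per candidate base classifier, whether to include it. The names `aco_select`, `toy_fitness`, and `GAINS`, the parameter values, and the scoring rule are all hypothetical; in particular, `toy_fitness` is only a stand-in for the validation performance of a trained stacking ensemble.

```python
import random

# Hypothetical per-classifier gains; a stand-in for real validation results.
GAINS = [0.30, 0.05, 0.25, 0.02, 0.20]

def toy_fitness(choice):
    """Stand-in score for a configuration (assumption, not from the paper).

    choice[i] == 1 means candidate base classifier i is included.
    Each included classifier adds its gain but costs 0.08 (complexity penalty).
    The score is always positive, which the pheromone deposit relies on.
    """
    return 0.4 + sum(g - 0.08 for g, c in zip(GAINS, choice) if c)

def aco_select(n_items, fitness, n_ants=10, n_iters=40,
               rho=0.2, tau_min=0.05, seed=0):
    """Generic binary ACO: returns (best configuration, its score)."""
    rng = random.Random(seed)
    # tau[i][c]: pheromone on option c (0 = exclude, 1 = include) for item i.
    tau = [[1.0, 1.0] for _ in range(n_items)]
    best, best_score = None, float("-inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each ant samples every decision in proportion to its pheromone.
            choice = [1 if rng.random() < t[1] / (t[0] + t[1]) else 0
                      for t in tau]
            if not any(choice):
                # An ensemble needs at least one base classifier.
                choice[rng.randrange(n_items)] = 1
            score = fitness(choice)
            if score > best_score:
                best, best_score = choice, score
        for i, t in enumerate(tau):
            # Evaporate both options, keeping a floor so neither dies out.
            for c in (0, 1):
                t[c] = max((1 - rho) * t[c], tau_min)
            # Reinforce the options used by the best configuration so far.
            t[best[i]] += rho * best_score
    return best, best_score

best, best_score = aco_select(len(GAINS), toy_fitness)
print(best, best_score)
```

In a real setting, the fitness function would train and evaluate a stacking ensemble built from the selected base classifiers, for example with cross-validation, and the search space could also cover the choice of meta-classifier. Those choices are left open here because the record does not specify them.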
Pages: 2688-2702
Page count: 15