A Survey on Ensemble Learning for Data Stream Classification

Cited by: 380
Authors
Gomes, Heitor Murilo [1, 3]
Barddal, Jean Paul [1, 3]
Enembreck, Fabricio [1, 3]
Bifet, Albert [2, 4]
Affiliations
[1] Pontificia Univ Catolica Parana, Curitiba, Parana, Brazil
[2] Univ Paris Saclay, Telecom ParisTech, Inst Mines Telecom, Paris, France
[3] Imaculada Conceicao St 1155, Curitiba, Parana, Brazil
[4] 46 Rue Barrault, Paris, France
Keywords
Ensemble learning; supervised learning; data stream classification; evolving data; concept drift; classifiers; accuracy; diversity; algorithm; systems; vote
DOI
10.1145/3054925
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Ensemble-based methods are among the most widely used techniques for data stream classification. Their popularity is attributable to their good performance in comparison to strong single learners while being relatively easy to deploy in real-world applications. Ensemble algorithms are especially useful for data stream learning as they can be integrated with drift detection algorithms and incorporate dynamic updates, such as selective removal or addition of classifiers. This work proposes a taxonomy for data stream ensemble learning as derived from reviewing over 60 algorithms. Important aspects such as combination, diversity, and dynamic updates are thoroughly discussed. Additional contributions include a listing of popular open-source tools and a discussion about current data stream research challenges and how they relate to ensemble learning (big data streams, concept evolution, feature drifts, temporal dependencies, and others).
Pages: 36