Using a Genetic Algorithm to optimize a stacking ensemble in data streaming scenarios

Cited by: 5
Authors
Ramos, Diogo [1]
Carneiro, Davide [1, 2]
Novais, Paulo [1]
Affiliations
[1] Inst Politecn Porto, Escola Super Tecnol & Gestao, CIICESI, Porto, Portugal
[2] Univ Minho, Algoritmi Ctr, Dept Informat, Braga, Portugal
Keywords
Genetic algorithms; random forest; stacking ensemble; optimization; RANDOM FORESTS; RECOGNITION;
DOI
10.3233/AIC-200648
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The requirements of Machine Learning applications are changing rapidly. Machine Learning models need to deal with increasing volumes of data, and need to do so faster, as responses are increasingly expected in real time. In addition, sources of data are becoming more and more dynamic, with patterns that change more frequently. This calls for new approaches and algorithms that can deal efficiently with these challenges. In this paper we propose the use of a Genetic Algorithm to optimize a Stacking Ensemble specifically developed for streaming scenarios. A pool of solutions is maintained in which each solution represents a distribution of weights in the ensemble. The Genetic Algorithm continuously optimizes these weights to minimize the cost function. Moreover, new models are added at regular intervals, trained on more recent data. These models eventually replace older and less accurate ones, making the ensemble adapt continuously to changes in the distribution of the data.
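The weight-optimization loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the cost function, the genetic operators, or the replacement policy, so the mean squared error, the elitist selection, the arithmetic crossover, the Gaussian mutation, the rule of replacing the model with the lowest mean weight, and every function name below are assumptions made for illustration. Base models are represented only by their predictions on a fixed window of data.

```python
import random


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total == 0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def ensemble_cost(weights, predictions, targets):
    """Assumed cost: mean squared error of the weighted combination."""
    error = 0.0
    for i, target in enumerate(targets):
        combined = sum(w * preds[i] for w, preds in zip(weights, predictions))
        error += (combined - target) ** 2
    return error / len(targets)


def evolve(population, predictions, targets, rng, mutation_rate=0.2, scale=0.1):
    """One generation: keep the better half, refill by crossover and mutation."""
    ranked = sorted(population, key=lambda w: ensemble_cost(w, predictions, targets))
    survivors = ranked[: max(2, len(ranked) // 2)]
    children = []
    while len(survivors) + len(children) < len(population):
        a, b = rng.sample(survivors, 2)
        child = [(x + y) / 2 for x, y in zip(a, b)]  # arithmetic crossover
        child = [
            max(0.0, w + rng.gauss(0, scale)) if rng.random() < mutation_rate else w
            for w in child
        ]
        children.append(normalize(child))
    return survivors + children


def replace_weakest_model(population, predictions, new_predictions, initial_weight=0.1):
    """Assumed policy: swap out the model with the lowest mean weight (in place)."""
    n_models = len(predictions)
    mean_weight = [sum(w[j] for w in population) / len(population) for j in range(n_models)]
    weakest = mean_weight.index(min(mean_weight))
    predictions[weakest] = new_predictions
    for k, w in enumerate(population):
        updated = list(w)
        updated[weakest] = initial_weight
        population[k] = normalize(updated)
    return weakest


# Synthetic data, invented for the example: three base models of varying quality.
rng = random.Random(0)
targets = [float(i % 5) for i in range(50)]
predictions = [
    [t + rng.gauss(0, 0.2) for t in targets],  # accurate model
    [t + rng.gauss(0, 1.0) for t in targets],  # noisy model
    [t + 2.0 for t in targets],                # biased model
]

# Pool of solutions: each solution is one distribution of weights over the models.
population = [normalize([rng.random() for _ in predictions]) for _ in range(20)]
initial_cost = min(ensemble_cost(w, predictions, targets) for w in population)

for _ in range(30):
    population = evolve(population, predictions, targets, rng)

best = min(population, key=lambda w: ensemble_cost(w, predictions, targets))
tuned_cost = ensemble_cost(best, predictions, targets)

# A new model, assumed to be trained on more recent data, joins the ensemble.
new_model = [t + rng.gauss(0, 0.1) for t in targets]
replaced = replace_weakest_model(population, predictions, new_model)
```

Because the better half of each generation survives unchanged, the best cost on a fixed window cannot get worse from one generation to the next. In a real streaming setting the window of data would move as well, so costs would need to be re-evaluated on recent samples, which this sketch does not attempt.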
Pages: 27-40
Number of pages: 14