An ensemble method for concept drift in nonstationary environment

Cited by: 17
Authors
Mejri, Dhouha [1 ]
Khanchel, Riadh [2 ]
Limam, Mohamed [1 ]
Affiliations
[1] Univ Tunis, Larodec, ISG Tunis, Tunis, Tunisia
[2] Univ Carthage, FSEG Nabeul, Larodec, Tunis, Tunisia
Keywords
concept drift; ensemble method; data stream;
DOI
10.1080/00949655.2011.651797
Chinese Library Classification
TP39 [Computer applications];
Discipline codes
081203 ; 0835 ;
Abstract
Most statistical and data-mining algorithms assume that data come from a stationary distribution. However, in many real-world classification tasks, data arrive over time and the target concept to be learned from the data stream may change accordingly. Many algorithms have been proposed for learning drifting concepts. To deal with the problem of learning when the distribution generating the data changes over time, dynamic weighted majority was proposed as an ensemble method for concept drift. Unfortunately, this technique considers neither the age of the classifiers in the ensemble nor their past correct classifications. In this paper, we propose a method that takes into account each expert's age as well as its contribution to the global algorithm's accuracy. We evaluate the effectiveness of the proposed method using m classifiers trained on an n-fold partitioning of the data. Experimental results on a benchmark data set show that our method outperforms existing ones.
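The idea the abstract describes, scaling each expert's vote not only by a multiplicatively decayed weight (as in dynamic weighted majority) but also by its age and past correct classifications, can be illustrated with a minimal sketch. This is not the authors' exact algorithm; the `Expert` class, the `score` combination, and the demotion factor `beta` are illustrative assumptions.

```python
# Sketch of a dynamic-weighted-majority-style ensemble in which each
# expert's vote is scaled by its weight (decayed on mistakes) times its
# running accuracy over the instances it has seen so far.

class Expert:
    def __init__(self, predict_fn):
        self.predict = predict_fn   # maps an instance x to a label
        self.weight = 1.0           # multiplicative weight, decayed on errors
        self.correct = 0            # number of past correct classifications
        self.age = 0                # number of instances seen

    def score(self):
        # Combine the current weight with observed accuracy: an older,
        # historically accurate expert keeps more influence in the vote.
        acc = self.correct / self.age if self.age else 1.0
        return self.weight * acc

def predict_and_update(experts, x, y_true, beta=0.5):
    """Weighted-majority prediction on x, then per-expert bookkeeping."""
    votes = {}
    for e in experts:
        label = e.predict(x)
        votes[label] = votes.get(label, 0.0) + e.score()
    y_hat = max(votes, key=votes.get)  # label with the largest total score
    for e in experts:
        e.age += 1
        if e.predict(x) == y_true:
            e.correct += 1
        else:
            e.weight *= beta           # demote experts that erred
    return y_hat
```

For example, with two constant experts, one always predicting 1 and one always predicting 0, streaming instances whose true label is 1 quickly drives the second expert's score toward zero, so the ensemble settles on label 1.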
Pages: 1115-1128 (14 pages)