On the window size for classification in changing environments

Cited by: 34
Authors
Kuncheva, Ludmila I. [1]
Zliobaite, Indre [2]
Affiliations
[1] Bangor Univ, Sch Comp Sci, Bangor LL57 1UT, Gwynedd, Wales
[2] Vilnius State Univ, Fac Math & Informat, Vilnius, Lithuania
Keywords
Concept drift; streaming data; training sample size; moving window size; SAMPLE-SIZE; DRIFT
DOI
10.3233/IDA-2009-0397
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Classification in changing environments (commonly known as concept drift) requires adaptation of the classifier to accommodate the changes. One approach is to keep a moving window on the streaming data and constantly update the classifier on it. Here we consider an abrupt change scenario where one set of probability distributions of the classes is instantly replaced with another. For a fixed 'transition period' around the change, we derive a generic relationship between the size of the moving window and the classification error rate. We derive expressions for the error in the transition period and for the optimal window size for the case of two Gaussian classes where the concept change is a geometrical displacement of the whole class configuration in the space. A simple window resize strategy based on the derived relationship is proposed and compared with fixed-size windows on a real benchmark data set (Electricity Market).
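The moving-window idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the one-dimensional features, the nearest-mean classifier, the names `MovingWindowNearestMean` and `transition_error`, the parameter values, and the choice to measure error only on the points immediately after the change. It simulates two Gaussian classes whose means are both displaced by `shift` at one instant, and reports the error rate of a fixed-size window over the points that follow.

```python
import random
from collections import deque


class MovingWindowNearestMean:
    """Nearest-mean classifier fitted on the last `window_size` labelled points.

    Hypothetical helper for illustration; not the classifier used in the paper.
    """

    def __init__(self, window_size):
        # A bounded deque discards the oldest point once the window is full.
        self.window = deque(maxlen=window_size)

    def update(self, x, label):
        self.window.append((x, label))

    def class_means(self):
        sums, counts = {}, {}
        for x, label in self.window:
            sums[label] = sums.get(label, 0.0) + x
            counts[label] = counts.get(label, 0) + 1
        return {label: sums[label] / counts[label] for label in sums}

    def predict(self, x):
        means = self.class_means()
        if not means:
            return 0  # arbitrary default before any data has been seen
        return min(means, key=lambda label: abs(x - means[label]))


def transition_error(window_size, shift=4.0, n_before=200, n_transition=50, seed=0):
    """Error rate over the n_transition points that follow an abrupt change.

    Assumed setup: class 0 ~ N(offset, 1), class 1 ~ N(offset + 2, 1),
    where offset jumps from 0 to `shift` at time n_before.
    """
    rng = random.Random(seed)
    clf = MovingWindowNearestMean(window_size)
    errors = 0
    for t in range(n_before + n_transition):
        offset = shift if t >= n_before else 0.0
        label = rng.randint(0, 1)
        x = rng.gauss(offset + 2.0 * label, 1.0)
        if t >= n_before:
            errors += int(clf.predict(x) != label)  # predict first, then learn
        clf.update(x, label)
    return errors / n_transition
```

Calling `transition_error(w)` for several values of `w` (for example 5, 20 and 100) gives a rough empirical view of how window size affects error around the change. The result depends on the assumed shift, class separation and random seed.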
Pages: 861-872
Page count: 12