Stream Classification with Recurring and Novel Class Detection using Class-Based Ensemble

Cited by: 50
Authors
Al-Khateeb, Tahseen [1]
Masud, Mohammad M. [2]
Khan, Latifur [1]
Aggarwal, Charu [3]
Han, Jiawei [4]
Thuraisingham, Bhavani [1]
Affiliations
[1] Univ Texas Dallas, Dept Comp Sc, Dallas, TX 75230 USA
[2] United Arab Emirates Univ, Coll Informat Technol, Al Ain, U Arab Emirates
[3] IBM Corp, TJ Watson Res Ctr, Yorktown Hts, NY USA
[4] Univ Illinois, Dept Comp Sci, Urbana, IL USA
Source
12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2012) | 2012
Keywords
stream classification; novel class; recurring class
DOI
10.1109/ICDM.2012.125
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Concept-evolution has recently received a lot of attention in the context of mining data streams. Concept-evolution occurs when a new class evolves in the stream. Although many recent studies address this issue, most of them do not consider the scenario of recurring classes in the stream. A class is called recurring if it appears in the stream, disappears for a while, and then reappears. Existing data stream classification techniques either misclassify the recurring class instances as another class, or falsely identify the recurring classes as novel. This increases the prediction error of the classifiers, and in some cases wastes memory and computational resources. In this paper we address the recurring class issue by proposing a novel "class-based" ensemble technique, which replaces the traditional "chunk-based" ensemble approaches and correctly distinguishes between a recurring class and a novel one. We analytically and experimentally confirm the superiority of our method over state-of-the-art techniques.
Pages: 31-40
Number of pages: 10