Mining frequent closed trees in evolving data streams

被引:9
作者
Bifet, Albert [1 ,2 ]
Gavalda, Ricard [2 ]
机构
[1] Univ Waikato, Dept Comp Sci, Barcelona, Spain
[2] Univ Politecn Cataluna, Dept LSI, LARCA Res Grp, Barcelona, Spain
关键词
ITEMSETS;
D O I
10.3233/IDA-2010-0454
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We propose new algorithms for adaptively mining closed rooted trees, both labeled and unlabeled, from data streams that change over time. Closed patterns are powerful representatives of frequent patterns, since they eliminate redundant information. Our approach is based on an advantageous representation of trees and a low-complexity notion of relaxed closed trees, as well as ideas from Galois Lattice Theory. More precisely, we present three closed tree mining algorithms in sequence: an incremental one, INCTREEMINER, a sliding-window based one, WINTREEMINER, and finally one that mines closed trees adaptively from data streams, ADATREEMINER. By adaptive we mean here that it presents at all times the closed trees that are frequent in the current state of the data stream. To the best of our knowledge this is the first work on mining closed frequent trees in streaming data varying with time. We give a first experimental evaluation of the proposed algorithms.
引用
收藏
页码:29 / 48
页数:20
相关论文
共 32 条
  • [1] [Anonymous], P 2004 IEEE INT C DA
  • [2] [Anonymous], 2002, ALGORITHMS TREES GRA
  • [3] [Anonymous], 2007, P 7 SIAM INT C DAT M
  • [4] [Anonymous], 2005, ART COMPUTER PROGRAM
  • [5] Arimura H, 2005, LECT NOTES ARTIF INT, V3625, P1
  • [6] Asai T, 2003, LECT NOTES ARTIF INT, V2843, P47
  • [7] Online algorithms for mining semi-structured data stream
    Asai, T
    Arimura, H
    Abe, K
    Kawasoe, S
    Arikawa, S
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2002, : 27 - 34
  • [8] BALCAZAR JL, 2009, ACCEPTED PUBLICATION
  • [9] BIFET A, 2008, 14 ACM SIGKDD INT C
  • [10] BIFET A, 2009, ECML PKDD 09