Mining sequential patterns with periodic wildcard gaps

被引:0
作者
Youxi Wu
Lingling Wang
Jiadong Ren
Wei Ding
Xindong Wu
机构
[1] Hebei University of Technology,School of Computer Science and Engineering
[2] Yanshan University,School of Information Science and Engineering
[3] University of Massachusetts Boston,Department of Computer Science
[4] Hefei University of Technology,School of Computer Science and Information Engineering
[5] University of Vermont,Department of Computer Science
来源
Applied Intelligence | 2014年 / 41卷
关键词
Sequential pattern mining; Periodic wildcard gap; Pattern matching; Heuristic algorithm; Nettree;
D O I
暂无
中图分类号
学科分类号
摘要
Mining frequent patterns with periodic wildcard gaps is a critical data mining problem to deal with complex real-world problems. This problem can be described as follows: given a subject sequence, a pre-specified threshold, and a variable gap-length with wildcards between each two consecutive letters. The task is to gain all frequent patterns with periodic wildcard gaps. State-of-the-art mining algorithms which use matrices or other linear data structures to solve the problem not only consume a large amount of memory but also run slowly. In this study, we use an Incomplete Nettree structure (the last layer of a Nettree which is an extension of a tree) of a sub-pattern P to efficiently create Incomplete Nettrees of all its super-patterns with prefix pattern P and compute the numbers of their supports in a one-way scan. We propose two new algorithms, MAPB (Mining sequentiAl Pattern using incomplete Nettree with Breadth first search) and MAPD (Mining sequentiAl Pattern using incomplete Nettree with Depth first search), to solve the problem effectively with low memory requirements. Furthermore, we design a heuristic algorithm MAPBOK (MAPB for tOp-K) based on MAPB to deal with the Top-K frequent patterns for each length. Experimental results on real-world biological data demonstrate the superiority of the proposed algorithms in running time and space consumption and also show that the pattern matching approach can be employed to mine special frequent patterns effectively.
引用
收藏
页码:99 / 116
页数:17
相关论文
共 67 条
[1]  
Kang U(2011)Hadi: mining radii of large graphs ACM Trans Knowl Discov Data 5 315-344
[2]  
Tsourakakis CE(2012)Mining travel patterns from geotagged photos ACM Trans Intell Syst Technol 3 5605-5612
[3]  
Appel AP(2013)Stream mining on univariate uncertain data Appl Intell 39 1068-1086
[4]  
Faloutsos C(2013)Sequential pattern mining—approaches and algorithms ACM Comput Surv 45 677-694
[5]  
Leskovec J(2011)MoveMine: mining moving object data for discovery of animal movement patterns ACM Trans Intell Syst Technol 2 418-435
[6]  
Zheng YT(2009)Data mining-based intrusion detectors Expert Syst Appl 36 807-818
[7]  
Zha ZJ(2012)Mining the change of customer behavior in fuzzy time-interval sequential patterns Appl Soft Comput 12 267-278
[8]  
Chua TS(2009)Mining sequential patterns in the B2B environment J Inf Sci 35 32-40
[9]  
Liu YH(2013)Mining interesting user behavior patterns in mobile commerce environments Appl Intell 38 727-738
[10]  
Mooney CH(2011)Mining Top-K large structural patterns in a massive network Proc VLDB Endow 4 259-286