An incremental algorithm for mining frequent closed patterns

Authors
Shi, Huai-Dong [1]
Cai, Ming [1]
Wu, Hong-Sen [2]
Dong, Jin-Xiang [1]
Fu, Hao [1]
Affiliations
[1] College of Computer Science and Technology, Zhejiang University
[2] Zhejiang Police College
Source
Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science) | 2009, Vol. 43, No. 8
Keywords
Closed itemset mining; Data mining; Frequent closed pattern; Knowledge discovery
DOI
10.3785/j.issn.1008-973X.2009.08.007
Abstract
In many applications of frequent closed pattern (FCP) mining, the input sequence database grows dynamically. By analyzing the BIDE algorithm, a theorem on backward-extension event (BEE) detection was proposed and proved: the BEE set of any prefix item is non-increasing as the prefix is extended. Based on this theorem, the accumulation of the BEE set was optimized, improving performance by 4.8% on average. The FCP tree was defined to represent the final result of FCP mining, and three of its characteristics were demonstrated. It was also proved that when a frequent item and its prefix do not co-occur in a newly added input sequence, the results of consecutive FCP mining runs are identical. On this basis, the BideInc algorithm was proposed to mine FCPs incrementally. Experiments validated the algorithm and showed an average performance improvement of 47%.
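To make the terms in the abstract concrete, the following is a minimal brute-force sketch of what a frequent closed sequential pattern is. It is not BIDE or BideInc, and it does not come from the paper. It assumes single-character events, an absolute support threshold, and a toy four-sequence database; the function names are hypothetical. The last lines illustrate the simple observation behind incremental mining: a pattern that does not occur in a newly added sequence keeps its support and its closed status.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if the events of pattern occur in sequence in order, gaps allowed."""
    remaining = iter(sequence)
    return all(event in remaining for event in pattern)


def support(pattern, database):
    """Number of sequences in database that contain pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_patterns(database, min_support):
    """Brute-force enumeration of frequent subsequences (toy sizes only)."""
    candidates = set()
    for seq in database:
        for length in range(1, len(seq) + 1):
            for positions in combinations(range(len(seq)), length):
                candidates.add("".join(seq[i] for i in positions))
    supports = {p: support(p, database) for p in candidates}
    return {p: s for p, s in supports.items() if s >= min_support}


def closed_patterns(database, min_support):
    """Frequent patterns with no longer super-pattern of equal support."""
    frequent = frequent_patterns(database, min_support)
    return {
        p: s
        for p, s in frequent.items()
        if not any(
            len(q) > len(p) and t == s and is_subsequence(p, q)
            for q, t in frequent.items()
        )
    }


# Toy database (an assumption for illustration), absolute min_support = 2.
database = ["CAABC", "ABCB", "CABC", "ABBCA"]
before = closed_patterns(database, min_support=2)
# before == {"AA": 2, "ABB": 2, "ABC": 4, "CA": 3, "CABC": 2, "CB": 3}

# Append one new sequence and mine again from scratch.
new_sequence = "BB"
after = closed_patterns(database + [new_sequence], min_support=2)

# Closed patterns that do not occur in the new sequence are unchanged.
unaffected = {p: s for p, s in before.items() if not is_subsequence(p, new_sequence)}
```

Here every pattern in `unaffected` appears in `after` with the same support, while re-mining adds only patterns that occur in the new sequence. An incremental method would aim to skip the unaffected part of the search instead of recomputing it, which this sketch does not attempt.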
Pages: 1389-1395
Page count: 6