Fast unified mining of hyperclique patterns and maximal hyperclique patterns

Authors
Xiao B. [1]
Zhang L. [1]
Xu Q.-F. [1]
Lin Z.-Q. [1]
Guo J. [1]
Affiliation
[1] School of Information and Communication, Beijing University of Posts and Telecommunications
Source
Ruan Jian Xue Bao/Journal of Software | 2010, Vol. 21, No. 4
Keywords
Association rule; Data mining; FP-tree (frequent pattern tree); Hyperclique pattern; Maximal hyperclique pattern
DOI
10.3724/SP.J.1001.2010.03595
Abstract
The hyperclique pattern is a type of association pattern whose items are highly affiliated with one another: the presence of one item in a transaction strongly implies the presence of every other item in the same hyperclique pattern. The maximal hyperclique pattern is a more compact representation of a group of hyperclique patterns, which is desirable for many applications. The standard algorithms for mining these two kinds of patterns differ. This paper presents a fast algorithm, hybrid hyperclique pattern growth (HHCP-growth), based on the FP-tree (frequent pattern tree), which unifies the mining processes for the two kinds of patterns. The algorithm adopts a recursive mining method and exploits several efficient pruning strategies. Propositions are presented and proved to establish the effectiveness of the strategies and the validity of the algorithm. The experimental results show that HHCP-growth is more effective than the standard hyperclique pattern and maximal hyperclique pattern mining algorithms, particularly for large-scale datasets or at low levels of support. © by Institute of Software, the Chinese Academy of Sciences. All rights reserved.
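To make the two pattern types concrete, the following is a minimal brute-force sketch in Python. It is not HHCP-growth and uses neither an FP-tree nor any of the paper's pruning strategies. It assumes the usual h-confidence definition from the hyperclique literature (support of the itemset divided by the largest support of any single item in it), which the abstract itself does not spell out. All function names and parameters (`support`, `h_confidence`, `hyperclique_patterns`, `maximal_only`, `min_support`, `min_hconf`) are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def h_confidence(itemset, transactions):
    """Assumed definition: supp(P) / max over items i in P of supp({i})."""
    joint = support(itemset, transactions)
    max_single = max(support([i], transactions) for i in itemset)
    return joint / max_single if max_single > 0 else 0.0


def hyperclique_patterns(transactions, min_support, min_hconf):
    """Enumerate all itemsets of size >= 2 that meet both thresholds.

    Exhaustive enumeration, exponential in the number of items;
    suitable only for toy inputs.
    """
    items = sorted(set().union(*transactions))
    found = []
    for size in range(2, len(items) + 1):
        for cand in combinations(items, size):
            if (support(cand, transactions) >= min_support
                    and h_confidence(cand, transactions) >= min_hconf):
                found.append(frozenset(cand))
    return found


def maximal_only(patterns):
    """Keep patterns that have no proper superset among the patterns."""
    return [p for p in patterns if not any(p < q for q in patterns)]
```

On the toy transactions `[{"a","b","c"}, {"a","b","c"}, {"a","b"}, {"d"}, {"c","d"}]` with `min_support=0.2` and `min_hconf=0.6`, the sketch finds four hyperclique patterns ({a,b}, {a,c}, {b,c}, {a,b,c}), and `maximal_only` reduces them to the single pattern {a,b,c}, which illustrates why the maximal form is the more compact representation.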
Pages: 659-671
Page count: 12