Fast mining of global maximum frequent itemsets

Cited by: 4
Authors
Lu, Jie-Ping [1]
Yang, Ming [1]
Sun, Zhi-Hui [1]
Ju, Shi-Guang [2]
Affiliations
[1] Dept. of Comp. Sci. and Eng., Southeast Univ.
[2] Dept. of Comp. Sci. and Commun. Eng., Jiangsu Univ.
Source
Ruan Jian Xue Bao/Journal of Software | 2005 / Vol. 16 / No. 4
Keywords
Data mining; Distributed database; Frequent pattern tree; Global maximum frequent itemset
DOI
10.1360/jos160553
Abstract
Mining maximum frequent itemsets is a key problem in data mining with numerous important applications. Existing algorithms for mining maximum frequent itemsets operate on local databases, and very little work has been done for distributed databases. Applying the existing maximum frequent itemset algorithms, or the algorithms proposed for global frequent itemsets, to distributed databases generates a large number of candidate itemsets and incurs a large communication overhead. This paper therefore proposes an algorithm for fast mining of global maximum frequent itemsets (FMGMFI). Using frequent pattern trees (FP-trees), FMGMFI can conveniently obtain the global frequency of any itemset from the corresponding paths of every local FP-tree, and its search strategy, which combines bottom-up and top-down search, requires far less communication overhead. Experimental results show that FMGMFI is effective and efficient.
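To make the abstract's terms concrete, the following minimal Python sketch shows what a global frequency and a global maximum frequent itemset are. It is not FMGMFI: it uses no FP-trees and no bottom-up/top-down search, and simply enumerates every candidate itemset by brute force. The function names, the toy sites, and the absolute threshold `min_count` are all assumptions made for illustration.

```python
from itertools import combinations


def local_support(transactions, itemset):
    """Count the transactions at one site that contain every item of itemset."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= set(t))


def global_support(sites, itemset):
    """Global frequency: the sum of the local counts over all sites."""
    return sum(local_support(db, itemset) for db in sites)


def global_maximal_frequent_itemsets(sites, min_count):
    """Brute-force illustration (not FMGMFI): enumerate every itemset,
    keep the globally frequent ones, then keep only those with no
    frequent proper superset."""
    items = sorted({i for db in sites for t in db for i in t})
    frequent = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
        if global_support(sites, c) >= min_count
    ]
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy data: two sites, three transactions each.
site_a = [["a", "b", "c"], ["a", "b"], ["b", "c"]]
site_b = [["a", "b", "c"], ["a", "c"], ["b"]]
sites = [site_a, site_b]

print(global_support(sites, ["a", "b"]))            # local counts 2 + 1
print(global_maximal_frequent_itemsets(sites, 3))
```

The brute-force enumeration is exponential in the number of items, which is exactly the candidate-generation cost that the abstract says FMGMFI is designed to avoid.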
Pages: 553-560
Page count: 7