Mining optimized association rules with categorical and numeric attributes

Cited by: 49
Authors
Rastogi, R
Shim, K
Affiliations
[1] Bell Labs, Murray Hill, NJ 07974 USA
[2] Korea Adv Inst Sci & Technol, Yusong Gu, Taejon 305701, South Korea
[3] Adv Informat Technol Res Ctr, Yusong Gu, Taejon 305701, South Korea
Keywords
data mining; knowledge discovery; optimized association rules; algorithm
DOI
10.1109/69.979971
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Mining association rules on large data sets has received considerable attention in recent years. Association rules are useful for determining correlations between attributes of a relation and have applications in marketing, financial, and retail sectors. Furthermore, optimized association rules are an effective way to focus on the most interesting characteristics involving certain attributes. Optimized association rules are permitted to contain uninstantiated attributes and the problem is to determine instantiations such that either the support or confidence of the rule is maximized. In this paper, we generalize the optimized association rules problem in three ways: 1) association rules are allowed to contain disjunctions over uninstantiated attributes, 2) association rules are permitted to contain an arbitrary number of uninstantiated attributes, and 3) uninstantiated attributes can be either categorical or numeric. Our generalized association rules enable us to extract more useful information about seasonal and local patterns involving multiple attributes. We present effective techniques for pruning the search space when computing optimized association rules for both categorical and numeric attributes. Finally, we report the results of our experiments that indicate that our pruning algorithms are efficient for a large number of uninstantiated attributes, disjunctions, and values in the domain of the attributes.
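To make the notion of an uninstantiated attribute concrete, the following is a minimal sketch of the simplest case only: one numeric attribute whose interval is chosen to maximize support subject to a minimum confidence. It is a brute-force enumeration, not the pruning algorithms the abstract refers to, and it does not cover disjunctions or multiple attributes. The function names, the `(value, hit)` row layout, and the choice to measure support as the fraction of rows matching the antecedent are all assumptions made for illustration.

```python
from itertools import combinations_with_replacement


def interval_stats(rows, lo, hi):
    """Return (support, confidence) of the rule  lo <= value <= hi  ->  target.

    rows is a list of (value, hit) pairs, where hit is True when the row
    satisfies the rule's consequent. Support is assumed here to be the
    fraction of rows matching the antecedent.
    """
    in_range = [hit for value, hit in rows if lo <= value <= hi]
    if not in_range:
        return 0.0, 0.0
    support = len(in_range) / len(rows)
    confidence = sum(in_range) / len(in_range)
    return support, confidence


def optimized_support(rows, min_conf):
    """Brute-force search for the interval with maximum support whose
    confidence is at least min_conf. Returns (support, confidence, lo, hi),
    or None when no interval qualifies."""
    values = sorted({value for value, _ in rows})
    best = None
    # Enumerate every candidate interval [lo, hi] with lo <= hi.
    for lo, hi in combinations_with_replacement(values, 2):
        support, confidence = interval_stats(rows, lo, hi)
        if confidence >= min_conf and (best is None or support > best[0]):
            best = (support, confidence, lo, hi)
    return best


# Hypothetical data: (attribute value, whether the consequent holds).
rows = [(1, False), (2, True), (3, True), (4, False), (5, True), (6, False)]
best = optimized_support(rows, min_conf=0.7)
print(best)
```

On this toy data the search selects the interval [2, 5], which covers four of the six rows at confidence 0.75. Enumerating all intervals is quadratic in the number of distinct values, which is the kind of cost that motivates pruning the search space.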
Pages: 29-50
Page count: 22