An algorithm to mine general association rules from tabular data

Cited by: 21
Authors
Ayubi, Siyamand [2]
Muyeba, Maybin K. [1]
Baraani, Ahmad [2]
Keane, John [3]
Affiliations
[1] Manchester Metropolitan Univ, Dept Comp & Math, Manchester M15 6BH, Lancs, England
[2] Univ Isfahan, Fac Engn, Esfahan, Iran
[3] Univ Manchester, Sch Comp Sci, Manchester M13 9PL, Lancs, England
Keywords
Data mining; General association rules; Tabular data; Equality operators; Signature; Patterns
DOI
10.1016/j.ins.2009.06.021
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Most methods for mining association rules from tabular data mine simple rules whose items use only the equality operator "=". For quantitative attributes, these approaches tend to discretize domain values by partitioning them into intervals. Restricting items to "=" means that many interesting frequent patterns may go unidentified. Where the values of an attribute are ordered, operators such as greater than or less than a given value are as important as the equality operator. This motivates us to extend association rules from the simple equality operator to a more general set of operators. We address the problem of mining general association rules in tabular data, where rules can use any of the operators {<=, >, not equal, =} in their antecedent part. The proposed algorithm, mining general rules (MGR), is applicable to datasets with discrete-ordered attributes and to discretized quantitative attributes. It stores candidate general itemsets in a tree structure in such a way that the supports of complex itemsets can be computed recursively from the supports of simpler itemsets. The algorithm is shown to have benefits in terms of time complexity and memory management, and it has good potential for parallelization. (C) 2009 Elsevier Inc. All rights reserved.
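The idea that supports of items with other operators can be derived from supports of simpler items can be illustrated for a single ordered attribute. The sketch below is not the MGR algorithm or its tree structure, which the abstract does not specify; it is a minimal illustration, and the function name `general_item_supports` and the tuple keys are hypothetical. It assumes that the support of `A <= v` is the running sum of the "=" counts, and that `A > v` and `A != v` are obtained by subtracting from the row count.

```python
from collections import Counter
from itertools import accumulate


def general_item_supports(values):
    """Derive support counts of <=, >, != items from '=' counts.

    `values` is one column of a table whose domain is ordered.
    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(values)
    eq = Counter(values)            # support count of each "A = v" item
    domain = sorted(eq)             # ordered domain values
    # Running totals give the support count of "A <= v" for each v.
    cumulative = accumulate(eq[v] for v in domain)
    supports = {}
    for v, le_count in zip(domain, cumulative):
        supports[("=", v)] = eq[v]
        supports[("<=", v)] = le_count
        supports[(">", v)] = n - le_count   # complement of "<="
        supports[("!=", v)] = n - eq[v]     # complement of "="
    return supports


if __name__ == "__main__":
    column = [1, 2, 2, 3, 3, 3]
    for item, count in sorted(general_item_supports(column).items()):
        print(item, count)
```

Only one pass over the column is needed to count the "=" items; every other count follows from arithmetic on those counts, which is the kind of reuse the abstract describes for more complex itemsets.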
Pages: 3520-3539
Page count: 20