Parametric algorithms for mining share frequent itemsets

Cited by: 8
Authors
Barber, B [1 ]
Hamilton, HJ [1 ]
Affiliation
[1] Univ Regina, Dept Comp Sci, Regina, SK S4S 0A2, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
knowledge discovery; data mining; itemsets; association rule mining; share based measures;
DOI
10.1023/A:1011276003319
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification number
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Itemset share, the fraction of some numerical total contributed by items when they occur in itemsets, has been proposed as a measure of the importance of itemsets in association rule mining. The IAB and CAC algorithms are able to find share frequent itemsets that have infrequent subsets. These algorithms perform well, but they do not always find all possible share frequent itemsets. In this paper, we describe the incorporation of a threshold factor into these algorithms. The threshold factor can be used to increase the number of frequent itemsets found at a cost of an increase in the number of infrequent itemsets examined. The modified algorithms are tested on a large commercial database. Their behavior is examined using principles of classifier evaluation from machine learning.
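The share measure described in the abstract can be illustrated with a minimal Python sketch. It assumes one plausible reading of the definition: each transaction maps items to a numeric value (for example, quantity sold), and the share of an itemset is the value contributed by its items in the transactions that contain the whole itemset, divided by the total over all items and transactions. The function name `itemset_share`, the sample data, and the 0.5 minimum share are hypothetical, and the sketch does not reproduce the IAB or CAC algorithms or the threshold factor.

```python
from typing import Dict, Iterable, List

# Hypothetical representation: item name -> numeric measure (e.g. quantity sold).
Transaction = Dict[str, float]


def itemset_share(transactions: List[Transaction], itemset: Iterable[str]) -> float:
    """Fraction of the overall total contributed by the items of `itemset`
    in the transactions that contain every item of `itemset`."""
    items = set(itemset)
    total = sum(sum(t.values()) for t in transactions)
    if total == 0:
        return 0.0
    contributed = sum(
        sum(t[i] for i in items)
        for t in transactions
        if items.issubset(t)  # iterating a dict yields its keys
    )
    return contributed / total


# Illustrative data only; not taken from the paper.
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "eggs": 3},
    {"bread": 1, "milk": 2},
]

min_share = 0.5  # hypothetical minimum share threshold
share_bread = itemset_share(transactions, ["bread"])
share_bread_milk = itemset_share(transactions, ["bread", "milk"])
```

With this toy data, `{bread, milk}` meets the hypothetical minimum share while its subset `{bread}` does not, which is the situation the abstract refers to when it mentions share frequent itemsets that have infrequent subsets.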
Pages: 277-293
Page count: 17