Approximation to expected support of frequent itemsets in mining probabilistic sets of uncertain data

Cited by: 2
Authors
Cuzzocrea, Alfredo [1, 2]
Leung, Carson K. [3]
MacKinnon, Richard Kyle [3]
Affiliations
[1] Univ Trieste, Dept Engn & Architecture DIA, I-34127 Trieste, TS, Italy
[2] ICAR CNR, I-34127 Trieste, TS, Italy
[3] Univ Manitoba, Dept Comp Sci, Winnipeg, MB R3T 2N2, Canada
Source
KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS 19TH ANNUAL CONFERENCE, KES-2015 | 2015 / Vol. 60
Keywords
Knowledge discovery and data mining; expected support; frequent patterns; uncertain data; upper bounds
DOI
10.1016/j.procs.2015.08.195
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Knowledge discovery and data mining generally discovers implicit, previously unknown, and useful knowledge from data. As one of the popular knowledge discovery and data mining tasks, frequent itemset mining, in particular, discovers knowledge in the form of sets of frequently co-occurring items, events, or objects. On the one hand, in many real-life applications, users mine frequent patterns from traditional databases of precise data, in which users know with certainty whether items are present in transactions. On the other hand, in many other real-life applications, users mine frequent itemsets from probabilistic sets of uncertain data, in which users are uncertain about the likelihood of the presence of items in transactions. Each item in these probabilistic sets of uncertain data is often associated with an existential probability expressing the likelihood of its presence in that transaction. To mine frequent itemsets from these probabilistic datasets, many existing algorithms capture lots of information to compute expected support. To reduce the amount of space required, algorithms capture some but not all information in computing or approximating expected support. The tradeoff is that the upper bounds to expected support may not be tight. In this paper, we examine several upper bounds and recommend to the user which ones consume less space while providing good approximation to expected support of frequent itemsets in mining probabilistic sets of uncertain data. © 2015 The Authors. Published by Elsevier B.V.
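The abstract does not state any formulas, so the following is only a minimal sketch of the quantities it talks about, under stated assumptions. It uses the definition of expected support that is common in the uncertain-data mining literature (the sum, over transactions, of the product of the existential probabilities of the items in the itemset, assuming items are independent) and one simple upper bound that follows directly from that definition. The toy dataset and the names `transactions`, `expected_support`, and `singleton_upper_bound` are hypothetical, and the bound shown is not claimed to be one of the bounds examined in the paper.

```python
from math import prod

# Hypothetical toy dataset (an assumption, not from the paper):
# each transaction maps an item to its existential probability.
transactions = [
    {"a": 0.9, "b": 0.5, "c": 0.2},
    {"a": 0.6, "b": 0.8},
    {"b": 0.7, "c": 0.4},
]


def expected_support(itemset, db):
    """Sum over transactions of the product of the item probabilities.

    Assumes items occur independently within a transaction; an item
    that is absent from a transaction contributes probability 0.
    """
    return sum(prod(t.get(item, 0.0) for item in itemset) for t in db)


def singleton_upper_bound(itemset, db):
    """Illustrative bound: expSup(X) <= min over x in X of expSup({x}).

    It only needs one number per item, so it is cheap in space,
    but it can be far from the true expected support.
    """
    return min(expected_support({item}, db) for item in itemset)


exact = expected_support({"a", "b"}, transactions)
bound = singleton_upper_bound({"a", "b"}, transactions)
print(f"expected support = {exact:.2f}, upper bound = {bound:.2f}")
```

On this toy data the bound (1.50) is noticeably larger than the exact value (0.93), which illustrates the tradeoff the abstract describes: storing less information saves space but can loosen the bound.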
Pages: 613-622
Page count: 10