Fast algorithm for high utility pattern mining with the sum of item quantities

Cited by: 36

Authors:

Ryang, Heungmo [1]

Yun, Unil [1]

Ryu, Keun Ho [2]

Affiliations:

[1] Sejong Univ, Dept Comp Engn, Seoul, South Korea

[2] Chungbuk Natl Univ, Dept Comp Sci, Cheongju, South Korea

Source:

INTELLIGENT DATA ANALYSIS | 2016 / Vol. 20 / Issue 2

Funding:

National Research Foundation of Singapore;

Keywords:

Data mining; high utility patterns; single-pass tree construction; tree restructuring; utility mining; FREQUENT ITEMSETS;

DOI:

10.3233/IDA-160811

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In frequent pattern mining, items are considered as having the same importance in a database and their occurrences are represented as binary values in transactions. In real-world databases, however, items not only have relative importance but also are represented as non-binary values in transactions. High utility pattern mining, one of the most essential issues in the pattern mining field, recently emerged to address this limitation of frequent pattern mining. Meanwhile, tree construction with a single database scan is significant since a database scan is a time-consuming task. In utility mining, an additional database scan is necessary to identify actual high utility patterns from candidates. In this paper, we propose a novel tree structure, namely SIQ-Tree (Sum of Item Quantities), which captures database information in a single pass. Moreover, a restructuring method is suggested with strategies for reducing overestimated utilities. The proposed algorithm can construct the SIQ-Tree with only a single scan and decrease the number of candidate patterns effectively with the reduced overestimated utilities, through which mining performance is improved. Experimental results show that our algorithm outperforms a state-of-the-art one in terms of runtime and the number of generated candidates with similar memory usage.

Citation

Pages: 395-415

Page count: 21

37 entries in total

[1] Interactive mining of high utility patterns over data streams [J].