Top-k high average-utility itemsets mining with effective pruning strategies

Cited by: 0
Authors
Ronghui Wu
Zhan He
Affiliation
[1] Hunan University, College of Computer Science and Electronic Engineering
Source
Applied Intelligence | 2018 / Vol. 48
Keywords
High average-utility itemsets mining; Top-k mining; List struct; Data mining;
DOI
Not available
Abstract
High average-utility itemset (HAUI) mining has recently received interest in the data mining field due to its balanced utility measurement, which considers not only profits and quantities of items but also the lengths of itemsets. Although several algorithms have been designed for the task of HAUI mining in recent years, it is hard for users to determine an appropriate minimum average-utility threshold for the algorithms to work efficiently and control the mining result precisely. In this paper, we address this issue by introducing a framework of top-k HAUI mining, where k is the desired number of high average-utility itemsets to be mined instead of setting a minimum average-utility threshold. An efficient list based algorithm named TKAU is proposed to mine the top-k high average-utility itemsets in a single phase. TKAU introduces two novel strategies, named EMUP and EA, to avoid performing costly join operations for calculating the utilities of itemsets. Moreover, three strategies named RIU, CAD, and EPBF are also incorporated to raise its internal minimal average-utility threshold effectively, and thus reduce the search space. Extensive experiments on both real and synthetic datasets show that the proposed algorithm has excellent performance and scalability.
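The top-k formulation in the abstract can be illustrated with a minimal Python sketch. This is not TKAU: it is a brute-force enumeration with none of the list structures or the EMUP, EA, RIU, CAD, and EPBF strategies, whose details the abstract does not give. It assumes the standard definition of average utility (the summed quantity × unit profit of an itemset's items over all transactions that contain the itemset, divided by the itemset's length). The function names, the dictionary-based transaction layout, and the toy data are hypothetical, chosen only for illustration.

```python
import heapq
from itertools import combinations


def average_utility(itemset, transactions, profit):
    """Average utility of `itemset`: total utility over the transactions
    that contain every item of the itemset, divided by its length."""
    total = 0
    for t in transactions:  # t maps item -> purchased quantity
        if all(i in t for i in itemset):
            total += sum(t[i] * profit[i] for i in itemset)
    return total / len(itemset)


def top_k_haui_bruteforce(transactions, profit, k):
    """Naive top-k search (illustrative only, exponential in the item count).

    A min-heap holds the best k itemsets seen so far. Once it is full,
    its smallest average utility acts as the internal minimum threshold,
    which rises as better itemsets are found.
    """
    items = sorted({i for t in transactions for i in t})
    heap = []      # min-heap of (average utility, itemset)
    min_au = 0.0   # internal threshold; starts at 0, so zero-utility itemsets are skipped
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            au = average_utility(itemset, transactions, profit)
            if au <= min_au:
                continue  # cannot enter the current top-k
            if len(heap) < k:
                heapq.heappush(heap, (au, itemset))
            else:
                heapq.heapreplace(heap, (au, itemset))
            if len(heap) == k:
                min_au = heap[0][0]  # raise the threshold to the k-th best value
    return sorted(heap, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical toy database: item -> quantity per transaction, plus unit profits.
transactions = [{"a": 1, "b": 2}, {"a": 2, "c": 1}, {"b": 1, "c": 3}]
profit = {"a": 5, "b": 2, "c": 1}
result = top_k_haui_bruteforce(transactions, profit, k=3)
```

The user supplies only k. The threshold that a conventional HAUI miner would require as input is derived during the search from the k-th best itemset found so far, which is the idea the abstract describes as raising the internal minimal average-utility threshold.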
Pages: 3429-3445
Page count: 16