An Efficient Algorithm for Incremental and Interactive High Utility Itemset Mining

被引:0
作者
Guo, Shiming [1 ]
Gao, Hong [1 ]
机构
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Heilongjiang, Peoples R China
来源
2017 2ND INTERNATIONAL CONFERENCE ON IMAGE, VISION AND COMPUTING (ICIVC 2017) | 2017年
基金
中国国家自然科学基金;
关键词
incremental mining; high utility itemset mining; pattern growth; candidate generation; interactive mining;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
High utility itemset mining (HUIM) is an important data-mining task. Most of existing algorithms for HUIM do not consider transaction addition and deletion. When a database is updated, they need to scan the whole database to rebuild their data structures. To deal with this problem, an efficient tree structure IHUP-Tree is proposed. IHUP-Tree can be adjusted efficiently when a transaction is added into or deleted from a database. Incremental HUIM can be performed efficiently based on IHUP-Tree. IHUP-Tree can also be applied in interactive HUIM. The algorithm based on IHUP-Tree discovers high utility itemsets (HUIs) in two phases. In phase I, an over-estimated technique is adopted to set an upper bound for the utility of an itemset in the database. The itemsets whose over-estimated utilities are no less than a user-specified minimum utility threshold are selected as candidates. In phase II, the candidates are verified by scanning the database one more time. However the algorithm based on IHUP-Tree generates too many candidates, and it is time-consuming to verify them. Thus in this paper we proposed a novel tree structure IHUh-Tree and an efficient algorithm IHUI-Miner for incremental and interactive HUIM. Different from the algorithm based on IHUP-Tree, IHUI-Miner does not generate any candidate. Extensive performance analyses show our proposed tree structure is efficient, and our algorithm is at least one order of magnitude faster than the state-of-the-art algorithm in increment and interactive HUIM.
引用
收藏
页码:996 / 1001
页数:6
相关论文
共 6 条
[1]  
Ahmed C.F., 2009, TKDE, V21, P55
[2]  
Liu MC., 2012, ACM INT C P SERIES, P55, DOI DOI 10.1145/2396761.2396773
[3]  
Liu Y, 2005, LECT NOTES ARTIF INT, V3518, P689
[4]  
Tseng V. S., 2010, ACM SIGKDD INT C KNO, P253, DOI DOI 10.1145/1835804.1835839
[5]  
Yao H, 2004, SIAM PROC S, P482
[6]   Mining top-k high utility patterns over data streams [J].
Zihayat, Morteza ;
An, Aijun .
INFORMATION SCIENCES, 2014, 285 :138-161