BAHUI: Fast and Memory Efficient Mining of High Utility Itemsets Based on Bitmap

被引:51
作者
Song, Wei [1 ]
Liu, Yu [1 ]
Li, Jinhong [1 ]
机构
[1] North China Univ Technol, Coll Informat Engn, Beijing, Peoples R China
基金
中国国家自然科学基金;
关键词
Bitmap; Data Mining; High Utility Itemsets; Maximal Itemsets; EXCEPTION RULES; PATTERN; ALGORITHM;
D O I
10.4018/ijdwm.2014010101
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Mining high utility itemsets is one of the most important research issues in data mining owing to its ability to consider nonbinary frequency values of items in transactions and different profit values for each item. Although a number of relevant approaches have been proposed in recent years, they incur the problem of producing a large number of candidate itemsets for high utility itemsets. In this paper, the authors propose an efficient algorithm, namely BAHUI (Bitmap-based Algorithm for High Utility Itemsets), for mining high utility itemsets with bitmap database representation. In BAHUI, bitmap is used vertically and horizontally. On the one hand, BAHUI exploits a divide-and-conquer approach to visit itemset lattice by using bitmap vertically. On the other hand, BAHUI horizontally uses bitmap to calculate the real utilities of candidates. Using bitmap compression scheme, BAHUI reduces the memory usage and makes use of the efficient bitwise operation. Furthermore, BAHUI only records candidate high utility itemsets with maximal length, and inherits the pruning and searching strategies from maximal itemset mining problem. Extensive experimental results show that the BAHUI algorithm is both efficient and scalable.
引用
收藏
页码:1 / 15
页数:15
相关论文
共 28 条
[1]  
Agrawal R., P 20 INT C VERY LARG
[2]   HUC-Prune: an efficient candidate pruning technique to mine high utility patterns [J].
Ahmed, Chowdhury Farhan ;
Tanbeer, Syed Khairuzzaman ;
Jeong, Byeong-Soo ;
Lee, Young-Koo .
APPLIED INTELLIGENCE, 2011, 34 (02) :181-198
[3]   Efficient Tree Structures for High Utility Pattern Mining in Incremental Databases [J].
Ahmed, Chowdhury Farhan ;
Tanbeer, Syed Khairuzzaman ;
Jeong, Byeong-Soo ;
Lee, Young-Koo .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2009, 21 (12) :1708-1721
[4]   Incremental Algorithm for Discovering Frequent Subsequences in Multiple Data Streams [J].
Al-Mulla, Reem ;
Al Aghbari, Zaher .
INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 2011, 7 (04) :1-20
[5]  
Ashrafi M.Z, 2007, Int. J. Bus. Intell. Data Min., V2, P29
[6]   Extracting share frequent itemsets with infrequent subsets [J].
Barber, B ;
Hamilton, HJ .
DATA MINING AND KNOWLEDGE DISCOVERY, 2003, 7 (02) :153-185
[7]   A lattice-based approach for mining most generalization association rules [J].
Bay Vo ;
Hong, Tzung-Pei ;
Le, Bac .
KNOWLEDGE-BASED SYSTEMS, 2013, 45 :20-30
[8]  
Bayardo R. J. Jr., 1998, SIGMOD Record, V27, P85, DOI 10.1145/276305.276313
[9]  
Chan R, 2003, THIRD IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, P19
[10]  
Cho M., 2005, INT J DATA WAREHOUS, V1, P56, DOI [10.4018/jdwm.2005100103, DOI 10.4018/JDWM.2005100103]