An Efficient Algorithm for Mining Frequent Closed Itemsets

被引:0
作者
Fang, Gang [1 ,2 ]
Wu, Yue [1 ]
Li, Ming [1 ]
Chen, Jia [1 ]
机构
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China
[2] Chongqing Three Gorges Univ, Sch Comp Sci & Engn, Chongqing 404000, Peoples R China
来源
INFORMATICA-JOURNAL OF COMPUTING AND INFORMATICS | 2015年 / 39卷 / 01期
关键词
frequent closed itemsets; Galois connection; granular computing; association rules; data mining;
D O I
暂无
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
To avoid generating an undesirably large set of frequent itemsets for discovering all high confidence association rules, the problem of finding frequent closed itemsets in a formal mining context is proposed. In this paper, aiming to these shortcomings of typical algorithms for mining frequent closed itemsets, such as the algorithm A-close and CLOSET, we propose an efficient algorithm for mining frequent closed itemsets, which is based on Galois connection and granular computing. Firstly, we present the smallest frequent closed itemsets and its characters, contain some properties and theorems, then propose a novel notion, called the smallest frequent closed granule, which can help the algorithm save reading the database to reduce the costed I/O for discovering frequent closed itemsets. And then we propose a novel model for mining frequent closed itemsets based on the smallest frequent closed granules, and a connection function for generating the smallest frequent closed itemsets. The generator function create the power set of the smallest frequent closed itemsets in the enlarged frequent 1-item manner, which can efficiently avoid generating an undesirably large set of candidate smallest frequent closed itemsets to reduce the costed CPU and the occupied main memory for generating the smallest frequent closed granules. Finally, we describe the algorithm for the proposed model. On these different datasets, we report the performances of the algorithm and its trend of the performances to discover frequent closed itemsets, and further discuss how to solve the bottleneck of the algorithm. For mining frequent closed itemsets, all these experimental results indicate that the performances of the algorithm are better than the traditional and typical algorithms, and it also has a good scalability. It is suitable for mining dynamic transactions datasets.
引用
收藏
页码:87 / 98
页数:12
相关论文
共 30 条
[1]  
Agrawal R., 1993, SIGMOD Record, V22, P207, DOI 10.1145/170036.170072
[2]  
Agrawal R., 1994, P 20 INT C VER LARG, VVolume 1215, P487
[3]   Granularity of attributes in formal concept analysis [J].
Belohlavek, Radim ;
De Baets, Bernard ;
Konecny, Jan .
INFORMATION SCIENCES, 2014, 260 :149-170
[4]   MAFIA: A maximal frequent itemset algorithm for transactional databases [J].
Burdick, D ;
Calimlim, M ;
Gehrke, J .
17TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2001, :443-452
[5]  
Fang G, 2013, INFORM-J COMPUT INFO, V37, P443
[6]  
Ganter B., 1999, FORMAL CONCEPT ANAL
[7]   A THEORY OF ABSTRACTION [J].
GIUNCHIGLIA, F ;
WALSH, T .
ARTIFICIAL INTELLIGENCE, 1992, 57 (2-3) :323-389
[8]  
Guang-Peng Chen, 2012, Proceedings of the 2012 IEEE 19th International Conference on Web Services (ICWS), P652, DOI 10.1109/ICWS.2012.19
[9]  
Han JW, 2000, SIGMOD RECORD, V29, P1
[10]  
Hobbs J.R., 1985, PROC IJCAI, P432