Mining top-k frequent-regular closed patterns

Cited by: 22
Authors
Amphawan, Komate [1]
Lenca, Philippe [2]
Affiliations
[1] Burapha Univ, Computat Innovat Lab, Informat, Bangkok, Thailand
[2] CNRS, Inst Mines Telecom, Telecom Bretagne, UMR 6285, LabSTICC, Bretagne, France
Keywords
Frequent pattern; Regular pattern; Closed pattern; Bit-vector; ITEMSETS; BITTABLEFI; ALGORITHM;
DOI
10.1016/j.eswa.2015.06.021
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Frequent-regular pattern mining has recently attracted considerable attention. Most approaches focus on discovering the complete set of patterns under user-given support and regularity thresholds. This leads to several quantitative and qualitative drawbacks. First, it is often difficult to set an appropriate support threshold. Second, algorithms produce a huge number of patterns, many of them redundant. Third, most of the patterns are very short, which makes it arduous to extract interesting relationships among items. To reduce the number of patterns, a common solution is to fix the desired number k of outputs and mine the top-k patterns; this approach also removes the need to set a support threshold. To cope with redundancy and to capture interesting relationships among items, we suggest focusing on closed patterns and introducing a minimal length constraint. We thus propose to mine the top-k frequent-regular closed patterns with minimal length. We design an efficient single-pass algorithm, called TFRC-Mine, and a new compact bit-vector representation that allows uninteresting candidates to be pruned. Experiments show that the proposed algorithm efficiently produces longer, non-redundant patterns, and that the new data representation is efficient in both computation time and memory usage. (C) 2015 Elsevier Ltd. All rights reserved.
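To make the problem statement concrete, the following is a minimal sketch of what "top-k frequent-regular closed patterns with minimal length" could mean on a toy database. It is not TFRC-Mine: it enumerates itemsets by brute force and uses a plain Python integer as a bit-vector, whereas the abstract does not describe the paper's compact representation or pruning rules. The definitions used here are assumptions taken from common usage in regular-pattern mining: support is the number of transactions containing the pattern, and regularity is the largest gap between consecutive occurrences, counting the gaps to both ends of the database. All function names, parameters, and the sample data are hypothetical.

```python
from itertools import combinations


def tid_bitvectors(transactions):
    """Map each item to an int whose bit t is set when transaction t contains it."""
    vectors = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(bits):
    """Number of transactions whose bit is set."""
    return bin(bits).count("1")


def regularity(bits, n):
    """Largest gap between consecutive occurrences (assumed definition).

    Transaction ids are treated as 1..n; 0 stands for the database start,
    and the gap from the last occurrence to n is counted as well.
    """
    prev, worst = 0, 0
    for tid in range(1, n + 1):
        if bits >> (tid - 1) & 1:
            worst = max(worst, tid - prev)
            prev = tid
    return max(worst, n - prev)


def top_k_closed(transactions, k, max_regularity, min_length):
    """Brute-force illustration only; exponential in the number of items."""
    n = len(transactions)
    vectors = tid_bitvectors(transactions)
    items = sorted(vectors)
    found = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            bits = (1 << n) - 1
            for item in combo:
                bits &= vectors[item]  # intersect occurrence sets
            if bits == 0:
                continue
            # Closure: every item present in all supporting transactions.
            closure = frozenset(i for i in items if vectors[i] & bits == bits)
            if len(closure) >= min_length and regularity(bits, n) <= max_regularity:
                found[closure] = support(bits)
    # Rank by descending support; break ties alphabetically for determinism.
    ranked = sorted(found.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    return ranked[:k]


# Hypothetical toy database of six transactions.
transactions = [
    {"a", "b"}, {"a", "b", "c"}, {"a", "c"},
    {"a", "b"}, {"b", "c"}, {"a", "b", "c"},
]
result = top_k_closed(transactions, k=2, max_regularity=3, min_length=2)
```

In this sketch no support threshold appears anywhere: the user supplies only k, a regularity bound, and a minimal length, which mirrors the motivation given in the abstract. Mapping every candidate to its closure before storing it is what removes redundant patterns that share the same set of supporting transactions.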
Pages: 7882-7894
Page count: 13