A time-efficient breadth-first level-wise lattice-traversal algorithm to discover rare itemsets

被引:21
作者
Troiano, Luigi [1 ]
Scibelli, Giacomo [1 ]
机构
[1] Univ Sannio, Dept Engn, I-81031 Benevento, Italy
关键词
Data mining methods and algorithms; Rare itemsets; RULES;
D O I
10.1007/s10618-013-0304-3
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In this paper we face the problem of searching for rare itemsets. A main issue regards the strategy to adopt in exploring the power set lattice. Assuming a power set lattice with full set at the top and empty set at the bottom, the most of the algorithms adopt a bottom-up exploration, i.e. moving from smaller to larger sets. Although this approach is advantageous in the case of frequent itemsets, it might not be worth being used for rare itemsets, as they occur on the top of the lattice. We propose Rarity, a top-down breadth-first level-wise algorithm. Experimental results and comparisons are illustrated in order to provide a quantitative characterization of algorithm performances and complexity. Application to some UCI benchmark and real world datasets is provided. An algorithm parallelization is outlined. Experiments showed that this approach takes advantage of finding all rare non-zero itemsets in less time than other solutions, at expenses of higher memory demand.
引用
收藏
页码:773 / 807
页数:35
相关论文
共 37 条
[11]  
Bastide Y., 2000, SIGKDD EXPLORATIONS, V2, P66, DOI DOI 10.1145/380995.381017
[12]  
Brin S., 1997, SIGMOD Record, V26, P255, DOI [10.1145/253262.253327, 10.1145/253262.253325]
[13]   MAFIA: A maximal frequent itemset algorithm for transactional databases [J].
Burdick, D ;
Calimlim, M ;
Gehrke, J .
17TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2001, :443-452
[14]  
Forina M, 1991, WINE DATASET
[15]  
Haglin David J., 2007, Proceedings of the International Conference on Data Mining. DMIN 2007, P141
[16]   Mining frequent patterns without candidate generation: A frequent-pattern tree approach [J].
Han, JW ;
Pei, J ;
Yin, YW ;
Mao, RY .
DATA MINING AND KNOWLEDGE DISCOVERY, 2004, 8 (01) :53-87
[17]  
Jong Soo Park, 1995, Proceedings of the 1995 ACM CIKM International Conference on Information and Knowledge Management, P31, DOI 10.1145/221270.221320
[18]  
Koh YS, 2005, LECT NOTES ARTIF INT, V3518, P97
[19]   Mining interesting imperfectly sporadic rules [J].
Koh, Yun Sing ;
Rountree, Nathan ;
O'Keefe, Richard A. .
KNOWLEDGE AND INFORMATION SYSTEMS, 2008, 14 (02) :179-196
[20]  
Liu W., 1999, P 5 ACM SIGKDD INT C, P337, DOI [DOI 10.1145/312129.312274, 10.1145/312129.312274]