HARPP: HARnessing the Power of Power Sets for Mining Frequent Itemsets

Cited by: 5
Authors
Yasir, Muhammad [1]
Habib, Muhammad Asif [2]
Sarwar, Shahzad [3]
Faisal, Chaudhry Muhammad Nadeem [2]
Ahmad, Mudassar [2]
Jabbar, Sohail [2]
Affiliations
[1] Univ Engn & Technol Lahore, Dept Comp Sci, Faisalabad Campus, Faisalabad, Pakistan
[2] Natl Text Univ, Dept Comp Sci, Faisalabad, Pakistan
[3] Univ Punjab, Punjab Univ Coll Informat Technol, Lahore, Pakistan
Source
INFORMATION TECHNOLOGY AND CONTROL | 2019, Vol. 48, No. 3
Keywords
Association Rules; Frequent Itemset Mining; Apriori; FP-Growth; Recommendation Systems; N-LIST; ALGORITHM; PATTERNS; CBAR;
DOI
10.5755/j01.itc.48.3.21137
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Modern algorithms for mining frequent itemsets suffer a marked deterioration in performance as the minimum support decreases, especially on sparse datasets. Long-tailed itemsets, that is, frequent itemsets found at lower minimum support, are significant for present-day applications such as recommender systems. In this study, a novel power-set-based method for mining frequent itemsets, named HARnessing the Power of Power sets (HARPP), is developed. HARPP is based on the concept of the power set from set theory and incorporates efficient data structures for mining. Without storing the dataset entirely in memory, HARPP scans it only once and mines frequent itemsets on the fly. In contrast to state-of-the-art methods, the efficiency of HARPP increases as the minimum support decreases, which makes it a viable technique for mining long-tailed itemsets. A performance study shows that HARPP is efficient and scalable. It is up to two orders of magnitude faster than the FP-Growth algorithm at lower minimum support, particularly when datasets are sparse. On most datasets, HARPP's memory consumption is an order of magnitude lower than that of state-of-the-art methods.
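
The general idea of counting itemsets from per-transaction power sets in a single pass can be illustrated with a minimal sketch. This is not the HARPP algorithm: the abstract does not describe its data structures, so the function name `mine_frequent_itemsets`, the plain `Counter`, and the reading of `min_support` as a fraction of transactions are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def mine_frequent_itemsets(transactions, min_support):
    """Naive single-pass miner (hypothetical helper, not the authors' code).

    transactions: any iterable of item collections; it is consumed once,
                  so a generator streaming from disk also works.
    min_support:  assumed here to be a fraction of transactions (0..1).
    """
    counts = Counter()
    n_transactions = 0
    for transaction in transactions:
        n_transactions += 1
        # Deduplicate and sort so each itemset has one canonical tuple form.
        items = sorted(set(transaction))
        # Enumerate every non-empty subset (the power set minus the empty set).
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[subset] += 1
    threshold = min_support * n_transactions
    # Keep only itemsets whose count reaches the support threshold.
    return {itemset: c for itemset, c in counts.items() if c >= threshold}


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
    for itemset, count in sorted(mine_frequent_itemsets(data, 0.5).items()):
        print(itemset, count)
```

The sketch enumerates 2^k - 1 subsets for a transaction of k items, so it is only practical for short transactions; any efficiency comparable to the results reported above would depend on data structures that the abstract does not detail.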
Pages: 415-431
Number of pages: 17