Comprehensive mining of frequent itemsets for a combination of certain and uncertain databases

Authors
Wazir S. [1 ]
Beg M.M.S. [2 ]
Ahmad T. [1 ]
Affiliations
[1] Department of Computer Engineering, Jamia Millia Islamia, New Delhi
[2] Department of Computer Engineering, Aligarh Muslim University, Aligarh
Keywords
Approximate Frequent Items; Certain and Uncertain Transactional Database; Expected Support; Frequent Itemset Mining; Normal Distribution; Poisson Distribution
DOI
10.1007/s41870-019-00310-0
Abstract
Frequent Itemset Mining can be performed with sequential algorithms such as Apriori on a standalone system, or with parallel algorithms such as Count Distribution on a distributed system. Because of the communication overhead of parallel algorithms and exponential candidate generation, many algorithms have been developed to compute frequent items over either a certain or an uncertain database. So far, however, no algorithm generates frequent itemsets from a combination of both kinds of database. We previously proposed the MasterApriori algorithm, which calculates Approximate Frequent Items for a combination of certain and uncertain databases, using Apriori for the certain database and expected-support-based UApriori for the uncertain database. In this paper, we extend that work by using Poisson and Normal distribution based UApriori for the uncertain database. In the proposed algorithms, the sites where the data is distributed communicate only once, which reduces communication overhead. The scalability and efficiency of the proposed algorithms are evaluated on standard and synthetic databases. Performance is measured by comparing the time taken and the number of frequent items generated by each algorithm. © 2019, Bharati Vidyapeeth's Institute of Computer Applications and Management.
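The three support measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' MasterApriori implementation: the function names, the toy data layout (each transaction as a dict from item to existence probability), the independence assumption between items, and the continuity correction in the Normal approximation are all assumptions made here for illustration. A certain transaction is simply one whose probabilities are all 1.0, so the same functions cover both kinds of database.

```python
import math
from typing import Dict, Iterable, List

# Hypothetical layout: one transaction maps item -> existence probability.
# A "certain" transaction uses probability 1.0 for every item it contains.
Transaction = Dict[str, float]


def itemset_probs(db: List[Transaction], itemset: Iterable[str]) -> List[float]:
    """Per-transaction probability that all items of `itemset` occur,
    assuming items are independent within a transaction."""
    items = list(itemset)
    return [math.prod(t.get(i, 0.0) for i in items) for t in db]


def expected_support(db: List[Transaction], itemset: Iterable[str]) -> float:
    """Expected support: sum of the per-transaction probabilities."""
    return sum(itemset_probs(db, itemset))


def poisson_frequent_prob(db: List[Transaction], itemset: Iterable[str],
                          min_count: int) -> float:
    """P(support >= min_count), approximating the support by a Poisson
    distribution whose rate is the expected support."""
    lam = expected_support(db, itemset)
    term = math.exp(-lam)          # Poisson pmf at k = 0
    cdf = 0.0
    for k in range(min_count):     # accumulate P(X <= min_count - 1)
        cdf += term
        term *= lam / (k + 1)      # pmf(k + 1) from pmf(k)
    return 1.0 - cdf


def normal_frequent_prob(db: List[Transaction], itemset: Iterable[str],
                         min_count: int) -> float:
    """P(support >= min_count), approximating the support by a Normal
    distribution with a 0.5 continuity correction (an assumption here)."""
    probs = itemset_probs(db, itemset)
    mu = sum(probs)
    var = sum(p * (1.0 - p) for p in probs)
    if var == 0.0:                 # fully certain data: support is exact
        return 1.0 if mu >= min_count else 0.0
    z = (min_count - 0.5 - mu) / math.sqrt(var)
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))


# Toy database mixing certain (1.0) and uncertain (< 1.0) entries.
db = [
    {"a": 1.0, "b": 0.5},
    {"a": 0.5, "b": 0.5},
    {"a": 1.0},
]
```

On this toy data, `expected_support(db, ["a", "b"])` sums 0.5, 0.25 and 0.0 across the three transactions. An itemset would then be reported as frequent when its expected support, or its Poisson or Normal frequent probability, meets a user-chosen threshold.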
Pages: 1205-1216
Number of pages: 11