Distributed synthesized association mining for big transactional data

Cited by: 4
Authors
Pal, Amrit [1, 2]
Kumar, Manish [2 ]
Affiliations
[1] GLA Univ, Dept Comp Engn & Applicat, Mathura, India
[2] Indian Inst Informat Technol Allahabad, Dept Informat Technol, Prayagraj, India
Source
SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES | 2020, Vol. 45, Issue 01
Keywords
Big Data; HDFS; MapReduce; Apriori; frequent itemset; association rule; DATA SETS; RULES; PATTERNS
DOI
10.1007/s12046-020-01380-8
Chinese Library Classification (CLC) code
T [Industrial Technology]
Discipline classification code
08
Abstract
Data volumes, and transactional databases in particular, are growing rapidly. Dividing this data and storing it in a distributed manner is an effective approach to storage and retrieval. Mining such distributed data with minimal dependence between sub-problems is a crucial task. Finding frequent itemsets and the corresponding association rules is a major challenge when the results must be aggregated across a distributed environment. To overcome these challenges, we propose a distributed frequent itemset generation and association rule mining algorithm based on the MapReduce programming model. The proposed scheme generates frequent itemsets and mines association rules using a synthesized distributed technique. The rules are mined in a distributed manner, and weights are then assigned to the subsets of data and to the association rules. The rules generated on the separate subsets are combined using this weighted approach. This paper presents a novel MapReduce-based synthesis approach that works well over distributed storage of large amounts of data.
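The abstract does not give the weighting formula, so the following is only a minimal single-process sketch of the general idea, not the authors' algorithm. It assumes that each data subset is weighted by its share of all transactions and that a rule's synthesized support and confidence are the weighted sums of its local values. The function names `local_rules` and `synthesize`, the restriction to single-item rules, and the thresholds are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def local_rules(transactions, min_support, min_confidence):
    """Map-like step: mine single-item rules x -> y on one data subset.

    Returns {(x, y): (local_support, local_confidence)}.
    """
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for item in items:
            counts[(item,)] += 1
        for pair in combinations(items, 2):
            counts[pair] += 1

    rules = {}
    for itemset, c in counts.items():
        # Keep only frequent 2-itemsets (an Apriori-style support filter).
        if len(itemset) != 2 or c / n < min_support:
            continue
        a, b = itemset
        for x, y in ((a, b), (b, a)):
            confidence = c / counts[(x,)]
            if confidence >= min_confidence:
                rules[(x, y)] = (c / n, confidence)
    return rules


def synthesize(partitions, min_support, min_confidence):
    """Reduce-like step: combine local rules with per-subset weights.

    Assumption: weight of a subset = its share of all transactions.
    A rule that is absent from a subset contributes zero from that subset.
    """
    total = sum(len(p) for p in partitions)
    merged = {}
    for p in partitions:
        weight = len(p) / total
        found = local_rules(p, min_support, min_confidence)
        for rule, (support, confidence) in found.items():
            s, c = merged.get(rule, (0.0, 0.0))
            merged[rule] = (s + weight * support, c + weight * confidence)
    return merged


if __name__ == "__main__":
    partitions = [
        [["a", "b"], ["a", "b"], ["a"], ["c"]],
        [["a", "b"], ["b", "c"]],
    ]
    for rule, (support, confidence) in sorted(synthesize(partitions, 0.5, 0.6).items()):
        print(rule, round(support, 3), round(confidence, 3))
```

In a real MapReduce deployment, `local_rules` would run once per data block and `synthesize` would correspond to the aggregation stage; here both run in one process so that the weighting step is easy to follow.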
Pages: 13
Related papers
50 items in total
  • [1] Distributed synthesized association mining for big transactional data
    Amrit Pal
    Manish Kumar
    Sādhanā, 2020, 45
  • [2] Hadoop based Mining of Distributed Association Rules from Big Data
    Bouraoui, Marwa
    Bouzouita, Ines
    Touzi, Amel Grissa
    2017 18TH INTERNATIONAL CONFERENCE ON SCIENCES AND TECHNIQUES OF AUTOMATIC CONTROL AND COMPUTER ENGINEERING (STA), 2017, : 185 - 190
  • [3] Parallel Mining Frequent Patterns over Big Transactional Data in Extended MapReduce
    Chen, Hui
    Lin, Tsau Young
    Zhang, Zhibing
    Zhong, Jie
    2013 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING (GRC), 2013, : 43 - 48
  • [4] ClowdFlows: Online workflows for distributed big data mining
    Kranjc, Janez
    Orac, Roman
    Podpecan, Vid
    Lavrac, Nada
    Robnik-Sikonja, Marko
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2017, 68 : 38 - 58
  • [5] Parallel and distributed clustering framework for big spatial data mining
    Bendechache, Malika
    Tari, A-Kamel
    Kechadi, M-Tahar
    INTERNATIONAL JOURNAL OF PARALLEL EMERGENT AND DISTRIBUTED SYSTEMS, 2019, 34 (06) : 671 - 689
  • [6] Memory-optimized distributed utility mining for big data
    Kumar, Sunil
    Mohbey, Krishna Kumar
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (08) : 6491 - 6503
  • [7] New Spark solutions for distributed frequent itemset and association rule mining algorithms
    Fernandez-Basso, Carlos
    Ruiz, M. Dolores
    Martin-Bautista, Maria J.
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (02) : 1217 - 1234
  • [8] A Novel Mapreduce Lift Association Rule Mining Algorithm (MRLAR) for Big Data
    Oweis, Nour E.
    Fouad, Mohamed Mostafa
    Oweis, Sami R.
    Owais, Suhail S.
    Snasel, Vaclav
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (03) : 151 - 157
  • [9] A review on big data based parallel and distributed approaches of pattern mining
    Kumar, Sunil
    Mohbey, Krishna Kumar
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (05) : 1639 - 1662
  • [10] Big data mining with parallel computing: A comparison of distributed and MapReduce methodologies
    Tsai, Chih-Fong
    Lin, Wei-Chao
    Ke, Shih-Wen
    JOURNAL OF SYSTEMS AND SOFTWARE, 2016, 122 : 83 - 92