Distributed synthesized association mining for big transactional data

Cited by: 4
Authors
Pal, Amrit [1, 2]
Kumar, Manish [2]
Affiliations
[1] GLA Univ, Dept Comp Engn & Applicat, Mathura, India
[2] Indian Inst Informat Technol Allahabad, Dept Informat Technol, Prayagraj, India
Source
SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES | 2020 / Vol. 45 / Issue 01
Keywords
Big Data; HDFS; MapReduce; Apriori; frequent itemset; association rule; DATA SETS; RULES; PATTERNS;
DOI
10.1007/s12046-020-01380-8
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
Transactional databases are growing rapidly. Partitioning this data and storing it in a distributed manner is an effective approach to storage and retrieval, but mining such distributed data with minimal dependence between sub-problems is a crucial task. Finding frequent itemsets and the corresponding association rules is especially challenging when results must be aggregated across a distributed environment. To overcome these challenges, we propose a distributed frequent itemset generation and association rule mining algorithm based on the MapReduce programming model. The proposed scheme generates frequent itemsets and mines association rules using a synthesized distributed technique: rules are mined in a distributed manner, weights are then assigned to the data subsets and to the association rules, and the distributed rules are combined using a weighted approach. This paper presents a novel MapReduce-based synthesis approach that can work well over distributed storage of large amounts of data.
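The abstract does not state the weighting formula, so the following is only a minimal sketch of the general idea (mine rules per partition, then combine them with weights), not the authors' algorithm. The function names `mine_local_rules` and `synthesize`, the thresholds, the restriction to rules between single items, and the choice of partition-size weights are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def mine_local_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Map-like step (hypothetical): mine rules lhs -> rhs between single
    items inside one data partition. Returns {(lhs, rhs): (support, confidence)}."""
    n = len(transactions)
    counts = defaultdict(int)
    for t in transactions:
        items = sorted(set(t))
        for item in items:
            counts[(item,)] += 1
        for pair in combinations(items, 2):
            counts[pair] += 1

    rules = {}
    for itemset, c in counts.items():
        # Keep only frequent 2-itemsets in this partition.
        if len(itemset) != 2 or c / n < min_support:
            continue
        a, b = itemset
        for lhs, rhs in ((a, b), (b, a)):
            conf = c / counts[(lhs,)]
            if conf >= min_confidence:
                rules[(lhs, rhs)] = (c / n, conf)
    return rules


def synthesize(partition_rules, partition_sizes):
    """Reduce-like step (hypothetical): weighted average of local support and
    confidence. Assumption: weight = partition size / total size, and a rule
    that was not mined in a partition contributes 0 from that partition."""
    total = sum(partition_sizes)
    weights = [size / total for size in partition_sizes]
    merged = defaultdict(lambda: [0.0, 0.0])
    for rules, w in zip(partition_rules, weights):
        for rule, (sup, conf) in rules.items():
            merged[rule][0] += w * sup
            merged[rule][1] += w * conf
    return {rule: tuple(values) for rule, values in merged.items()}


if __name__ == "__main__":
    part1 = [["a", "b"], ["a", "b"], ["a", "c"], ["b"]]
    part2 = [["a", "b"], ["a", "b"]]
    local = [mine_local_rules(p) for p in (part1, part2)]
    print(synthesize(local, [len(part1), len(part2)]))
```

With size-proportional weights, the synthesized support of a rule found in every partition equals its support over the whole dataset, which is one reason this weighting is a natural default; other weighting schemes would give different results.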
Pages: 13
Related papers
50 in total
  • [11] Mining association rules in big data with NGEP
    Chen, Yunliang
    Li, Fangyuan
    Fan, Junqing
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2015, 18 (02) : 577 - 585
  • [12] Mining association rules in big data with NGEP
    Yunliang Chen
    Fangyuan Li
    Junqing Fan
    Cluster Computing, 2015, 18 : 577 - 585
  • [13] Distributed Relationship Mining over Big Scholar Data
    Zhang, Da
    Kabuka, Mansur R.
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2021, 9 (01) : 354 - 365
  • [14] Data Mining with Big Data
    Wu, Xindong
    Zhu, Xingquan
    Wu, Gong-Qing
    Ding, Wei
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (01) : 97 - 107
  • [15] Mining association rules on Big Data through MapReduce genetic programming
    Padillo, F.
    Luna, J. M.
    Herrera, F.
    Ventura, S.
    INTEGRATED COMPUTER-AIDED ENGINEERING, 2018, 25 (01) : 31 - 48
  • [16] Mining of Web Server Logs in a Distributed Cluster Using Big Data Technologies
    Savitha, K.
    Vijaya, M. S.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2014, 5 (01) : 137 - 142
  • [17] A distributed frequent itemset mining algorithm using Spark for Big Data analytics
    Zhang, Feng
    Liu, Min
    Gui, Feng
    Shen, Weiming
    Shami, Abdallah
    Ma, Yunlong
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2015, 18 (04) : 1493 - 1501
  • [18] A Big Data Framework for Mining Sensor Data Using Hadoop
    El-Shafeiy, Engy A.
    El-Desouky, Ali I.
    STUDIES IN INFORMATICS AND CONTROL, 2017, 26 (03) : 365 - 376
  • [19] Research On Distributed Mining Algorithm For Association Rules Oriented Mass Data
    Zhang Yongliang
    Qin Jie
    Zheng Shiming
    2014 33RD CHINESE CONTROL CONFERENCE (CCC), 2014 : 492 - 499
  • [20] Distributed Bayesian Matrix Decomposition for Big Data Mining and Clustering
    Zhang, Chihao
    Yang, Yang
    Zhou, Wei
    Zhang, Shihua
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (08) : 3701 - 3713