Distributed synthesized association mining for big transactional data

Cited by: 4
Authors
Pal, Amrit [1, 2]
Kumar, Manish [2 ]
Affiliations
[1] GLA Univ, Dept Comp Engn & Applicat, Mathura, India
[2] Indian Inst Informat Technol Allahabad, Dept Informat Technol, Prayagraj, India
Source
SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES | 2020 / Vol. 45 / Issue 01
Keywords
Big Data; HDFS; MapReduce; Apriori; frequent itemset; association rule; DATA SETS; RULES; PATTERNS;
DOI
10.1007/s12046-020-01380-8
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Data volumes, including transactional databases, are growing rapidly day by day. Dividing this data and storing it in a distributed manner is an effective approach to storage and retrieval. Mining such distributed data with minimal dependence between sub-problems is a crucial task. Finding frequent itemsets and the corresponding association rules is a major challenge when aggregation in a distributed environment is considered. To overcome these challenges, we propose a distributed frequent itemset generation and association rule mining algorithm using the MapReduce programming model. The proposed scheme generates frequent itemsets and mines association rules using a synthesized distributed technique. The rules are mined in a distributed manner, and weights are then assigned to subsets of the data and to the association rules. The rules generated in a distributed manner are then combined using a weighted approach. This paper presents a novel MapReduce-based synthesis approach that works well over distributed storage of large amounts of data.
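The abstract does not state how the weights are computed, so the following is only an illustrative sketch of the general idea: mine each data partition independently (a map-like step), then merge the local supports with per-partition weights (a reduce-like step). The function names, the choice of weighting each partition by its share of all transactions, and the restriction to itemsets of size one and two are assumptions made for this example, not the authors' algorithm.

```python
from collections import Counter, defaultdict
from itertools import combinations


def local_frequent_itemsets(transactions, min_support, max_size=2):
    """Map-like step: relative support of itemsets found in ONE partition.

    Brute-force counting up to max_size items; a real miner would prune
    candidates Apriori-style instead.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    n = len(transactions)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def synthesize(partitions, min_support, max_size=2):
    """Reduce-like step: weighted sum of the local supports.

    ASSUMPTION: a partition's weight is its share of all transactions.
    """
    total = sum(len(p) for p in partitions)
    combined = defaultdict(float)
    for partition in partitions:
        weight = len(partition) / total
        local = local_frequent_itemsets(partition, min_support, max_size)
        for itemset, support in local.items():
            combined[itemset] += weight * support
    return {s: v for s, v in combined.items() if v >= min_support}


def pair_rules(supports, min_confidence):
    """Derive rules a -> b from synthesized pair supports (pairs only)."""
    rules = []
    for itemset, support in supports.items():
        if len(itemset) != 2:
            continue
        for a, b in (itemset, itemset[::-1]):
            antecedent = supports.get((a,))
            if antecedent and support / antecedent >= min_confidence:
                rules.append((a, b, support / antecedent))
    return rules


# Two hypothetical partitions of a small transactional database.
partitions = [
    [["a", "b"], ["a", "b"], ["a"]],
    [["a", "b"], ["b"], ["c"]],
]
supports = synthesize(partitions, min_support=0.3)
rules = pair_rules(supports, min_confidence=0.7)
```

With size-proportional weights, the synthesized support equals the true global support whenever an itemset is locally frequent in every partition; if some partition does not report it, the synthesized value underestimates the global support. That trade-off is what keeps the sub-problems independent of one another.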
Pages: 13
Related papers
50 in total
  • [31] A Solution for Mining Big Data Based on Distributed Data Streams and Its Classifying Algorithms
    Mao, Guojun
    Qiao, Jiewei
    DATA MINING AND BIG DATA, DMBD 2017, 2017, 10387 : 263 - 271
  • [32] Data Mining Techniques for IoT and Big Data -A Survey
    Shobanadevi, A.
    Maragatham, G.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTELLIGENT SUSTAINABLE SYSTEMS (ICISS 2017), 2017, : 66 - 78
  • [33] Operating Optimization of Steam Turbine Unit Based on Big Data Parallel Association Rule Mining
    Lin, Jinxing
    Lu, Mingjie
    Jiang, Yongjiang
    Fu, Rong
    Peng, Xianyong
    Wu, Edmond Q. Q.
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2024, 21 (03) : 4226 - 4236
  • [34] Big Data Analytics in Association Rule Mining: A Systematic Literature Review
    Shahin, Mahtab
    Peious, Sijo Arakkal
    Sharma, Rahul
    Kaushik, Minakshi
    Ben Yahia, Sadok
    Shah, Syed Attique
    Draheim, Dirk
    2021 THE 3RD INTERNATIONAL CONFERENCE ON BIG DATA ENGINEERING AND TECHNOLOGY, BDET 2021, 2021, : 40 - 49
  • [35] Online education big data mining method based on association rules
    Zhang N.
    International Journal of Information and Communication Technology, 2024, 24 (03) : 262 - 272
  • [36] A distributed frequent itemset mining algorithm using Spark for Big Data analytics
    Feng Zhang
    Min Liu
    Feng Gui
    Weiming Shen
    Abdallah Shami
    Yunlong Ma
    Cluster Computing, 2015, 18 : 1493 - 1501
  • [37] The Impact of Distributed Data in Big Data Platforms on Organizations
    Koren, Oded
    Binyaminov, Matan
    Perel, Nir
    PROCEEDINGS OF THE FUTURE TECHNOLOGIES CONFERENCE (FTC) 2018, VOL 2, 2019, 881 : 1024 - 1036
  • [38] Dynamic Distributed and Parallel Machine Learning algorithms for big data mining processing
    Djafri, Laouni
    DATA TECHNOLOGIES AND APPLICATIONS, 2022, 56 (04) : 558 - 601
  • [39] Apriori Versions Based on MapReduce for Mining Frequent Patterns on Big Data
    Maria Luna, Jose
    Padillo, Francisco
    Pechenizkiy, Mykola
    Ventura, Sebastian
    IEEE TRANSACTIONS ON CYBERNETICS, 2018, 48 (10) : 2851 - 2865
  • [40] Frequent Itemset Mining in Big Data With Effective Single Scan Algorithms
    Djenouri, Youcef
    Djenouri, Djamel
    Lin, Jerry Chun-Wei
    Belhadi, Asma
    IEEE ACCESS, 2018, 6 : 68013 - 68026