Association rule mining algorithm based on Spark for pesticide transaction data analyses

Cited by: 6
Authors
Bai, Xiaoning [1, 2]
Jia, Jingdun [1, 3]
Wei, Qiwen [4]
Huang, Shuaiqi [1]
Du, Weicheng [5]
Gao, Wanlin [1]
Affiliations
[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
[2] Minist Agr & Rural Affairs, Inst Control Agrochem, Beijing 100125, Peoples R China
[3] Minist Sci & Technol, Torch Ctr, Beijing 100045, Peoples R China
[4] Natl Agr Technol Promot Ctr, Beijing 100125, Peoples R China
[5] Minist Agr & Rural Affairs, Informat Ctr, Beijing 100125, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Spark; association rule mining; ICAMA algorithm; big data; pesticide regulation; MapReduce;
DOI
10.25165/j.ijabe.20191205.4881
Chinese Library Classification number
S2 [Agricultural Engineering];
Discipline classification code
0828;
Abstract
With the development of smart agriculture, the data accumulated in the field of pesticide regulation have reached a considerable scale. The pesticide transaction data collected by the Pesticide National Data Center alone amount to more than 10 million records daily. However, because the technical means in use are outdated, the existing pesticide supervision data have not been deeply mined or exploited. The Apriori algorithm is one of the classic algorithms in association rule mining, but it needs to traverse the transaction database multiple times, which imposes an extra IO burden. Spark is an emerging big data parallel computing framework with advantages such as in-memory computing and resilient distributed datasets. Compared with the Hadoop MapReduce computing framework, it greatly improves IO performance. Therefore, this paper proposed ICAMA, an improved Apriori algorithm based on the Spark framework. The MapReduce process was used to count the support of the candidate sets and then to generate the next candidate sets. In the experimental comparison, when the data volume exceeded 250 Mb, the performance of the Spark-based Apriori algorithm was 20% higher than that of the traditional Hadoop-based Apriori algorithm, and the improvement became more pronounced as the data volume increased.
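The repeated database passes that the abstract attributes to Apriori can be illustrated with a minimal single-process sketch. It is not the authors' ICAMA implementation and does not use Spark. The function names (`count_support`, `next_candidates`, `apriori`), the sample baskets, and the support threshold are all hypothetical and chosen only for illustration. The support-counting step mimics a map phase (test each transaction against each candidate) followed by a reduce phase (sum per candidate), and each round of the loop scans the full transaction list once.

```python
from collections import Counter
from itertools import combinations


def count_support(transactions, candidates):
    """Count how many transactions contain each candidate itemset.

    Single-process stand-in for a map step (emit one hit per contained
    candidate) and a reduce step (sum the hits per candidate).
    """
    counts = Counter()
    for basket in transactions:  # one full pass over the data per call
        for cand in candidates:
            if cand <= basket:  # candidate is a subset of the basket
                counts[cand] += 1
    return counts


def next_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then prune any whose
    (k-1)-subsets are not all frequent (the Apriori property)."""
    frequent = set(frequent)
    joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
    return {
        c for c in joined
        if all(frozenset(s) in frequent for s in combinations(c, k - 1))
    }


def apriori(transactions, min_support):
    """Return {itemset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    result, k = {}, 1
    while candidates:
        counts = count_support(transactions, candidates)
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
        candidates = next_candidates(frequent, k)
    return result


# Hypothetical pesticide purchase baskets (illustrative data only).
baskets = [
    {"herbicide", "insecticide", "fungicide"},
    {"herbicide", "insecticide"},
    {"herbicide", "fungicide"},
    {"insecticide", "fungicide"},
    {"herbicide", "insecticide", "fungicide"},
]
frequent_itemsets = apriori(baskets, min_support=3)
```

In a Spark setting the transaction list would typically be held in memory as a distributed dataset, so the repeated passes made by `count_support` would not each require re-reading the data from disk; that is the IO advantage the abstract describes.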
Pages: 162-166
Number of pages: 5