Association rule mining algorithm based on Spark for pesticide transaction data analyses

Cited by: 6
Authors
Bai, Xiaoning [1, 2]
Jia, Jingdun [1, 3]
Wei, Qiwen [4]
Huang, Shuaiqi [1]
Du, Weicheng [5]
Gao, Wanlin [1]
Affiliations
[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
[2] Minist Agr & Rural Affairs, Inst Control Agrochem, Beijing 100125, Peoples R China
[3] Minist Sci & Technol, Torch Ctr, Beijing 100045, Peoples R China
[4] Natl Agr Technol Promot Ctr, Beijing 100125, Peoples R China
[5] Minist Agr & Rural Affairs, Informat Ctr, Beijing 100125, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Spark; association rule mining; ICAMA algorithm; big data; pesticide regulation; MapReduce;
DOI
10.25165/j.ijabe.20191205.4881
Chinese Library Classification number
S2 [Agricultural Engineering]
Discipline classification code
0828
Abstract
With the development of smart agriculture, data in the field of pesticide regulation have accumulated to a considerable scale. The pesticide transaction data collected by the Pesticide National Data Center alone amount to more than 10 million records daily. However, because the technical means are outdated, the existing pesticide supervision data have not been deeply mined or used. The Apriori algorithm is one of the classic algorithms in association rule mining, but it must traverse the transaction database multiple times, which imposes an extra IO burden. Spark is an emerging big data parallel computing framework with advantages such as in-memory computing and resilient distributed datasets. Compared with the Hadoop MapReduce computing framework, its IO performance is greatly improved. Therefore, this paper proposed ICAMA, an improved Apriori algorithm based on the Spark framework. A MapReduce process was used to compute the support of the candidate set and then to generate the next candidate set. In experimental comparison, when the data volume exceeded 250 Mb, the performance of the Spark-based Apriori algorithm was 20% higher than that of the traditional Hadoop-based Apriori algorithm, and the improvement became more pronounced as the data volume increased.
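The support-counting and candidate-generation loop described in the abstract can be illustrated with a minimal sketch. This is plain, single-machine Python, not Spark and not the paper's ICAMA algorithm: it only shows the classic Apriori iteration with a map/reduce-shaped counting step. The function names (`count_support`, `generate_candidates`, `apriori`) and the sample product names are hypothetical, chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def count_support(transactions, candidates):
    """Count how many transactions contain each candidate itemset."""
    # "Map" step: emit a candidate once for every transaction that contains it.
    emitted = (c for t in transactions for c in candidates if c <= t)
    # "Reduce" step: sum the emissions per candidate.
    return Counter(emitted)


def generate_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then prune."""
    frequent = set(frequent)
    joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
    # Apriori property: every (k-1)-subset of a candidate must be frequent.
    return {
        c for c in joined
        if all(frozenset(s) in frequent for s in combinations(c, k - 1))
    }


def apriori(transactions, min_support):
    """Return {itemset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    while candidates:
        # One full pass over the transactions per iteration; this repeated
        # scan is the IO cost that the abstract attributes to Apriori.
        counts = count_support(transactions, candidates)
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
        candidates = generate_candidates(frequent.keys(), k)
    return result


# Hypothetical pesticide purchase baskets (illustrative data only).
baskets = [
    {"herbicide", "fungicide", "insecticide"},
    {"herbicide", "fungicide"},
    {"herbicide", "insecticide"},
    {"fungicide", "insecticide"},
    {"herbicide", "fungicide", "insecticide"},
]
frequent_itemsets = apriori(baskets, min_support=3)
```

In a distributed setting the counting step would typically be expressed as a map over partitions followed by a reduce by key, with the transaction set kept in memory between iterations; how ICAMA actually does this is not specified in the abstract.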
Pages: 162-166
Page count: 5
Related papers
50 in total
  • [31] Risk Identification-Based Association Rule Mining for Supply Chain Big Data
    Salamai, Abdullah
    Saberi, Morteza
    Hussain, Omar
    Chang, Elizabeth
    SECURITY, PRIVACY, AND ANONYMITY IN COMPUTATION, COMMUNICATION, AND STORAGE (SPACCS 2018), 2018, 11342 : 219 - 228
  • [32] Association Rule Mining in Social Network Data
    Mahoto, Naeem A.
    Shaikh, Anoud
    Nizamani, Shahzad
    COMMUNICATION TECHNOLOGIES, INFORMATION SECURITY AND SUSTAINABLE DEVELOPMENT, 2014, 414 : 149 - 160
  • [33] Optimization algorithm of association rule mining for heavy-haul railway freight train fault data based on distributed parallel computing
    Bai, Yanhui
    Li, Honghui
    Wang, Wengang
    Liu, Shufang
    Zhang, Ning
    Zhang, Chun
    SCIENCE PROGRESS, 2024, 107 (04)
  • [34] Inter-transaction Association Rule Mining in the Indonesia Stock Exchange Market
    Widiputra, Harya
    Pahlevi, Bagus
    2012 2ND INTERNATIONAL CONFERENCE ON UNCERTAINTY REASONING AND KNOWLEDGE ENGINEERING (URKE), 2012, : 149 - 152
  • [35] An Improved Association Rule Mining Technique for Xml Data Using Xquery and Apriori Algorithm
    Porkodi, R.
    Bhuvaneswari, V.
    Rajesh, R.
    Amudha, T.
    2009 IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE, VOLS 1-3, 2009, : 1510 - 1514
  • [36] Privacy-preserving Association Rule Mining Algorithm for Encrypted Data in Cloud Computing
    Kim, Hyeong-Jin
    Shin, Jae-Hwan
    Song, Young-ho
    Chang, Jae-Woo
    2019 IEEE 12TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (IEEE CLOUD 2019), 2019, : 487 - 489
  • [37] Association Rule Mining Algorithm Improvement and Implementation Analysis for Big Data Oriented Education
    Rui, Jiang
    EDUCATION AND MANAGEMENT INNOVATION, 2017, : 268 - 273
  • [38] Optimization Algorithm Improvement of Association Rule Mining Based on Particle Swarm Optimization
    Feng, Hao
    Liao, Rongtao
    Liu, Fen
    Wang, Yixi
    Yu, Zheng
    Zhu, Xiaojun
    2018 10TH INTERNATIONAL CONFERENCE ON MEASURING TECHNOLOGY AND MECHATRONICS AUTOMATION (ICMTMA), 2018, : 524 - 529
  • [39] Research on Audit Log Association Rule Mining Based on Improved Apriori Algorithm
    Cheng, Maocai
    Xu, Kaiyong
    Gong, Xuerong
    PROCEEDINGS OF 2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA ANALYSIS (ICBDA), 2016, : 11 - 17
  • [40] A novel approach for spam detection based on association rule mining and genetic algorithm
    Sokhangoee, Zeynab Fallah
    Rezapour, Abdoreza
    COMPUTERS & ELECTRICAL ENGINEERING, 2022, 97