Private Frequent Itemset Mining in the Local Setting

Cited by: 2
Authors
Fu, Hang [1]
Yang, Wei [1]
Huang, Liusheng [1]
Affiliation
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei, Peoples R China
Source
WIRELESS ALGORITHMS, SYSTEMS, AND APPLICATIONS, WASA 2021, PT II | 2021 / Vol. 12938
Keywords
Local differential privacy; Frequent itemset mining; Crowdsensing; Randomized response
DOI
10.1007/978-3-030-86130-8_27
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Set-valued data, which is useful for representing user-generated data, has become ubiquitous in online services. Service providers profit by learning patterns and associations from users' set-valued data. However, collecting these data directly from users raises privacy concerns. This work studies frequent itemset mining from user-generated set-valued data while locally preserving personal data privacy. Under local d-privacy constraints, which capture intrinsic dissimilarity between set-valued data in the framework of differential privacy, we propose a novel privacy-preserving frequent itemset mining mechanism, called PrivFIM. It provides rigorous data privacy protection on the user side and allows effective statistical analyses on the server side. Specifically, each user perturbs their set-valued data locally to guarantee that the server cannot infer the user's original itemset with high confidence. The server can reconstruct an unbiased estimate of itemset frequency from these randomized data and then combine it with an Apriori-based pruning technique to identify frequent itemsets efficiently and accurately. Extensive experiments conducted on real-world and synthetic datasets demonstrate that PrivFIM surpasses existing methods and maintains high utility while providing strong privacy guarantees.
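The abstract does not spell out how PrivFIM perturbs data or estimates frequencies, so the following is only a minimal sketch of the general pattern it describes (local perturbation, unbiased server-side estimation, Apriori-style candidate generation), not the authors' mechanism. It assumes plain per-item randomized response, where each membership bit is reported truthfully with probability p; the function names, the parameter p, and the toy data are all hypothetical.

```python
import random


def perturb(itemset, domain, p, rng):
    """User side (assumed scheme): report each item's membership bit
    truthfully with probability p and flipped with probability 1 - p."""
    report = set()
    for item in domain:
        bit = item in itemset
        if rng.random() >= p:  # happens with probability 1 - p
            bit = not bit
        if bit:
            report.add(item)
    return report


def estimate_frequencies(reports, domain, p):
    """Server side: invert the expected bias of randomized response.
    E[observed] = f * p + (1 - f) * (1 - p), so
    f = (observed - (1 - p)) / (2p - 1); requires p != 0.5."""
    n = len(reports)
    estimates = {}
    for item in domain:
        observed = sum(item in r for r in reports) / n
        estimates[item] = (observed - (1 - p)) / (2 * p - 1)
    return estimates


def candidate_pairs(frequent_items):
    """Apriori-style pruning: only pairs whose members are both
    frequent can themselves be frequent, so only those are candidates."""
    items = sorted(frequent_items)
    return [(a, b) for i, a in enumerate(items) for b in items[i + 1:]]


if __name__ == "__main__":
    rng = random.Random(0)
    domain = ["a", "b", "c"]
    # Toy population: 75% of users hold "a", 75% hold "b", 25% hold "c".
    users = [{"a", "b"}, {"a"}, {"b", "c"}, {"a", "b"}] * 5000
    reports = [perturb(u, domain, 0.75, rng) for u in users]
    est = estimate_frequencies(reports, domain, 0.75)
    frequent = [item for item in domain if est[item] >= 0.5]
    print(est, candidate_pairs(frequent))
```

The estimator is unbiased for single-item frequencies under this assumed scheme; estimating the frequency of larger itemsets, and calibrating p to a formal privacy budget, would need the details given in the paper itself.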
Pages: 338-350
Page count: 13