Differentially Private Feature Selection for Data Mining

Cited by: 4
Authors
Anandan, Balamurugan [1, 2]
Clifton, Chris [1, 2]
Affiliations
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[2] Purdue Univ, CERIAS, W Lafayette, IN 47907 USA
Source
IWSPA '18: PROCEEDINGS OF THE FOURTH ACM INTERNATIONAL WORKSHOP ON SECURITY AND PRIVACY ANALYTICS | 2018
Keywords
Differential privacy; sensitivity; data mining; classification; decision trees; naive bayes; feature selection; privacy preserving data mining;
DOI
10.1145/3180445.3180452
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
One approach to analysis of private data is epsilon-differential privacy, a randomization-based approach that protects individual data items by injecting carefully limited noise into results. A challenge in applying this to private data analysis is that the noise added to the feature parameters is directly proportional to the number of parameters learned. While careful feature selection would alleviate this problem, the process of feature selection itself can reveal private information, requiring the application of differential privacy to the feature selection process. In this paper, we analyze the sensitivity of various feature selection techniques used in data mining and show that some of them are not suitable for differentially private analysis due to high sensitivity. We give experimental results showing the value of using low sensitivity feature selection techniques. We also show that the same concepts can be used to improve differentially private decision trees.
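The abstract's point that noise grows with the number of parameters learned can be illustrated with a minimal sketch. It assumes the standard Laplace mechanism with the privacy budget epsilon split evenly across parameters (sequential composition); this is not necessarily the construction used in the paper. The function names `laplace_scale`, `sample_laplace`, and `noisy_parameters` are hypothetical, chosen for illustration only.

```python
import random


def laplace_scale(sensitivity, epsilon, num_params):
    """Per-parameter Laplace scale when epsilon is split evenly.

    Assumption: each of the num_params queries gets a budget of
    epsilon / num_params, so the scale is sensitivity * num_params / epsilon.
    """
    if epsilon <= 0 or num_params <= 0:
        raise ValueError("epsilon and num_params must be positive")
    return sensitivity * num_params / epsilon


def sample_laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_parameters(values, epsilon, sensitivity=1.0, rng=None):
    """Add Laplace noise to every parameter under a shared total budget."""
    rng = rng or random.Random()
    scale = laplace_scale(sensitivity, epsilon, len(values))
    return [v + sample_laplace(scale, rng) for v in values]


if __name__ == "__main__":
    # Hypothetical counts; dropping features shrinks the noise scale.
    print(laplace_scale(1.0, 1.0, 50))  # all 50 features kept
    print(laplace_scale(1.0, 1.0, 5))   # 5 features after selection
    print(noisy_parameters([120.0, 80.0, 45.0], epsilon=1.0))
```

Under these assumptions, reducing the number of parameters from 50 to 5 reduces the per-parameter noise scale tenfold, which is the motivation for feature selection that the abstract describes. Floating-point Laplace sampling like this is for illustration and is not suitable for a production privacy deployment.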
Pages: 43 - 53
Number of pages: 11
Related papers
50 in total
  • [1] Differentially Private Feature Selection
    Yang, Jun
    Li, Yun
    PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 4182 - 4189
  • [2] Using Feature Selection to Improve the Utility of Differentially Private Data Publishing
    Jafer, Yasser
    Matwin, Stan
    Sokolova, Marina
    5TH INTERNATIONAL CONFERENCE ON EMERGING UBIQUITOUS SYSTEMS AND PERVASIVE NETWORKS / THE 4TH INTERNATIONAL CONFERENCE ON CURRENT AND FUTURE TRENDS OF INFORMATION AND COMMUNICATION TECHNOLOGIES IN HEALTHCARE / AFFILIATED WORKSHOPS, 2014, 37 : 511 - 516
  • [3] Differentially Private Feature Selection Based on Dynamic Relevance for Correlated Data
    Wang, Chunxia
    Zhang, Qiuyu
    Yan, Yan
    Journal of Computers (Taiwan), 2023, 34 (01) : 157 - 173
  • [4] Differentially private approximate aggregation based on feature selection
    He, Zaobo
    Sai, Akshita Maradapu Vera Venkata
    Huang, Yan
    Seo, Daehee
    Zhang, Hanzhou
    Han, Qilong
    JOURNAL OF COMBINATORIAL OPTIMIZATION, 2021, 41 (02) : 318 - 327
  • [5] Differentially private feature selection under MapReduce framework
    CHEN Kai
    WAN Wen-qiang
    LI Yun
    The Journal of China Universities of Posts and Telecommunications, 2013, (05) : 85 - 90
  • [6] Differentially private approximate aggregation based on feature selection
    Zaobo He
    Akshita Maradapu Vera Venkata Sai
    Yan Huang
    Daehee Seo
    Hanzhou Zhang
    Qilong Han
    Journal of Combinatorial Optimization, 2021, 41 : 318 - 327
  • [7] Differentially private feature selection under MapReduce framework
    CHEN Kai
    WAN Wenqiang
    LI Yun
    The Journal of China Universities of Posts and Telecommunications, 2013, 20 (05) : 85 - 90+103
  • [8] TraVaS: Differentially Private Trace Variant Selection for Process Mining
    Rafiei, Majid
    Wangelik, Frederik
    van der Aalst, Wil M. P.
    PROCESS MINING WORKSHOPS, ICPM 2022, 2023, 468 : 114 - 126
  • [9] Adaptive Differentially Private Data Release for Data Sharing and Data Mining
    Xiong, Li
    2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW), 2013, : 891 - 891
  • [10] Differentially private feature selection under Map Reduce framework
    Chen, Kai
    Wan, Wen-Qiang
    Li, Yun
    Journal of China Universities of Posts and Telecommunications, 2013, 20 (05) : 85 - 90