Differentially Private Feature Selection for Data Mining

Cited by: 4
Authors
Anandan, Balamurugan [1, 2]
Clifton, Chris [1, 2]
Affiliations
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[2] Purdue Univ, CERIAS, W Lafayette, IN 47907 USA
Source
IWSPA '18: PROCEEDINGS OF THE FOURTH ACM INTERNATIONAL WORKSHOP ON SECURITY AND PRIVACY ANALYTICS | 2018
Keywords
Differential privacy; sensitivity; data mining; classification; decision trees; naive Bayes; feature selection; privacy-preserving data mining
DOI
10.1145/3180445.3180452
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
One approach to analysis of private data is epsilon-differential privacy, a randomization-based approach that protects individual data items by injecting carefully limited noise into results. A challenge in applying this to private data analysis is that the noise added to the feature parameters is directly proportional to the number of parameters learned. While careful feature selection would alleviate this problem, the process of feature selection itself can reveal private information, requiring the application of differential privacy to the feature selection process. In this paper, we analyze the sensitivity of various feature selection techniques used in data mining and show that some of them are not suitable for differentially private analysis due to high sensitivity. We give experimental results showing the value of using low sensitivity feature selection techniques. We also show that the same concepts can be used to improve differentially private decision trees.
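The abstract's point that noise grows with the number of parameters learned can be illustrated with a minimal sketch. It assumes the standard Laplace mechanism with a total budget epsilon split evenly across the parameters (sequential composition); this is a textbook construction, not necessarily the method used in the paper. The function names `per_query_scale`, `laplace_noise`, and `privatize` are hypothetical and introduced only for illustration.

```python
import random


def per_query_scale(num_params, sensitivity, epsilon):
    """Laplace scale per parameter when a total budget epsilon is split
    evenly across num_params queries (assumed: sequential composition)."""
    if num_params <= 0 or epsilon <= 0:
        raise ValueError("num_params and epsilon must be positive")
    # Each query gets epsilon / num_params, so the scale is
    # sensitivity / (epsilon / num_params) = sensitivity * num_params / epsilon.
    return sensitivity * num_params / epsilon


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two
    independent exponential variables with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(values, sensitivity, epsilon, rng):
    """Add Laplace noise to every value, sharing one privacy budget."""
    scale = per_query_scale(len(values), sensitivity, epsilon)
    return [v + laplace_noise(scale, rng) for v in values]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    counts = [120.0, 45.0, 300.0, 12.0]  # made-up counts, sensitivity assumed 1
    print("scale with 4 parameters:", per_query_scale(4, 1.0, 1.0))
    print("scale with 40 parameters:", per_query_scale(40, 1.0, 1.0))
    print("noisy counts:", privatize(counts, 1.0, 1.0, rng))
```

Under these assumptions, selecting 4 features instead of 40 cuts the per-parameter noise scale by a factor of ten, which is the motivation for doing the selection step itself under differential privacy.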
Pages: 43-53
Number of pages: 11