The Compromise of Data Privacy in Predictive Performance

Cited by: 5
Authors
Carvalho, Tania [1]
Moniz, Nuno [1, 2]
Affiliations
[1] Univ Porto, Fac Sci, Comp Sci Dept, Porto, Portugal
[2] INESC TEC, Porto, Portugal
Source
ADVANCES IN INTELLIGENT DATA ANALYSIS XIX, IDA 2021 | 2021 / Vol. 12695
Keywords
Data privacy; Supervised learning; Re-identification risk; Record linkage
DOI
10.1007/978-3-030-74251-5_34
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Privacy preservation has become an essential concern in many data mining applications since the emergence of legal obligations to protect personal data. Thus, the notion of Privacy-Preserving Data Mining emerged to allow the extraction of knowledge from data without violating the privacy of individuals. Several transformation techniques have been proposed to protect the privacy of individuals. However, their application does not guarantee a null risk of an individual being re-identified. Furthermore, and most importantly for this paper, the application of such techniques may have a considerable impact on the utility of data and their use in predictive and descriptive tasks. In this paper, we present a study to provide key insights concerning the impact of privacy-preserving techniques on predictive performance. Unlike previous work, our main conclusions point towards a noticeable impact of privacy-preservation techniques on predictive performance.
Pages: 426-438
Page count: 13