Have the cake and eat it too: Differential Privacy enables privacy and precise analytics

Cited by: 1
Authors
Subramanian, Rishabh [1]
Affiliations
[1] Univ Chicago, Chicago, IL 60637 USA
Keywords
Data analytics; Data mining; Data privacy; Differential privacy; DiD; Difference-in-difference; OLS; Ordinary least squares; Prediction; Regression; SECURITY; SAFE
DOI
10.1186/s40537-023-00712-9
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Existing research in differential privacy, whose applications have exploded across functional areas in the last few years, describes an intrinsic trade-off between the privacy of a dataset and its utility for analytics. Resolving this trade-off critically impacts potential applications of differential privacy to protect privacy in datasets even while enabling analytics using them. In contrast to the existing literature, this paper shows how differential privacy can be employed to precisely, not approximately, retrieve the analytics on the original dataset. We examine, conceptually and empirically, the impact of noise addition on the quality of data analytics. We show that the accuracy of analytics following noise addition increases with the privacy budget and the variance of the independent variable. Also, the accuracy of analytics following noise addition increases disproportionately with an increase in the privacy budget when the variance of the independent variable is greater. Using actual data to which we add Laplace noise, we provide evidence supporting these two predictions. We then demonstrate our central thesis that, once the privacy budget employed for differential privacy is declared and certain conditions for noise addition are satisfied, the slope parameters in the original dataset can be accurately retrieved using the estimates in the modified dataset of the variance of the independent variable and the slope parameter. Thus, differential privacy can enable robust privacy as well as precise data analytics.
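The retrieval idea in the abstract can be illustrated with the classical errors-in-variables correction for OLS. The sketch below is not taken from the paper and may differ from its exact procedure. It assumes that zero-mean Laplace noise with scale `sensitivity / epsilon` is added only to the independent variable, that the noise is independent of the data, and that `epsilon` (the privacy budget) and `sensitivity` are publicly declared. Under those assumptions the noise variance is `2 * (sensitivity / epsilon) ** 2`, and the slope estimated on the noisy data shrinks toward zero by the factor `Var(x) / (Var(x) + noise variance)`. The function names, the simulated data, and all parameter values are hypothetical.

```python
import numpy as np


def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    x_centered = x - x.mean()
    return float(np.dot(x_centered, y - y.mean()) / np.dot(x_centered, x_centered))


def corrected_slope(x_noisy, y, epsilon, sensitivity=1.0):
    """Undo the attenuation caused by Laplace noise added to x.

    Assumption (not from the source): noise ~ Laplace(0, sensitivity / epsilon),
    independent of x and of the regression error.
    """
    scale = sensitivity / epsilon
    noise_var = 2.0 * scale ** 2              # variance of a Laplace(0, scale) draw
    var_noisy = float(np.var(x_noisy, ddof=1))
    if var_noisy <= noise_var:
        raise ValueError("noise variance exceeds observed variance; cannot correct")
    # Var(x) is estimated as Var(x_noisy) - noise_var, so the attenuation
    # factor is (var_noisy - noise_var) / var_noisy; divide it back out.
    return ols_slope(x_noisy, y) * var_noisy / (var_noisy - noise_var)


# Simulated illustration (hypothetical data, not the paper's dataset).
rng = np.random.default_rng(0)
n = 200_000
true_slope = 2.0
epsilon = 1.0

x = rng.normal(loc=0.0, scale=3.0, size=n)                 # Var(x) = 9
y = 1.0 + true_slope * x + rng.normal(0.0, 1.0, size=n)
x_noisy = x + rng.laplace(loc=0.0, scale=1.0 / epsilon, size=n)

naive = ols_slope(x_noisy, y)                  # attenuated toward zero
fixed = corrected_slope(x_noisy, y, epsilon)   # close to true_slope

print(f"naive slope on noisy data: {naive:.3f}")
print(f"corrected slope:           {fixed:.3f}")
```

In this simulation the naive slope is biased toward zero by roughly the factor 9 / (9 + 2), while the corrected slope lands near the true value. The correction is consistent rather than exact in finite samples, and the shrinkage factor shows why a larger privacy budget or a larger variance of the independent variable leaves the noisy estimate closer to the original.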
Pages: 14