Algorithmically Effective Differentially Private Synthetic Data

被引:0
作者
He, Yiyun [1 ]
Vershynin, Roman [1 ]
Zhu, Yizhe [1 ]
机构
[1] Univ Calif Irvine, Irvine, CA 92717 USA
来源
THIRTY SIXTH ANNUAL CONFERENCE ON LEARNING THEORY, VOL 195 | 2023年 / 195卷
关键词
differential privacy; synthetic data; Wasserstein metric;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We present a highly effective algorithmic approach for generating epsilon-differentially private synthetic data in a bounded metric space with near-optimal utility guarantees under the 1-Wasserstein distance. In particular, for a dataset X in the hypercube [0, 1](d), our algorithm generates synthetic dataset Y such that the expected 1-Wasserstein distance between the empirical measure of X and Y is O((epsilon n)(-1/d)) for d >= 2, and is O(log(2) (epsilon n)(epsilon n)(-1)) for d = 1. The accuracy guarantee is optimal up to a constant factor for d >= 2, and up to a logarithmic factor for d = 1. Our algorithm has a fast running time of O(epsilon dn) for all d >= 1 and demonstrates improved accuracy compared to the method in (Boedihardjo et al., 2022c) for d >= 2.
引用
收藏
页数:28
相关论文
共 50 条
  • [1] Deep Learning with Differential Privacy
    Abadi, Martin
    Chu, Andy
    Goodfellow, Ian
    McMahan, H. Brendan
    Mironov, Ilya
    Talwar, Kunal
    Zhang, Li
    [J]. CCS'16: PROCEEDINGS OF THE 2016 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2016, : 308 - 318
  • [2] Abowd J., 2022, Harvard Data Science Review
  • [3] Abowd J. M., 2019, Census TopDown: Differentially private data, incremental schemas, and consistency with public knowledge
  • [4] Barak B., 2007, P 26 ACM SIGMOD SIGA, P273, DOI 10.1145/1265530.1265569
  • [5] Bartlett P. L., 2003, Journal of Machine Learning Research, V3, P463, DOI 10.1162/153244303321897690
  • [6] Bellovin S.M., 2018, Stan. Tech. L. Rev., DOI 10.2139/ssrn.3255766
  • [7] Blum A., 2013, J. ACM, V60, P1
  • [8] A SIMPLE FOURIER ANALYTIC PROOF OF THE AKT OPTIMAL MATCHING THEOREM
    Bobkov, Sergey G.
    Ledoux, Michel
    [J]. ANNALS OF APPLIED PROBABILITY, 2021, 31 (06) : 2567 - 2584
  • [9] Boedihardjo M., 2022, ARXIV
  • [10] Privacy of Synthetic Data: A Statistical Framework
    Boedihardjo, March
    Strohmer, Thomas
    Vershynin, Roman
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 2023, 69 (01) : 520 - 527