Private measures, random walks, and synthetic data

被引:1
|
作者
Boedihardjo, March [1 ]
Strohmer, Thomas [2 ]
Vershynin, Roman [3 ]
机构
[1] Michigan State Univ, Dept Math, E Lansing, MI 48864 USA
[2] Univ Calif Davis, Dept Math, Davis, CA USA
[3] Univ Calif Irvine, Dept Math, Irvine, CA 92697 USA
关键词
Random walks; Differential privacy; Synthetic data;
D O I
10.1007/s00440-024-01279-z
中图分类号
O21 [概率论与数理统计]; C8 [统计学];
学科分类号
020208 ; 070103 ; 0714 ;
摘要
Differential privacy is a mathematical concept that provides an information-theoretic security guarantee. While differential privacy has emerged as a de facto standard for guaranteeing privacy in data sharing, the known mechanisms to achieve it come with some serious limitations. Utility guarantees are usually provided only for a fixed, a priori specified set of queries. Moreover, there are no utility guarantees for more complex-but very common-machine learning tasks such as clustering or classification. In this paper we overcome some of these limitations. Working with metric privacy, a powerful generalization of differential privacy, we develop a polynomial-time algorithm that creates a private measure from a data set. This private measure allows us to efficiently construct private synthetic data that are accurate for a wide range of statistical analysis tools. Moreover, we prove an asymptotically sharp min-max result for private measures and synthetic data in general compact metric spaces, for any fixed privacy budget epsilon \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\varepsilon $$\end{document} bounded away from zero. A key ingredient in our construction is a new superregular random walk, whose joint distribution of steps is as regular as that of independent random variables, yet which deviates from the origin logarithmically slowly.
引用
收藏
页码:569 / 611
页数:43
相关论文
共 50 条
  • [21] Distributed Random Walks
    Das Sarma, Atish
    Nanongkai, Danupon
    Pandurangan, Gopal
    Tetali, Prasad
    JOURNAL OF THE ACM, 2013, 60 (01)
  • [22] Random walks on random simple graphs
    Hildebrand, M
    RANDOM STRUCTURES & ALGORITHMS, 1996, 8 (04) : 301 - 318
  • [23] Bootstrap random walks
    Collevecchio, Andrea
    Hamza, Kais
    Shi, Meng
    STOCHASTIC PROCESSES AND THEIR APPLICATIONS, 2016, 126 (06) : 1744 - 1760
  • [24] Disordered Random Walks
    Mauricio P. Pato
    Brazilian Journal of Physics, 2021, 51 : 238 - 243
  • [25] Collisions of random walks
    Barlow, Martin T.
    Peres, Yuval
    Sousi, Perla
    ANNALES DE L INSTITUT HENRI POINCARE-PROBABILITES ET STATISTIQUES, 2012, 48 (04): : 922 - 946
  • [26] SEMI-AUTOMATIC SYNTHETIC DEPTH MAP GENERATION FOR VIDEO USING RANDOM WALKS
    Rzeszutek, Richard
    Phan, Raymond
    Androutsos, Dimitrios
    2011 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2011,
  • [27] CONFIDENCE ESTIMATION IN IVUS RADIO-FREQUENCY DATA WITH RANDOM WALKS
    Karamalis, Athanasios
    Katouzian, Amin
    Carlier, Stephane
    Navab, Nassir
    2012 9TH IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI), 2012, : 1068 - 1071
  • [28] Coverage-Adaptive Random Walks for Fast Sensory Data Collection
    Angelopoulos, Constantinos-Marios
    Nikoletseas, Sotiris
    Patroumpa, Dirnitra
    Rolim, Jose
    AD-HOC, MOBILE AND WIRELESS NETWORKS, 2010, 6288 : 81 - +
  • [29] General and specific utility measures for synthetic data
    Snoke, Joshua
    Raab, Gillian M.
    Nowok, Beata
    Dibben, Chris
    Slavkovic, Aleksandra
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-STATISTICS IN SOCIETY, 2018, 181 (03) : 663 - 688
  • [30] Collisions of random walks in reversible random graphs
    Hutchcroft, Tom
    Peres, Yuval
    ELECTRONIC COMMUNICATIONS IN PROBABILITY, 2015, 20 : 2 - 6