Privately Estimating a Gaussian: Efficient, Robust, and Optimal

Times Cited: 7
Authors
Alabi, Daniel [1 ]
Kothari, Pravesh K. [2 ]
Tankala, Pranay [3 ]
Venkat, Prayaag [3 ]
Zhang, Fred [4 ]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
[2] CMU, Pittsburgh, PA USA
[3] Harvard Univ, Cambridge, MA USA
[4] Univ Calif Berkeley, Berkeley, CA USA
Source
PROCEEDINGS OF THE 55TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING, STOC 2023 | 2023
Keywords
Differential Privacy; Robust Statistics; High-Dimensional Statistics; Private Statistics; PRIVACY; DIVERGENCE; COMPLEXITY; NOISE
DOI
10.1145/3564246.3585194
Chinese Library Classification
TP301 [Theory, Methods]
Discipline Code
081202
Abstract
In this work, we give efficient algorithms for privately estimating a Gaussian distribution in both the pure and approximate differential privacy (DP) models with optimal dependence on the dimension in the sample complexity. In the pure DP setting, we give an efficient algorithm that estimates an unknown d-dimensional Gaussian distribution up to an arbitrarily tiny total variation error using Õ(d² log κ) samples while tolerating a constant fraction of adversarial outliers. Here, κ is the condition number of the target covariance matrix. The sample bound matches the best non-private estimators in the dependence on the dimension (up to a polylogarithmic factor). We prove a new lower bound on differentially private covariance estimation to show that the dependence on the condition number κ in the above sample bound is also tight. Prior to our work, only identifiability results (yielding inefficient super-polynomial-time algorithms) were known for the problem. In the approximate DP setting, we give an efficient algorithm to estimate an unknown Gaussian distribution up to an arbitrarily tiny total variation error using Õ(d²) samples while tolerating a constant fraction of adversarial outliers. Prior to our work, all efficient approximate DP algorithms incurred a super-quadratic sample cost or were not outlier-robust. For the special case of mean estimation, our algorithm achieves the optimal sample complexity of Õ(d), improving on an Õ(d^1.5) bound from prior work. Our pure DP algorithm relies on a recursive private preconditioning subroutine that utilizes recent work of Hopkins et al. (STOC 2022) on private mean estimation. Our approximate DP algorithms are based on a substantial upgrade of the method of stabilizing convex relaxations introduced by Kothari et al. (COLT 2022).
In particular, we improve on their mechanism by using a new unnormalized entropy regularization and a new and surprisingly simple mechanism for privately releasing covariances.
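For intuition about the problem the abstract describes, the textbook (non-robust, non-optimal) baseline for private Gaussian mean estimation is the Gaussian mechanism applied to a clipped empirical mean. The sketch below is generic and illustrative only: it is not the paper's outlier-robust algorithm, and the function name and parameter choices are invented for this example.

```python
import numpy as np

def private_clipped_mean(samples, clip_radius, epsilon, delta, rng):
    """Release an (epsilon, delta)-DP estimate of the mean of `samples`.

    Each row is clipped to L2 norm at most `clip_radius`, so replacing one
    row changes the empirical mean by at most 2*clip_radius/n in L2 norm;
    Gaussian noise calibrated to that sensitivity yields (epsilon, delta)-DP.
    """
    n, d = samples.shape
    norms = np.linalg.norm(samples, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_radius / np.maximum(norms, 1e-12))
    clipped = samples * scale
    sensitivity = 2.0 * clip_radius / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped.mean(axis=0) + rng.normal(0.0, sigma, size=d)

rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=1.0, size=(100_000, 5))  # N(1, I) in 5 dims
est = private_clipped_mean(data, clip_radius=10.0, epsilon=1.0, delta=1e-6, rng=rng)
```

Note the contrast with the paper's results: this baseline needs the clipping radius chosen a priori, breaks down under adversarial outliers, and does not achieve the optimal Õ(d) sample complexity that the abstract claims for the authors' approximate DP mean estimator.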
Pages: 483-496
Page Count: 14
Cited References
54 in total
[1] Aden-Ali I., 2021, ALGORITHMIC LEARNING
[2] Ali S. M., 1966, J ROY STAT SOC B, V28, P131
[3] Ashtiani Hassan, 2022, C LEARNING THEORY C
[4] Beimel A., 2010, LECT NOTES COMPUT SC, V5978, P437, DOI 10.1007/978-3-642-11799-2_26
[5] Biswas Sourav, 2020, ADV NEUR IN, V33
[6] Boneh D., Shaw J. Collusion-secure fingerprinting for digital data. IEEE TRANSACTIONS ON INFORMATION THEORY, 1998, V44(5), P1897-1905
[7] Brown Gavin, 2021, ADV NEURAL INFORM PR
[8] Bun Mark, Kamath Gautam, Steinke Thomas, Wu Zhiwei Steven. Private Hypothesis Selection. IEEE TRANSACTIONS ON INFORMATION THEORY, 2021, V67(3), P1981-2000
[9] Bun Mark, Steinke Thomas. Concentrated Differential Privacy: Simplifications, Extensions, and Lower Bounds. THEORY OF CRYPTOGRAPHY, TCC 2016-B, PT I, 2016, V9985, P635-658
[10] Bun Mark, Ullman Jonathan, Vadhan Salil. Fingerprinting Codes and the Price of Approximate Differential Privacy. STOC'14: PROCEEDINGS OF THE 46TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING, 2014, P1-10