Model-Agnostic Private Learning

Cited by: 0
Authors
Bassily, Raef [1 ]
Thakkar, Om [2 ]
Thakurta, Abhradeep [3 ]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Boston Univ, Dept Comp Sci, Boston, MA 02215 USA
[3] Univ Calif Santa Cruz, Dept Comp Sci, Santa Cruz, CA 95064 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018) | 2018 / Vol. 31
Keywords
DOI
N/A
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We design differentially private learning algorithms that are agnostic to the learning model, assuming access to a limited amount of unlabeled public data. First, we provide a new differentially private algorithm for answering a sequence of m online classification queries (given by a sequence of m unlabeled public feature vectors) based on a private training set. Our algorithm follows the paradigm of subsample-and-aggregate, in which any generic non-private learner is trained on disjoint subsets of the private training set, and then, for each classification query, the votes of the resulting ensemble of classifiers are aggregated in a differentially private fashion. Our private aggregation is based on a novel combination of the distance-to-instability framework [26] and the sparse-vector technique [15, 18]. We show that our algorithm makes a conservative use of the privacy budget. In particular, if the underlying non-private learner yields a classification error of at most alpha ∈ (0, 1), then our construction answers more queries, by at least a factor of 1/alpha in some cases, than what is implied by a straightforward application of the advanced composition theorem for differential privacy. Next, we apply the knowledge transfer technique to construct a private learner that outputs a classifier, which can be used to answer an unlimited number of queries. In the PAC model, we analyze our construction and prove upper bounds on the sample complexity for both the realizable and the non-realizable cases. Similar to non-private sample complexity, our bounds are completely characterized by the VC dimension of the concept class.
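The private aggregation step described above can be sketched roughly as follows. This is a simplified illustration, not the paper's exact distance-to-instability mechanism: the function name `private_aggregate`, the Laplace noise scales, and the abstain-on-unstable behavior are all assumptions made for the sketch. The idea it demonstrates is that each classifier is trained on a disjoint subset of the private data, and a query's label is released only when the noisy gap between the top two vote counts clears a noisy threshold, in the style of the sparse-vector technique.

```python
import numpy as np

def private_aggregate(subset_classifiers, queries, eps, threshold, seed=0):
    """Sketch: subsample-and-aggregate voting with a sparse-vector-style
    stability test (illustrative noise scales, not the paper's calibration)."""
    rng = np.random.default_rng(seed)
    # Noisy threshold, perturbed once up front as in the sparse-vector technique.
    noisy_T = threshold + rng.laplace(scale=2.0 / eps)
    answers = []
    for x in queries:
        # Tally the ensemble's votes for this query.
        votes = np.bincount([clf(x) for clf in subset_classifiers], minlength=2)
        top2 = np.sort(votes)[-2:]
        gap = top2[1] - top2[0]  # margin between winner and runner-up
        if gap + rng.laplace(scale=4.0 / eps) >= noisy_T:
            # Stable vote: release the majority label.
            answers.append(int(np.argmax(votes)))
        else:
            # Unstable vote: abstain rather than release a noisy answer.
            answers.append(None)
    return answers
```

A large margin (most subset classifiers agree) lets the query pass the stability test, which is why the privacy budget is consumed conservatively when the underlying non-private learner is accurate.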
Pages: 11
Related Papers (27 in total)
[1] Abadi, Martin; Chu, Andy; Goodfellow, Ian; McMahan, H. Brendan; Mironov, Ilya; Talwar, Kunal; Zhang, Li. Deep Learning with Differential Privacy. CCS'16: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016, pp. 308-318.
[2] [Anonymous], 2013, Conference on Learning Theory.
[3] [Anonymous], 2017, 5th International Conference on Learning Representations (ICLR).
[4] [Anonymous], 2010, FOCS.
[5] [Anonymous], 2011, JMLR.
[6] [Anonymous], 2018, arXiv:1803.05101.
[7] Bassily, Raef; Smith, Adam; Thakurta, Abhradeep. Private Empirical Risk Minimization: Efficient Algorithms and Tight Error Bounds. 2014 55th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2014), 2014, pp. 464-473.
[8] Beimel, Amos; Nissim, Kobbi; Stemmer, Uri. Private Learning and Sanitization: Pure vs. Approximate Differential Privacy. Theory of Computing, 2016, Vol. 12.
[9] Beimel, A., 2010, Lecture Notes in Computer Science, Vol. 5978, p. 437, DOI 10.1007/978-3-642-11799-2_26.
[10] Beimel, Amos, 2013, ITCS.