The Tao of Inference in Privacy-Protected Databases

Cited by: 43
Authors
Bindschaedler, Vincent [1]
Grubbs, Paul [2]
Cash, David [3]
Ristenpart, Thomas [2]
Shmatikov, Vitaly [2]
Affiliations
[1] UIUC, Urbana, IL 61801 USA
[2] Cornell Tech, New York, NY USA
[3] Univ Chicago, Chicago, IL 60637 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2018 / Vol. 11 / No. 11
Funding
US National Science Foundation;
DOI
10.14778/3236187.3236217
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
To protect database confidentiality even in the face of full compromise while supporting standard functionality, recent academic proposals and commercial products rely on a mix of encryption schemes. The recommendation is to apply strong, semantically secure encryption to the "sensitive" columns and protect other columns with property-revealing encryption (PRE) that supports operations such as sorting. We design, implement, and evaluate a new methodology for inferring data stored in such encrypted databases. The cornerstone is the multinomial attack, a new inference technique that is analytically optimal and empirically outperforms prior heuristic attacks against PRE-encrypted data. We also extend the multinomial attack to take advantage of correlations across multiple columns. This recovers PRE-encrypted data with sufficient accuracy to then apply machine learning and record linkage methods to infer columns protected by semantically secure encryption or redaction. We evaluate our methodology on medical, census, and union-membership datasets, showing for the first time how to infer full database records. For PRE-encrypted attributes such as demographics and ZIP codes, our attack outperforms the best prior heuristic by a factor of 16. Unlike any prior technique, we also infer attributes, such as incomes and medical diagnoses, protected by strong encryption. For example, when we infer that a patient in a hospital-discharge dataset has a mental health or substance abuse condition, this prediction is 97% accurate.
Pages: 1715-1728
Page count: 14