Efficient Cryptanalysis of Bloom Filters for Privacy-Preserving Record Linkage

被引:38
作者
Christen, Peter [1 ]
Ranbaduge, Thilina [1 ]
Vatsalan, Dinusha [1 ]
Schnell, Rainer [2 ]
机构
[1] Australian Natl Univ, Res Sch Comp Sci, Canberra, ACT, Australia
[2] Univ Duisburg Essen, Methodol Res Grp, Duisburg, Germany
来源
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2017, PT I | 2017年 / 10234卷
基金
澳大利亚研究理事会; 英国工程与自然科学研究理事会;
关键词
Privacy; Re-identification; Frequency analysis; Data linkage; Entity resolution; Data matching;
D O I
10.1007/978-3-319-57454-7_49
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Privacy-preserving record linkage (PPRL) is the process of identifying records that represent the same entity across databases held by different organizations without revealing any sensitive information about these entities. A popular technique used in PPRL is Bloom filter encoding, which has shown to be an efficient and effective way to encode sensitive information into bit vectors while still enabling approximate matching of attribute values. However, the encoded values in Bloom filters are vulnerable to cryptanalysis attacks. Under specific conditions, these attacks are successful in that some frequent sensitive attribute values can be re-identified. In this paper we propose and evaluate on real databases a novel efficient attack on Bloom filters. Our approach is based on the construction principle of Bloom filters of hashing elements of sets into bit positions. The attack is independent of the encoding function and its parameters used, it can correctly re-identify sensitive attribute values even when various recently proposed hardening techniques have been applied, and it runs in a few seconds instead of hours.
引用
收藏
页码:628 / 640
页数:13
相关论文
共 18 条
[1]   SPACE/TIME TRADE/OFFS IN HASH CODING WITH ALLOWABLE ERRORS [J].
BLOOM, BH .
COMMUNICATIONS OF THE ACM, 1970, 13 (07) :422-&
[2]  
Boyd J.H., 2015, Medical Data Privacy Handbook, P267
[3]   Association mining [J].
Ceglar, Aaron ;
Roddick, John F. .
ACM COMPUTING SURVEYS, 2006, 38 (02)
[4]  
Christen P., 2012, DATA MATCHING CONCEP, P163, DOI DOI 10.5555/2344108
[5]   Composite Bloom Filters for Secure Record Linkage [J].
Durham, Elizabeth A. ;
Kantarcioglu, Murat ;
Xue, Yuan ;
Toth, Csaba ;
Kuzu, Mehmet ;
Malin, Bradley .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (12) :2956-2968
[6]   Who Is 1011011111...1110110010? Automated Cryptanalysis of Bloom Filter Encryptions of Databases with Several Personal Identifiers [J].
Kroll, Martin ;
Steinmetzer, Simone .
BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES, BIOSTEC 2015, 2015, 574 :341-356
[7]   A practical approach to achieve private medical record linkage in light of public resources [J].
Kuzu, Mehmet ;
Kantarcioglu, Murat ;
Durham, Elizabeth Ashley ;
Toth, Csaba ;
Malin, Bradley .
JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2013, 20 (02) :285-292
[8]  
Kuzu M, 2011, LECT NOTES COMPUT SC, V6794, P226, DOI 10.1007/978-3-642-22263-4_13
[9]  
Niedermeyer F., 2014, Journal of Privacy and Confidentiality, V6, P3, DOI DOI 10.29012/JPC.V6I2.640
[10]   Hashing-Based Distributed Multi-party Blocking for Privacy-Preserving Record Linkage [J].
Ranbaduge, Thilina ;
Vatsalan, Dinusha ;
Christen, Peter ;
Verykios, Vassilios .
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2016, PT II, 2016, 9652 :415-427