Efficient Cryptanalysis of Bloom Filters for Privacy-Preserving Record Linkage

被引:38
作者
Christen, Peter [1 ]
Ranbaduge, Thilina [1 ]
Vatsalan, Dinusha [1 ]
Schnell, Rainer [2 ]
机构
[1] Australian Natl Univ, Res Sch Comp Sci, Canberra, ACT, Australia
[2] Univ Duisburg Essen, Methodol Res Grp, Duisburg, Germany
来源
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2017, PT I | 2017年 / 10234卷
基金
澳大利亚研究理事会; 英国工程与自然科学研究理事会;
关键词
Privacy; Re-identification; Frequency analysis; Data linkage; Entity resolution; Data matching;
D O I
10.1007/978-3-319-57454-7_49
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Privacy-preserving record linkage (PPRL) is the process of identifying records that represent the same entity across databases held by different organizations without revealing any sensitive information about these entities. A popular technique used in PPRL is Bloom filter encoding, which has shown to be an efficient and effective way to encode sensitive information into bit vectors while still enabling approximate matching of attribute values. However, the encoded values in Bloom filters are vulnerable to cryptanalysis attacks. Under specific conditions, these attacks are successful in that some frequent sensitive attribute values can be re-identified. In this paper we propose and evaluate on real databases a novel efficient attack on Bloom filters. Our approach is based on the construction principle of Bloom filters of hashing elements of sets into bit positions. The attack is independent of the encoding function and its parameters used, it can correctly re-identify sensitive attribute values even when various recently proposed hardening techniques have been applied, and it runs in a few seconds instead of hours.
引用
收藏
页码:628 / 640
页数:13
相关论文
共 18 条
[11]   Privacy-preserving record linkage on large real world datasets [J].
Randall, Sean M. ;
Ferrante, Anna M. ;
Boyd, James H. ;
Bauer, Jacqueline K. ;
Semmens, James B. .
JOURNAL OF BIOMEDICAL INFORMATICS, 2014, 50 :205-212
[12]  
Schnell R., 2011, NOVEL ERROR TOLERANT
[13]  
Schnell R, 2016, INT CONF DAT MIN WOR, P218, DOI [10.1109/ICDMW.2016.29, 10.1109/ICDMW.2016.0038]
[14]   Privacy-preserving record linkage using Bloom filters [J].
Schnell, Rainer ;
Bachteler, Tobias ;
Reiher, Joerg .
BMC MEDICAL INFORMATICS AND DECISION MAKING, 2009, 9
[15]  
Vatsalan D., 2014, C INFORM KNOWLEDG, P1795, DOI 10.1145/2661829.2661875
[16]   Privacy-preserving matching of similar patients [J].
Vatsalan, Dinusha ;
Christen, Peter .
JOURNAL OF BIOMEDICAL INFORMATICS, 2016, 59 :285-298
[17]   A taxonomy of privacy-preserving record linkage techniques [J].
Vatsalan, Dinusha ;
Christen, Peter ;
Verykios, Vassilios S. .
INFORMATION SYSTEMS, 2013, 38 (06) :946-969
[18]  
Zhichun Fu, 2014, Advances in Knowledge Discovery and Data Mining. 18th Pacific-Asia Conference (PAKDD 2014). Proceedings: LNCS 8443, P485, DOI 10.1007/978-3-319-06608-0_40