An Improved Chinese String Comparator for Bloom Filter Based Privacy-Preserving Record Linkage

Authors
Sun, Siqi [1]
Qian, Yining [1]
Zhang, Ruoshi [1]
Wang, Yanqi [1]
Li, Xinran [1]
Affiliations
[1] Huazhong Agr Univ, Coll Sci, Dept Math & Stat, Wuhan 430070, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
privacy-preserving record linkage; Chinese characters; SoundShape code; Bloom filter; proportions of SoundShape code
DOI
10.3390/e23081091
Chinese Library Classification
O4 [Physics]
Discipline classification code
0702
Abstract
With the development of information technology, sharing data from multiple sources without disclosing private information has become a popular research topic. Privacy-preserving record linkage (PPRL) links records that truly match without disclosing personal information. Existing PPRL techniques have mostly been studied for alphabetic languages, which differ considerably from the Chinese language environment. In this paper, Chinese characters (the identification fields in record pairs) are encoded, according to their shapes and pronunciations, into strings of letters and numbers using the SoundShape code. The SoundShape codes are then encrypted with a Bloom filter, and the similarity of the encrypted fields is calculated using Dice similarity. The method takes into account the false positive rate of the Bloom filter and different proportions of sound code and shape code. Finally, we applied the method to synthetic datasets and compared precision, recall, F1-score, and computational time across different values of the false positive rate and the proportion. The results showed that our method for PPRL in the Chinese language environment improved the quality of the classification results and outperformed other methods at a relatively low additional computational cost.
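The encoding and comparison steps named in the abstract (Bloom filter encoding of a code string, then Dice similarity) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The bigram tokenization, the padding character, the double-hashing scheme, the parameters m and k, and all function names are assumptions made for illustration. The input strings are made-up placeholders, not real SoundShape codes. The false positive rate uses the standard approximation (1 - e^(-kn/m))^k for a filter of m bits with k hash functions and n inserted items.

```python
import hashlib
import math


def bigrams(code: str) -> set[str]:
    """Split a code string, padded with '_' on both sides, into 2-grams."""
    padded = f"_{code}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(code: str, m: int = 1000, k: int = 10) -> set[int]:
    """Return the set of bit positions (out of m) switched on for `code`.

    Assumption: double hashing, position_i = (h1 + i * h2) mod m,
    with SHA-1 and SHA-256 as the two base hash functions.
    """
    positions: set[int] = set()
    for gram in bigrams(code):
        data = gram.encode("utf-8")
        h1 = int.from_bytes(hashlib.sha1(data).digest(), "big")
        h2 = int.from_bytes(hashlib.sha256(data).digest(), "big")
        for i in range(k):
            positions.add((h1 + i * h2) % m)
    return positions


def dice(a: set[int], b: set[int]) -> float:
    """Dice similarity of two Bloom filters given as sets of set bits."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def false_positive_rate(m: int, k: int, n: int) -> float:
    """Standard approximation of the Bloom filter false positive rate."""
    return (1 - math.exp(-k * n / m)) ** k


if __name__ == "__main__":
    # Placeholder strings standing in for encoded name fields (hypothetical).
    field_a = "ZH1A2B3C"
    field_b = "ZH1A2B3D"
    score = dice(bloom_encode(field_a), bloom_encode(field_b))
    print(f"Dice similarity: {score:.3f}")
    print(f"Approximate FPR: {false_positive_rate(1000, 10, 9):.2e}")
```

In a sketch like this, a record pair would be classified as a match when the Dice score exceeds a chosen threshold. Raising m or lowering the number of inserted bigrams reduces the false positive rate at the cost of longer filters.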
Pages: 14