An Improved Chinese String Comparator for Bloom Filter Based Privacy-Preserving Record Linkage

Cited by: 1
Authors
Sun, Siqi [1]
Qian, Yining [1]
Zhang, Ruoshi [1]
Wang, Yanqi [1]
Li, Xinran [1]
Affiliation
[1] Huazhong Agr Univ, Coll Sci, Dept Math & Stat, Wuhan 430070, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
privacy-preserving record linkage; Chinese characters; SoundShape code; Bloom filter; proportions of SoundShape code
DOI
10.3390/e23081091
Chinese Library Classification
O4 [Physics]
Discipline classification code
0702
Abstract
Sharing data from multiple sources without disclosing private information has become a prominent topic as information technology develops. Privacy-preserving record linkage (PPRL) links records that truly match without disclosing personal information. Existing PPRL techniques have mostly been studied for alphabetic languages, which differ greatly from the Chinese language environment. In this paper, Chinese characters (the identification fields in record pairs) are encoded into strings of letters and numbers using the SoundShape code, according to their shapes and pronunciations. The SoundShape codes are then encrypted with a Bloom filter, and the similarity of the encrypted fields is calculated with the Dice similarity. The method takes into account the false positive rate of the Bloom filter and different proportions of sound code and shape code. Finally, we applied the method to synthetic datasets and compared precision, recall, F1-score, and computational time across different values of the false positive rate and the proportion. The results showed that our method for PPRL in the Chinese language environment improved the quality of the classification results and outperformed the other methods at a relatively low additional computational cost.
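The Bloom filter and Dice similarity steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bigram splitting, the MD5/SHA-1 double-hashing scheme, the filter length `m`, the hash count `k`, and the two input strings are all assumptions chosen for illustration, and the inputs are made-up placeholders rather than real SoundShape codes. The false positive rate uses the standard approximation (1 - e^(-kn/m))^k for a filter of m bits holding n items under k hash functions.

```python
import hashlib
import math


def qgrams(s, q=2):
    """Split a string into overlapping q-grams (bigrams by default)."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def bloom_encode(code, m=64, k=4, q=2):
    """Return the set of bit positions set to 1 for an encoded field.

    Assumption: double hashing, h_i(x) = (h1(x) + i * h2(x)) mod m,
    with MD5 as h1 and SHA-1 as h2. The source does not specify this.
    """
    bits = set()
    for gram in qgrams(code, q):
        data = gram.encode("utf-8")
        h1 = int(hashlib.md5(data).hexdigest(), 16)
        h2 = int(hashlib.sha1(data).hexdigest(), 16)
        for i in range(k):
            bits.add((h1 + i * h2) % m)
    return bits


def dice(a, b):
    """Dice similarity of two bit-position sets: 2|A & B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # two empty filters are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def false_positive_rate(m, k, n):
    """Standard approximation of the Bloom filter false positive rate."""
    return (1 - math.exp(-k * n / m)) ** k


# Hypothetical placeholder strings, NOT real SoundShape codes.
field_a = "ZH1A2B3C"
field_b = "ZH1A2B4C"

similarity = dice(bloom_encode(field_a), bloom_encode(field_b))
print(f"Dice similarity: {similarity:.3f}")
print(f"Approximate FPR: {false_positive_rate(64, 4, len(qgrams(field_a))):.3f}")
```

A longer filter (larger `m`) lowers the false positive rate at the cost of more storage and comparison time, which is the kind of trade-off the abstract describes evaluating.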
Pages: 14