MapReduce Implementations for Privacy Preserving Record Linkage

Cited by: 1
Authors
Boussis, Dimitris [1]
Dritsas, Elias [1]
Kanavos, Andreas [2, 3]
Sioutas, Spyros [4]
Tzimas, Giannis [5]
Verykios, Vassilios S. [3]
Affiliations
[1] Univ Patras, Comp Engn & Informat Dept, Patras, Greece
[2] Comp Engn & Informat Dept, Patras, Greece
[3] Hellen Open Univ, Patras, Greece
[4] Ionian Univ, Dept Informat, Corfu, Greece
[5] TEI Western Greece, Comp & Informat Engn Dept, Patras, Greece
Source
10TH HELLENIC CONFERENCE ON ARTIFICIAL INTELLIGENCE (SETN 2018) | 2018
Keywords
Privacy Preserving Record Linkage; Hadoop; MapReduce; Bloom Filters
DOI
10.1145/3200947.3201043
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Over the last decade, the explosive growth of Internet data has fueled the development of Big Data management systems and technologies. The sheer volume of data, combined with the need to link records while preserving privacy, motivates the present study. We describe the Privacy Preserving Record Linkage (PPRL) problem based on Bloom filter encoding techniques, which both maintain users' security and permit similarity control. We then extend the study to the HLSH/FPS private indexing technique and briefly describe four implementations in the MapReduce distributed environment, which is capable of processing large-scale data. Finally, we experimentally evaluate these four versions in terms of job execution time, memory usage, and disk usage.
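As an illustration of the Bloom filter encoding idea mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes character bigrams, a keyed HMAC-SHA256 hash, a 1000-bit filter, and 10 hash functions per bigram; the names `SECRET`, `bloom_encode`, `dice`, and `hamming` are hypothetical and chosen only for this example. Each filter is represented as the set of bit positions set to 1.

```python
import hashlib
import hmac

# Assumption: a secret key shared by the data owners (illustrative value only).
SECRET = b"shared-secret-key"


def bigrams(value):
    """Split a padded, lower-cased string into its set of overlapping 2-grams."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, size=1000, num_hashes=10):
    """Return the set of bit positions (0..size-1) set for the value's bigrams."""
    bits = set()
    for gram in bigrams(value):
        for seed in range(num_hashes):
            message = f"{seed}:{gram}".encode("utf-8")
            digest = hmac.new(SECRET, message, hashlib.sha256).digest()
            bits.add(int.from_bytes(digest, "big") % size)
    return bits


def dice(a, b):
    """Dice coefficient of two filters given as sets of set-bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def hamming(a, b):
    """Hamming distance: number of bit positions where the filters differ."""
    return len(a ^ b)


if __name__ == "__main__":
    smith = bloom_encode("Smith")
    smyth = bloom_encode("Smyth")
    jones = bloom_encode("Jones")
    print("Dice(Smith, Smyth) =", round(dice(smith, smyth), 3))
    print("Dice(Smith, Jones) =", round(dice(smith, jones), 3))
    print("Hamming(Smith, Smyth) =", hamming(smith, smyth))
```

Because similar strings share most of their bigrams, their filters share most set bits, so the parties can compare encodings without exchanging the plaintext values. Hamming distance is included because LSH-based blocking schemes such as HLSH operate on the bit vectors in Hamming space.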
Pages: 4
Related papers
13 in total
  • [1] [Anonymous], 2012, DATA MATCHING CONCEP, DOI 10.1007/978-3-642-31164-2
  • [2] Baxter R., 2003, ACM SIGKDD 03 WORKSH, P25, DOI 10.1007/978-3-319-11257-2
  • [3] SPACE/TIME TRADE/OFFS IN HASH CODING WITH ALLOWABLE ERRORS
    BLOOM, BH
    [J]. COMMUNICATIONS OF THE ACM, 1970, 13 (07) : 422 - &
  • [4] Clifton C., 2004, DMKD 04 P 9 ACM SIGM, P19, DOI 10.1145/1008694.1008698
  • [5] Durham E. A., 2012, THESIS
  • [6] Composite Bloom Filters for Secure Record Linkage
    Durham, Elizabeth A.
    Kantarcioglu, Murat
    Xue, Yuan
    Toth, Csaba
    Kuzu, Mehmet
    Malin, Bradley
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (12) : 2956 - 2968
  • [7] A THEORY FOR RECORD LINKAGE
    FELLEGI, IP
    SUNTER, AB
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1969, 64 (328) : 1183 - &
  • [8] A fast and efficient Hamming LSH-based scheme for accurate linkage
    Karapiperis, Dimitrios
    Verykios, Vassilios S.
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2016, 49 (03) : 861 - 884
  • [9] An LSH-Based Blocking Approach with a Homomorphic Matching Technique for Privacy-Preserving Record Linkage
    Karapiperis, Dimitrios
    Verykios, Vassilios S.
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (04) : 909 - 921
  • [10] Schnell R., 2011, NOVEL ERROR TOLERANT