Privacy preserving interactive record linkage (PPIRL)

Cited by: 30
Authors
Kum, Hye-Chung [1, 2, 3, 4]
Krishnamurthy, Ashok [1, 2, 3, 5]
Machanavajjhala, Ashwin [6]
Reiter, Michael K. [3]
Ahalt, Stanley [1, 2, 3, 5]
Affiliations
[1] UNC CH, Populat Informat Res Grp, Dept Comp Sci, Chapel Hill, NC USA
[2] Texas A&M Hlth Sci Ctr, Dept Hlth Policy & Management, College Stn, TX 77843 USA
[3] UNC CH, Dept Comp Sci, Chapel Hill, NC USA
[4] Texas A&M Hlth Sci Ctr, Coll Med Baylor Scott & White, Dept Pediat, Temple, TX USA
[5] UNC CH, RENCI, Chapel Hill, NC USA
[6] Duke Univ, Dept Comp Sci, Durham, NC 27706 USA
Keywords
privacy preserving interactive record linkage (PPIRL); decoupled data; entity resolution; medical record linkage; privacy; Electronic Health Records (EHR); ENTITY RESOLUTION; POPULATION; CARE; DESIGN; ISSUES; WORLD; TOOL
DOI
10.1136/amiajnl-2013-002165
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Objective: Record linkage to integrate uncoordinated databases is critical in biomedical research using Big Data. Balancing privacy protection against the need for high quality record linkage requires a human-machine hybrid system to safely manage uncertainty in the ever changing streams of chaotic Big Data.
Methods: In the computer science literature, private record linkage is the most published area. It investigates how to apply a known linkage function safely when linking two tables. However, in practice, the linkage function is rarely known. Thus, there are many data linkage centers whose main role is to be the trusted third party to determine the linkage function manually and link data for research via a master population list for a designated region. Recently, a more flexible computerized third-party linkage platform, Secure Decoupled Linkage (SDLink), has been proposed based on: (1) decoupling data via encryption; (2) obfuscation via chaffing (adding fake data) and universe manipulation; and (3) minimum information disclosure via recoding.
Results: We synthesize this literature to formalize a new framework for privacy preserving interactive record linkage (PPIRL) with tractable privacy and utility properties and then analyze the literature using this framework.
Conclusions: Human-based third-party linkage centers for privacy preserving record linkage are the accepted norm internationally. We find that a computer-based third-party platform that can precisely control the information disclosed at the micro level and allow frequent human interaction during the linkage process is an effective human-machine hybrid system that significantly improves on the linkage center model in terms of both privacy and utility.
Pages: 212-220
Page count: 9