Disambiguation data: Extracting information from anonymized sources

Cited by: 6
Authors
Dreiseitl, S [1]
Vinterbo, S
Ohno-Machado, L
Affiliations
[1] Polytech Univ Upper Austria, Dept Software Engn Med, A-4232 Hagenberg, Austria
[2] Harvard Univ, Brigham & Womens Hosp, Decis Syst Grp, Sch Med, Boston, MA 02115 USA
Keywords
DOI
10.1197/jamia.M1240
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Privacy protection is an important consideration when releasing medical databases to the research community. We show that while recent advances in anonymization algorithms provide increased levels of protection, it is still possible to calculate approximations to the original data set. In some cases, one can even uniquely reconstruct entries in a table before anonymization. In this paper, we demonstrate how knowledge of an anonymization algorithm based on ambiguating data cell entries can be used to undo the anonymization process. We investigate the effect of this algorithm and its reversal on data sets of varying sizes and distributions. It is shown that by using a computationally complex disambiguation process, information on individuals can be extracted from an anonymized data set.
Pages: S110-S114
Number of pages: 5