How Data Anonymisation Techniques influence Disease Triage in Digital Health: A study on Base Rate Neglect

Times cited: 0
Authors
Podlesny, Nikolai J. [1 ]
Kayem, Anne V. D. M. [1 ]
Meinel, Christoph [1 ]
Jungmann, Sven [2 ]
Affiliations
[1] Hasso Plattner Inst, Potsdam, Germany
[2] FoundersLane GmbH, Berlin, Germany
Source
PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON DIGITAL PUBLIC HEALTH (DPH '19) | 2019
Keywords
DIAGNOSIS; PRIVACY; NOISE;
DOI
10.1145/3357729.3357737
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In digital health, there is a growing trend towards data-driven disease diagnostics and prescription triage. Discussions with health industry partners have revealed that distributed high-dimensional data repositories are helpful not only to medical and drug research but also to algorithms that support day-to-day medical diagnostics, such as detecting atrial fibrillation (AFib) [53]. Yet recent privacy legislation in Europe requires that such data repositories be anonymised to protect against exposure of personal information. Existing anonymisation algorithms work on the premise of transforming data to remove outliers that could lead to re-identification of individual records. While this protects against data exposure, anonymisation also inadvertently results in base rate neglect. In the medical diagnostics context, base rate neglect can lead to false diagnoses and incorrect prescription triage, which is undesirable. In this paper, we study the impact of different anonymisation techniques on real-world disease diagnostics, and how they potentially influence decision making, using a real-world case as well as a semi-synthetic health data set. We demonstrate that the best results in countervailing base rate neglect while ensuring data anonymity are obtained by composing several selected anonymisation approaches that are assigned dynamically per row and incorporate attribute compartmentation.
Pages: 55-62
Page count: 8
Related papers
51 entries in total
[1]  
U.S. Food and Drug Administration, 2017, NAT DRUG COD DIR
[2]  
[Anonymous], 2012, P 25 INT C NEUR INF
[3]  
[Anonymous], 2000, CHOICES VALUES FRAME
[4]  
[Anonymous], 2004, P 23 ACM SIGMOD SIGA, DOI DOI 10.1145/1055558.1055591
[5]  
[Anonymous], 2016, CORONARY ARTERY DIS
[6]  
[Anonymous], 2014, ICD 9 CM DIAGNOSIS P
[7]  
[Anonymous], 2015, ARXIV150400065
[8]  
Aue G., 2015, EHEALTH 2 0 HLTH SYS
[9]   THE BASE-RATE FALLACY IN PROBABILITY JUDGMENTS [J].
BARHILLEL, M .
ACTA PSYCHOLOGICA, 1980, 44 (03) :211-233
[10]  
Bayardo RJ, 2005, PROC INT CONF DATA, P217