How Data Anonymisation Techniques influence Disease Triage in Digital Health: A study on Base Rate Neglect

Cited by: 0

Authors:

Podlesny, Nikolai J. [1]

Kayem, Anne V. D. M. [1]

Meinel, Christoph [1]

Jungmann, Sven [2]

Affiliations:

[1] Hasso Plattner Inst, Potsdam, Germany

[2] FoundersLane GmbH, Berlin, Germany

Source:

PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON DIGITAL PUBLIC HEALTH (DPH '19) | 2019

Keywords:

DIAGNOSIS; PRIVACY; NOISE;

DOI:

10.1145/3357729.3357737

Chinese Library Classification number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

In the digital health area, there is a growing trend towards data-driven disease diagnostics and prescription triage. Discussions with health industry partners have revealed that distributed high-dimensional data repositories are helpful not only to medical and drug research but also to algorithms that support day-to-day medical diagnostics, such as detecting atrial fibrillation (AFib) [53]. Yet recent privacy legislation in Europe requires that such data repositories be anonymised to protect against personal information exposure. Existing anonymisation algorithms work on the premise of transforming data to remove outliers that can result in re-identification of individual records. While this protects against data exposure, anonymisation inadvertently results in base rate neglect. In the medical diagnostics context, base rate neglect can lead to false diagnoses and incorrect prescription triage, which is undesirable. In this paper, we study the impact of different anonymisation techniques on real-world disease diagnostics, and how they potentially influence decision making, based on a real-world case as well as a semi-synthetic health data set. We demonstrate that the best results in countervailing base rate neglect while ensuring data anonymity are obtained through the composition of several selected anonymisation approaches that are assigned dynamically per row and incorporate attribute compartmentation.

Citation

Pages: 55-62

Page count: 8

51 references in total

[41] Attribute Compartmentation and Greedy UCC Discovery for High-Dimensional Data Anonymisation [J].