A data-driven approach to choosing privacy parameters for clinical trial data sharing under differential privacy

Cited by: 3
Authors
Chen, Henian [1, 5]
Pang, Jinyong [1]
Zhao, Yayi [1]
Giddens, Spencer [2]
Ficek, Joseph [3]
Valente, Matthew J. [1]
Cao, Biwei [1]
Daley, Ellen [4]
Affiliations
[1] Univ S Florida, Coll Publ Hlth, Study Design & Data Anal, Tampa, FL 33612 USA
[2] Univ Notre Dame, Dept Appl & Computat Math & Stat, Notre Dame, IN 46556 USA
[3] GlaxoSmithKline, Oncol Stat, Collegeville, PA 19426 USA
[4] Univ S Florida, Coll Publ Hlth, Lawton & Rhea Chiles Ctr Children & Families, Tampa, FL USA
[5] Univ S Florida, Coll Publ Hlth, Study Design & Data Anal, 13201 Bruce B Downs Blvd, MDC 56, Tampa, FL 33612 USA
Keywords
clinical trial; differential privacy; accuracy; data sharing; privacy parameter; RELATION EXTRACTION
DOI
10.1093/jamia/ocae038
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Objectives: Clinical trial data sharing is crucial for promoting transparency and collaborative efforts in medical research. Differential privacy (DP) is a formal statistical technique for anonymizing shared data that balances privacy of individual records and accuracy of replicated results through a "privacy budget" parameter, epsilon. DP is considered the state of the art in privacy-protected data publication and is underutilized in clinical trial data sharing. This study is focused on identifying epsilon values for the sharing of clinical trial data.
Materials and Methods: We analyzed 2 clinical trial datasets with privacy budget epsilon ranging from 0.01 to 10. Smaller values of epsilon entail adding greater amounts of random noise, with better privacy as a result. Comparison of rates, odds ratios, means, and mean differences between the original clinical trial datasets and the empirical distribution of the DP estimator was performed.
Results: The DP rate closely approximated the original rate of 6.5% when epsilon > 1. The DP odds ratio closely aligned with the original odds ratio of 0.689 when epsilon >= 3. The DP mean closely approximated the original mean of 164.64 when epsilon >= 1. As epsilon increased to 5, both the minimum and maximum DP means converged toward the original mean.
Discussion: There is no consensus on how to choose the privacy budget epsilon. The definition of DP does not specify the required level of privacy, and there is no established formula for determining epsilon.
Conclusion: Our findings suggest that the application of DP holds promise in the context of sharing clinical trial data.
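The trade-off described in the abstract (smaller epsilon, more noise, wider spread of the DP estimate) can be illustrated with a minimal sketch. The abstract does not say which DP mechanism the authors used, so the Laplace mechanism here is an assumption, as are the function names, the sample size of 1000, and the event count of 65 (chosen only so that the true rate equals the 6.5% quoted above). It is not the authors' implementation and does not reproduce their results.

```python
import random
import statistics


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_rate(events, n, epsilon, rng):
    """Release events/n under epsilon-DP with the Laplace mechanism (assumed).

    Replacing one record changes the event count by at most 1, so the rate
    has sensitivity 1/n when n is treated as public.
    """
    noisy = events / n + laplace_noise(1.0 / (n * epsilon), rng)
    # Clamping to [0, 1] is post-processing and does not weaken the guarantee.
    return min(1.0, max(0.0, noisy))


def simulate(events, n, epsilons, reps=2000, seed=0):
    """Return {epsilon: (min, median, max)} of repeated DP releases."""
    rng = random.Random(seed)
    summary = {}
    for eps in epsilons:
        draws = [dp_rate(events, n, eps, rng) for _ in range(reps)]
        summary[eps] = (min(draws), statistics.median(draws), max(draws))
    return summary


if __name__ == "__main__":
    # Hypothetical trial: 65 events among 1000 participants (a 6.5% rate).
    for eps, (lo, med, hi) in simulate(65, 1000, [0.01, 0.1, 1, 5, 10]).items():
        print(f"epsilon={eps:<5} min={lo:.4f} median={med:.4f} max={hi:.4f}")
```

Summarizing the minimum, median, and maximum over repeated releases mirrors the abstract's comparison against the empirical distribution of the DP estimator. The epsilon at which the spread becomes acceptable depends on the sample size and the statistic, so values from this sketch should not be read as the paper's thresholds.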
Pages: 1135-1143
Number of pages: 9