A data-driven approach to choosing privacy parameters for clinical trial data sharing under differential privacy

Times cited: 0
Authors
Chen, Henian [1, 5]
Pang, Jinyong [1]
Zhao, Yayi [1]
Giddens, Spencer [2]
Ficek, Joseph [3]
Valente, Matthew J. [1]
Cao, Biwei [1]
Daley, Ellen [4]
Affiliations
[1] Univ S Florida, Coll Publ Hlth, Study Design & Data Anal, Tampa, FL 33612 USA
[2] Univ Notre Dame, Dept Appl & Computat Math & Stat, Notre Dame, IN 46556 USA
[3] GlaxoSmithKline, Oncol Stat, Collegeville, PA 19426 USA
[4] Univ S Florida, Coll Publ Hlth, Lawton & Rhea Chiles Ctr Children & Families, Tampa, FL USA
[5] Univ S Florida, Coll Publ Hlth, Study Design & Data Anal, 13201 Bruce B Downs Blvd, MDC 56, Tampa, FL 33612 USA
Keywords
clinical trial; differential privacy; accuracy; data sharing; privacy parameter; RELATION EXTRACTION;
DOI
10.1093/jamia/ocae038
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Objectives: Clinical trial data sharing is crucial for promoting transparency and collaborative efforts in medical research. Differential privacy (DP) is a formal statistical technique for anonymizing shared data that balances privacy of individual records and accuracy of replicated results through a "privacy budget" parameter, epsilon. DP is considered the state of the art in privacy-protected data publication and is underutilized in clinical trial data sharing. This study is focused on identifying epsilon values for the sharing of clinical trial data.
Materials and Methods: We analyzed 2 clinical trial datasets with privacy budget epsilon ranging from 0.01 to 10. Smaller values of epsilon entail adding greater amounts of random noise, with better privacy as a result. Comparison of rates, odds ratios, means, and mean differences between the original clinical trial datasets and the empirical distribution of the DP estimator was performed.
Results: The DP rate closely approximated the original rate of 6.5% when epsilon > 1. The DP odds ratio closely aligned with the original odds ratio of 0.689 when epsilon >= 3. The DP mean closely approximated the original mean of 164.64 when epsilon >= 1. As epsilon increased to 5, both the minimum and maximum DP means converged toward the original mean.
Discussion: There is no consensus on how to choose the privacy budget epsilon. The definition of DP does not specify the required level of privacy, and there is no established formula for determining epsilon.
Conclusion: Our findings suggest that the application of DP holds promise in the context of sharing clinical trial data.
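The trade-off described in the abstract (smaller epsilon, more noise, wider spread of the DP estimate) can be illustrated with a minimal sketch. The abstract does not name the noise mechanism, so the Laplace mechanism used here is an assumption, as are the helper names (`laplace_noise`, `dp_rate`, `empirical_spread`) and the sample size of 1000 with 65 events, which is a hypothetical choice that merely reproduces a 6.5% rate. It is not the authors' implementation.

```python
import random
import statistics


def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) draws follows Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def dp_rate(events, n, epsilon, rng):
    # Assumption: n is public and one record changes the count by at most 1,
    # so the rate has sensitivity 1/n and the Laplace scale is (1/n)/epsilon.
    noisy = events / n + laplace_noise((1.0 / n) / epsilon, rng)
    # Clamping to [0, 1] is post-processing and does not weaken the guarantee.
    return min(1.0, max(0.0, noisy))


def empirical_spread(estimator, epsilon, reps=500, seed=0):
    # Repeat the DP release many times to approximate the estimator's
    # empirical distribution; return its minimum, median, and maximum.
    rng = random.Random(seed)
    draws = [estimator(epsilon, rng) for _ in range(reps)]
    return min(draws), statistics.median(draws), max(draws)


def rate_estimator(epsilon, rng):
    # Hypothetical data: 65 events among 1000 participants (rate 6.5%).
    return dp_rate(65, 1000, epsilon, rng)


for eps in (0.01, 0.1, 1, 3, 5, 10):
    low, mid, high = empirical_spread(rate_estimator, eps)
    print(f"epsilon={eps:>5}: min={low:.4f} median={mid:.4f} max={high:.4f}")
```

Under these assumptions, the printed minimum and maximum move toward 0.065 as epsilon grows, which mirrors the qualitative pattern reported in the Results. The epsilon at which the spread becomes acceptable depends on the sample size and on the statistic being released.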
Pages: 1135 - 1143
Number of pages: 9
Related papers
50 in total
  • [1] Genomic Data Sharing under Dependent Local Differential Privacy
    Yilmaz, Emre
    Ji, Tianxi
    Ayday, Erman
    Li, Pan
    CODASPY'22: PROCEEDINGS OF THE TWELVETH ACM CONFERENCE ON DATA AND APPLICATION SECURITY AND PRIVACY, 2022, : 77 - 88
  • [2] Data-Driven Spectrum Trading with Secondary Users' Differential Privacy Preservation
    Wang, Jingyi
    Zhang, Xinyue
    Zhang, Qixun
    Li, Ming
    Guo, Yuanxiong
    Feng, Zhiyong
    Pan, Miao
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2021, 18 (01) : 438 - 447
  • [3] Research on Governmental Data Sharing Based on Local Differential Privacy Approach
    Liu, Liping
    Piao, Chunhui
    Jiang, Xuehong
    Zheng, Lijuan
    2018 IEEE 15TH INTERNATIONAL CONFERENCE ON E-BUSINESS ENGINEERING (ICEBE 2018), 2018, : 39 - 45
  • [4] Data-Driven Small Cell Planning for Traffic Offloading with Users' Differential Privacy
    Chen, Rui
    Zhang, Xinyue
    Wang, Jingyi
    Cui, Qimei
    Xu, Wenjun
    Pan, Miao
    ICC 2020 - 2020 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2020,
  • [5] Decision Support for Sharing Data Using Differential Privacy
    St John, Mark F.
    Denker, Grit
    Laud, Peeter
    Martiny, Karsten
    Pankova, Alisa
    Pavlovic, Dusko
    2021 IEEE SYMPOSIUM ON VISUALIZATION FOR CYBER SECURITY (VIZSEC 2021), 2021, : 26 - 35
  • [6] Protecting Privacy for Big Data in Body Sensor Networks: A Differential Privacy Approach
    Lin, Chi
    Song, Zihao
    Liu, Qing
    Sun, Weifeng
    Wu, Guowei
    COLLABORATIVE COMPUTING: NETWORKING, APPLICATIONS, AND WORKSHARING, COLLABORATECOM 2015, 2016, 163 : 163 - 172
  • [7] Data Level Privacy Preserving: A Stochastic Perturbation Approach Based on Differential Privacy
    Ma, Chuan
    Yuan, Long
    Han, Li
    Ding, Ming
    Bhaskar, Raghav
    Li, Jun
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (04) : 3619 - 3631
  • [8] A Differential Privacy Approach to Preserve GWAS Data Sharing based on A Game Theoretic Perspective
    Yan, Jun
    Han, Ziwei
    Zhou, Yihui
    Lu, Laifeng
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2022, 16 (03) : 1028 - 1046
  • [9] Big Data Privacy Based On Differential Privacy a Hope for Big Data
    Shrivastva, Krishna Mohan Pd
    Rizvi, M. A.
    Singh, Shailendra
    2014 6TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS, 2014, : 776 - 781
  • [10] Data-Driven Transportation Network Company Vehicle Scheduling With Users' Location Differential Privacy Preservation
    Zhang, Xinyue
    Wang, Jingyi
    Zhang, Haijun
    Li, Lixin
    Pan, Miao
    Han, Zhu
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (02) : 813 - 823