Validity of Privacy-Protecting Analytical Methods That Use Only Aggregate-Level Information to Conduct Multivariable-Adjusted Analysis in Distributed Data Networks

Cited by: 17
Authors
Li, Xiaojuan [1, 2]
Fireman, Bruce H. [3 ]
Curtis, Jeffrey R. [4 ]
Arterburn, David E. [5 ]
Fisher, David P. [6 ]
Moyneur, Erick [7 ]
Gallagher, Mia [1, 2]
Raebel, Marsha A. [8 ]
Nowell, W. Benjamin [9 ]
Lagreid, Lindsay [10 ]
Toh, Sengwee [1, 2]
Affiliations
[1] Harvard Med Sch, Dept Populat Med, 401 Pk Dr,Suite 401 East, Boston, MA 02215 USA
[2] Harvard Pilgrim Hlth Care Inst, 401 Pk Dr,Suite 401 East, Boston, MA 02215 USA
[3] Kaiser Permanente Northern Calif, Div Res, Oakland, CA USA
[4] Univ Alabama Birmingham, Div Clin Immunol & Rheumatol, Sch Med, Birmingham, AL 35294 USA
[5] Kaiser Permanente Washington Hlth Res Inst, Seattle, WA USA
[6] Kaiser Permanente Northern Calif, Permanente Med Grp, Oakland, CA USA
[7] StatLog Econometr Inc, Montreal, PQ, Canada
[8] Kaiser Permanente Colorado, Inst Hlth Res, Denver, CO USA
[9] Global Hlth Living Fdn, CreakyJoints, Upper Nyack, NY USA
[10] Limeade, Bellevue, WA USA
Funding
US Agency for Healthcare Research and Quality; US National Institutes of Health
Keywords
confounding control; data-sharing; disease risk score; distributed data networks; meta-analysis; multicenter studies; privacy protection; propensity score; PROPENSITY SCORES; RISK; REGRESSION; SAFETY; ASSOCIATION; PERFORMANCE; INFECTIONS; WEIGHTS;
DOI
10.1093/aje/kwy265
Chinese Library Classification number
R1 [Preventive medicine, hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
Distributed data networks enable large-scale epidemiologic studies, but protecting privacy while adequately adjusting for a large number of covariates continues to pose methodological challenges. Using 2 empirical examples within a 3-site distributed data network, we tested combinations of 3 aggregate-level data-sharing approaches (risk-set, summary-table, and effect-estimate), 4 confounding adjustment methods (matching, stratification, inverse probability weighting, and matching weighting), and 2 summary scores (propensity score and disease risk score) for binary and time-to-event outcomes. We assessed the performance of combinations of these data-sharing and adjustment methods by comparing their results with results from the corresponding pooled individual-level data analysis (reference analysis). For both types of outcomes, the method combinations examined yielded results identical or comparable to the reference results in most scenarios. Within each data-sharing approach, comparability between aggregate- and individual-level data analysis depended on adjustment method; for example, risk-set data-sharing with matched or stratified analysis of summary scores produced identical results, while weighted analysis showed some discrepancies. Across the adjustment methods examined, risk-set data-sharing generally performed better, while summary-table and effect-estimate data-sharing more often produced discrepancies in settings with rare outcomes and small sample sizes. Valid multivariable-adjusted analysis can be performed in distributed data networks without sharing of individual-level data.
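As a rough illustration of the effect-estimate data-sharing approach named in the abstract, the sketch below pools site-specific log effect estimates with inverse-variance (fixed-effect) weights, so that each site shares only an estimate and its standard error. The pooling rule is an assumption chosen for illustration; the abstract does not state which meta-analytic model the authors used. The function name `pool_fixed_effect` and the three site values are hypothetical, not data from the study.

```python
import math


def pool_fixed_effect(estimates):
    """Inverse-variance fixed-effect pooling (illustrative assumption).

    estimates: list of (log_effect, standard_error) tuples, one per site.
    Returns the pooled log effect and its standard error.
    """
    # Each site's weight is the reciprocal of its variance.
    weights = [1.0 / (se ** 2) for _, se in estimates]
    total_weight = sum(weights)
    pooled_log = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_log, pooled_se


# Hypothetical site-level results: (log hazard ratio, standard error).
site_estimates = [
    (math.log(1.20), 0.10),
    (math.log(0.95), 0.20),
    (math.log(1.10), 0.15),
]

pooled_log, pooled_se = pool_fixed_effect(site_estimates)
ci_low = math.exp(pooled_log - 1.96 * pooled_se)
ci_high = math.exp(pooled_log + 1.96 * pooled_se)
print(f"Pooled HR: {math.exp(pooled_log):.3f} (95% CI {ci_low:.3f}, {ci_high:.3f})")
```

Only two numbers per site cross institutional boundaries here, which is what makes this the most privacy-protective of the three approaches. It also hints at why the abstract reports more discrepancies for this approach when outcomes are rare and samples are small, since site-specific estimates can be unstable in those settings.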
Pages: 709-723
Number of pages: 15