Three-phase generalized raking and multiple imputation estimators to address error-prone data

Cited by: 1
Authors
Amorim, Gustavo [1, 7]
Tao, Ran [1, 2]
Lotspeich, Sarah [1, 3]
Shaw, Pamela A. [4]
Lumley, Thomas [5]
Patel, Rena C. [6]
Shepherd, Bryan E. [1]
Affiliations
[1] Vanderbilt Univ, Med Ctr, Dept Biostat, Nashville, TN USA
[2] Vanderbilt Univ, Vanderbilt Genet Inst, Med Ctr, Nashville, TN USA
[3] Wake Forest Univ, Dept Stat Sci, Winston Salem, NC USA
[4] Kaiser Permanente, Washington Hlth Res Inst, Biostat Div, Seattle, WA USA
[5] Univ Auckland, Dept Stat, Auckland, New Zealand
[6] Univ Washington, Dept Med, Seattle, WA USA
[7] Vanderbilt Univ, Med Ctr, Dept Biostat, 2525 West End Ave, Nashville, TN 37203 USA
Funding
National Institutes of Health (USA)
Keywords
data audits; design-based estimator; electronic medical records; measurement error; multiple imputation; three-phase design; CALIBRATION ESTIMATORS; EXPOSURE; DESIGNS; SAMPLES
DOI
10.1002/sim.9967
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Validation studies are often used to obtain more reliable information in settings with error-prone data. Validated data on a subsample of subjects can be used together with error-prone data on all subjects to improve estimation. In practice, more than one round of data validation may be required, and direct application of standard approaches for combining validation data into analyses may lead to inefficient estimators since the information available from intermediate validation steps is only partially considered or even completely ignored. In this paper, we present two novel extensions of multiple imputation and generalized raking estimators that make full use of all available data. We show through simulations that incorporating information from intermediate steps can lead to substantial gains in efficiency. This work is motivated by and illustrated in a study of contraceptive effectiveness among 83 671 women living with HIV, whose data were originally extracted from electronic medical records, of whom 4732 had their charts reviewed, and a subsequent 1210 also had a telephone interview to validate key study variables.
Pages: 379-394
Number of pages: 16