Evaluation of multiple imputation approaches for handling missing covariate information in a case-cohort study with a binary outcome

Cited by: 1
Authors
Middleton, Melissa [1, 2]
Nguyen, Cattram [1, 2]
Moreno-Betancur, Margarita [1, 2]
Carlin, John B. [1, 2]
Lee, Katherine J. [1, 2]
Affiliations
[1] Royal Childrens Hosp, Murdoch Childrens Res Inst, Clin Epidemiol & Biostat Unit, 50 Flemington Rd, Melbourne, Vic 3052, Australia
[2] Univ Melbourne, Dept Paediat, Parkville, Vic, Australia
Funding
UK Medical Research Council; Australian Research Council
Keywords
Multiple imputation; Case-cohort study; Simulation study; Missing data; Unequal sampling probability; Inverse probability weighting;
DOI
10.1186/s12874-021-01495-4
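Chinese Library Classification (CLC) number
R19 [Health Care Organization and Services (Health Services Administration)]
Subject classification code
Abstract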
Background: In case-cohort studies, a random subcohort is selected from the inception cohort and acts as the sample of controls for several outcome investigations. Analysis is conducted using only the cases and the subcohort, with inverse probability weighting (IPW) used to account for the unequal sampling probabilities resulting from the study design. Like all epidemiological studies, case-cohort studies are susceptible to missing data. Multiple imputation (MI) has become increasingly popular for addressing missing data in epidemiological studies. It is currently unclear how best to incorporate the weights from a case-cohort analysis in MI procedures used to address missing covariate data.
Methods: A simulation study was conducted with missingness in two covariates, motivated by a case study within the Barwon Infant Study. MI methods considered were: using the outcome, a proxy for weights in the simple case-cohort design considered, as a predictor in the imputation model, with and without exposure and covariate interactions; imputing separately within each weight category; and using a weighted imputation model. These methods were compared to a complete case analysis (CCA) within the context of a standard IPW analysis model estimating either the risk or odds ratio. The strength of associations, missing data mechanism, proportion of observations with incomplete covariate data, and subcohort selection probability varied across the simulation scenarios. Methods were also applied to the case study.
Results: There was similar performance in terms of relative bias and precision with all MI methods across the scenarios considered, with expected improvements compared with the CCA. Slight underestimation of the standard error was seen throughout but the nominal level of coverage (95%) was generally achieved. All MI methods showed a similar increase in precision as the subcohort selection probability increased, irrespective of the scenario. A similar pattern of results was seen in the case study.
Conclusions: How weights were incorporated into the imputation model had minimal effect on the performance of MI; this may be due to case-cohort studies only having two weight categories. In this context, inclusion of the outcome in the imputation model was sufficient to account for the unequal sampling probabilities in the analysis model.
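The abstract notes that a simple case-cohort design has only two weight categories, which is why the outcome can act as a proxy for the weights. The sketch below is a minimal illustration of that idea and is not the authors' code. It assumes that every case is retained with weight 1 and that non-cases are retained only if sampled into the subcohort, with weight 1/p. The function names and the parameter `p_subcohort` are hypothetical. It also assumes a single binary exposure with no covariate adjustment, in which case the weighted 2x2 table gives the same odds ratio as an IPW logistic regression.

```python
from typing import List, Sequence


def case_cohort_weights(
    outcome: Sequence[int],
    in_subcohort: Sequence[bool],
    p_subcohort: float,
) -> List[float]:
    """Return IPW weights for a simple case-cohort design.

    Assumption: cases (outcome == 1) are always analysed with weight 1;
    non-cases are analysed only if they are in the subcohort, with
    weight 1 / p_subcohort; everyone else gets weight 0 (excluded).
    """
    if not 0.0 < p_subcohort <= 1.0:
        raise ValueError("p_subcohort must be in (0, 1]")
    weights = []
    for y, sampled in zip(outcome, in_subcohort):
        if y == 1:
            weights.append(1.0)
        elif sampled:
            weights.append(1.0 / p_subcohort)
        else:
            weights.append(0.0)
    return weights


def weighted_odds_ratio(
    exposure: Sequence[int],
    outcome: Sequence[int],
    weights: Sequence[float],
) -> float:
    """Odds ratio from the weighted 2x2 table of exposure by outcome."""
    cells = {(1, 1): 0.0, (1, 0): 0.0, (0, 1): 0.0, (0, 0): 0.0}
    for x, y, w in zip(exposure, outcome, weights):
        cells[(x, y)] += w
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])


# Toy data (made up for illustration): 3 cases, 5 non-cases, p = 0.5.
exposure = [1, 1, 0, 0, 1, 0, 1, 0]
outcome = [1, 1, 1, 0, 0, 0, 0, 0]
in_subcohort = [False, True, False, True, True, True, False, False]

weights = case_cohort_weights(outcome, in_subcohort, p_subcohort=0.5)
odds_ratio = weighted_odds_ratio(exposure, outcome, weights)
```

Because the weight takes one value for cases and one value for sampled non-cases, conditioning on the outcome in an imputation model separates the two weight categories, which is consistent with the conclusion reported in the abstract.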
Pages: 12