A method for imputing missing data in longitudinal studies

Cited by: 8
Authors
Youk, AO [1]
Stone, RA [1]
Marsh, GM [1]
Affiliations
[1] Univ Pittsburgh, Grad Sch Publ Hlth, Dept Biostat, Pittsburgh, PA 15261 USA
Keywords
cohort mortality; person years; vital rates; EM algorithm; bootstrap; standardized mortality ratio
DOI
10.1016/j.annepidem.2003.09.010
Chinese Library Classification
R1 [Preventive medicine, hygiene]
Discipline classification codes
1004; 120402
Abstract
PURPOSE: In a cohort in which racial data are unknown for some persons, race-specific persons and person-years are imputed using a model-based iterative allocation algorithm (IAA).
METHODS: An EM algorithm-based approach to address misclassification in a censored data regression setting can be adapted to estimate the probability that a person of unknown race is white. The corresponding race-specific person-years are obtained as a by-product of the estimation procedure. Variance estimates are computed using the bootstrap. The proposed approach is compared with the proportional allocation method (PAM).
RESULTS: In an occupational cohort where racial data were missing for 41% of the workers, the age-time-race-specific person-years were estimated within a relative variation of approximately 20%, using the IAA. The deaths were less reliably estimated. The standardized mortality ratios (SMRs) for all-cause mortality estimated using the IAA and the PAM were more similar for the non-white workers than for a smaller subgroup of white workers.
CONCLUSIONS: The IAA provides a method to reliably estimate race-specific person-year denominators in cohort studies with missing racial data. This method is applicable to other incompletely observed non-time-dependent categorical covariates. Internal cohort rates or SMRs can be computed and modeled, with bootstrap confidence intervals that account for the uncertainty in the determination of race.
(C) 2004 Elsevier Inc. All rights reserved.
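The abstract names the proportional allocation method (PAM) and the SMR but gives no formulas. The sketch below is one common reading of those terms, not the authors' implementation: unknown-race person-years in each stratum are split according to the white/non-white share among workers of known race, and the SMR is taken as observed deaths divided by expected deaths under reference rates. The function names, stratum labels, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple


def proportional_allocation(
    known_py: Dict[str, Tuple[float, float]],
    unknown_py: Dict[str, float],
) -> Dict[str, Tuple[float, float]]:
    """Split unknown-race person-years by the known-race share in each stratum.

    known_py maps a stratum label to (white, non-white) person-years.
    Strata that appear only in unknown_py are ignored in this sketch.
    """
    allocated = {}
    for stratum, (white, nonwhite) in known_py.items():
        total = white + nonwhite
        # Assumption: fall back to an even split when no race is known.
        share_white = white / total if total > 0 else 0.5
        extra = unknown_py.get(stratum, 0.0)
        allocated[stratum] = (
            white + extra * share_white,
            nonwhite + extra * (1.0 - share_white),
        )
    return allocated


def smr(
    observed_deaths: float,
    person_years: Dict[str, float],
    reference_rates: Dict[str, float],
) -> float:
    """Standardized mortality ratio: observed deaths / expected deaths."""
    expected = sum(person_years[s] * reference_rates[s] for s in person_years)
    return observed_deaths / expected


# Hypothetical person-years by age stratum: (white, non-white).
known = {"40-49": (600.0, 200.0), "50-59": (300.0, 300.0)}
unknown = {"40-49": 400.0, "50-59": 200.0}

allocated = proportional_allocation(known, unknown)
white_py = {s: py[0] for s, py in allocated.items()}

# Hypothetical reference death rates per person-year for white workers.
rates = {"40-49": 0.002, "50-59": 0.005}
white_smr = smr(5, white_py, rates)
```

The IAA described in the abstract replaces the fixed known-race share with a model-based probability that each worker of unknown race is white, so a sketch of that method would need the regression model from the paper itself.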
Pages: 354-361
Number of pages: 8