Sample-weighted semiparametric estimation of cause-specific cumulative risk and incidence using left- or interval-censored data from electronic health records

Citations: 0
Authors
Hyun, Noorie [1]
Katki, Hormuzd A. [2]
Graubard, Barry I. [2]
Affiliations
[1] Med Coll Wisconsin, Div Biostat, Wauwatosa, WI 53226 USA
[2] NCI, Div Canc Epidemiol & Genet, Rockville, MD USA
Keywords
competing risks; left/interval censoring; nonparametric maximum likelihood estimation; stratified random sample; MAXIMUM-LIKELIHOOD-ESTIMATION; COMPETING RISKS; HUMAN-PAPILLOMAVIRUS; MODEL; INFERENCE
DOI
10.1002/sim.8544
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Electronic health records (EHRs) can be a cost-effective data source for forming cohorts and developing risk models in the context of disease screening. However, important issues need to be handled: competing outcomes, left-censoring of prevalent disease, interval-censoring of incident disease, and uncertainty of prevalent disease when accurate disease ascertainment is not conducted at baseline. Furthermore, novel tests that are costly and limited in availability can be conducted on stored biospecimens selected as samples from EHRs by using different sampling fractions. We extend sample-weighted semiparametric marginal mixture models to estimating competing risks. For flexible modeling of relative risks, a general transformation of the subdistribution hazard function and regression parameters is used. We propose a numerical algorithm for nonparametrically calculating the maximum likelihood estimates for subdistribution hazard functions and regression parameters. Methods for calculating the consistent confidence intervals for relative and absolute risk estimates are presented. The proposed algorithm and methods show reliable finite sample performance through simulation studies. We apply our methods to a cohort assembled from EHRs at a health maintenance organization where we estimate cumulative risk of cervical precancer/cancer and incidence of infection-clearance by HPV genotype among human papillomavirus (HPV) positive women. There is no significant difference in 3-year HPV-clearance rates across different HPV types, but 3-year cumulative risk of progression-to-precancer/cancer from HPV-16 is relatively higher than the other HPV genotypes.
Pages: 2387-2402
Page count: 16
Related papers
34 in total
  • [1] Weighted likelihood for semiparametric models and two-phase stratified samples, with application to Cox regression
    Breslow, Norman E.
    Wellner, Jon A.
    [J]. SCANDINAVIAN JOURNAL OF STATISTICS, 2007, 34 (01) : 86 - 102
  • [2] Resampling Procedures for Making Inference Under Nested Case-Control Studies
    Cai, Tianxi
    Zheng, Yingye
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2013, 108 (504) : 1532 - 1544
  • [3] Using Electronic Health Records for Population Health Research: A Review of Methods and Applications
    Casey, Joan A.
    Schwartz, Brian S.
    Stewart, Walter F.
    Adler, Nancy E.
    [J]. ANNUAL REVIEW OF PUBLIC HEALTH, VOL 37, 2016, 37 : 61 - 81
  • [4] Five-Year Experience of Human Papillomavirus DNA and Papanicolaou Test Cotesting
    Castle, Philip E.
    Fetterman, Barbara
    Poitras, Nancy
    Lorey, Thomas
    Shaber, Ruth
    Kinney, Walter
    [J]. OBSTETRICS AND GYNECOLOGY, 2009, 113 (03) : 595 - 600
  • [5] A linear transformation model for multivariate interval-censored failure time data
    Chen, Man-Hua
    Tong, Xingwei
    Zhu, Liang
    [J]. CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2013, 41 (02) : 275 - 290
  • [6] MULTIPLE IMPUTATION FOR THRESHOLD-CROSSING DATA WITH INTERVAL CENSORING
    DOREY, FJ
    LITTLE, RJA
    SCHENKER, N
    [J]. STATISTICS IN MEDICINE, 1993, 12 (17) : 1589 - 1603
  • [7] A proportional hazards model for the subdistribution of a competing risk
    Fine, JP
    Gray, RJ
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1999, 94 (446) : 496 - 509
  • [8] A CLASS OF K-SAMPLE TESTS FOR COMPARING THE CUMULATIVE INCIDENCE OF A COMPETING RISK
    GRAY, RJ
    [J]. ANNALS OF STATISTICS, 1988, 16 (03) : 1141 - 1154
  • [9] Groeneboom, 1992, INFORM BOUNDS NONPAR
  • [10] Sieve estimation for the proportional-odds failure-time regression model with interval censoring
    Huang, J
    Rossini, AJ
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1997, 92 (439) : 960 - 967