Parametric assumptions equate to hidden observations: comparing the efficiency of nonparametric and parametric models for estimating time to AIDS or death in a cohort of HIV-positive women

Cited by: 7
Authors
Rudolph, Jacqueline E. [1]
Cole, Stephen R. [1]
Edwards, Jessie K. [1]
Affiliations
[1] Univ N Carolina, Dept Epidemiol, 135 Dauer Dr,2101 McGavran Greenberg Hall,CB 7435, Chapel Hill, NC 27599 USA
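Source
BMC MEDICAL RESEARCH METHODOLOGY | 2018 / Vol. 18
Funding
US National Institutes of Health;
Keywords
Survival analysis; Nonparametric model; Parametric model; Statistical efficiency;
DOI
10.1186/s12874-018-0605-8
Chinese Library Classification
R19 [Health care organization and services (health services administration)];
Subject classification code
Abstract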
When conducting a survival analysis, researchers might consider two broad classes of models: nonparametric models and parametric models. While nonparametric models are more flexible because they make few assumptions regarding the shape of the data distribution, parametric models are more efficient. Here we sought to make concrete the difference in efficiency between these two model types using effective sample size. We compared cumulative risk of AIDS or death estimated using four survival models (nonparametric, generalized gamma, Weibull, and exponential) and data from 1164 HIV patients who were alive and AIDS-free in 1995. We added pseudo-observations to the sample until the spread of the 95% confidence limits for the nonparametric model became less than that for the parametric models. We found the 3-parameter generalized gamma model to be a good fit to the nonparametric risk curve, but the 1-parameter exponential model underestimated the risk at some times and overestimated it at others. Using two-year risk as an example, we had to add 354, 593, and 3960 observations for the nonparametric model to be as efficient as the generalized gamma, Weibull, and exponential models, respectively. These added observations represent the hidden observations underlying the efficiency gained through parametric model form assumptions. If the model is correctly specified, the efficiency gain may be justified, as appeared to be the case for the generalized gamma model. Otherwise, precision will be improved, but at the cost of specification bias, as was the case for the exponential model.
Pages: 5
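The "hidden observations" idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes no censoring, data that truly follow an exponential distribution, and Wald-type variances: binomial for the nonparametric risk, delta method for the exponential risk. The function names, the rate of 0.1 per year, and the reuse of n = 1164 as a sample size are hypothetical choices for illustration, so the result is not expected to reproduce the 3960 observations reported above.

```python
import math


def nonparametric_risk_variance(risk, n):
    """Binomial variance of the empirical risk with n uncensored observations."""
    return risk * (1.0 - risk) / n


def exponential_risk_variance(rate, t, n):
    """Delta-method variance of the exponential-model risk 1 - exp(-rate * t).

    Assumes no censoring, so Var(rate_hat) is approximately rate**2 / n.
    """
    derivative = t * math.exp(-rate * t)  # d(risk)/d(rate)
    return derivative ** 2 * rate ** 2 / n


def hidden_observations(rate, t, n):
    """Observations to add so the nonparametric variance matches the exponential one."""
    risk = 1.0 - math.exp(-rate * t)
    target = exponential_risk_variance(rate, t, n)
    # Smallest sample size at which risk * (1 - risk) / n_needed <= target.
    n_needed = math.ceil(risk * (1.0 - risk) / target)
    return max(n_needed - n, 0)


if __name__ == "__main__":
    # Hypothetical inputs: hazard of 0.1 per year, risk evaluated at 2 years.
    added = hidden_observations(rate=0.1, t=2.0, n=1164)
    print(f"Pseudo-observations needed: {added}")
```

Under these assumptions the count of added observations depends only on the product of rate and time and on the starting sample size. With censoring or a misspecified model, as in the exponential case described in the abstract, the comparison would need to be made numerically.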