Two-wave two-phase outcome-dependent sampling designs, with applications to longitudinal binary data

Cited by: 6
Authors
Tao, Ran [1, 2]
Mercaldo, Nathaniel D. [3, 4, 5]
Haneuse, Sebastien [6 ]
Maronge, Jacob M. [7 ]
Rathouz, Paul J. [8 ]
Heagerty, Patrick J. [9 ]
Schildcrout, Jonathan S. [1 ]
Affiliations
[1] Vanderbilt Univ, Med Ctr, Dept Biostat, Nashville, TN 37232 USA
[2] Vanderbilt Univ, Med Ctr, Vanderbilt Genet Inst, Nashville, TN 37232 USA
[3] Massachusetts Gen Hosp, Dept Radiol, Boston, MA USA
[4] Massachusetts Gen Hosp, Dept Neurol, Boston, MA 02114 USA
[5] Harvard Univ, Boston, MA 02115 USA
[6] Harvard Univ, Dept Biostat, Boston, MA 02115 USA
[7] Univ Wisconsin Madison, Dept Stat, Madison, WI USA
[8] Univ Texas Austin, Dept Populat Hlth, Austin, TX 78712 USA
[9] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
Keywords
ascertainment corrected maximum likelihood; marginal model; marginalized model; multiwave design; multiple imputation; time-dependent covariate; CASE-COHORT DESIGN; REGRESSION-MODELS; LOGISTIC-REGRESSION; RESPONSE DATA; SEMIPARAMETRIC INFERENCE; LIKELIHOOD METHOD; EXPOSURE; ENVIRONMENT; PREGNANCY; MODERATE
DOI
10.1002/sim.8876
Chinese Library Classification number
Q [Biological Sciences]
Subject classification code
07; 0710; 09
Abstract
Two-phase outcome-dependent sampling (ODS) designs are useful when resource constraints prohibit expensive exposure ascertainment on all study subjects. One class of ODS designs for longitudinal binary data stratifies subjects into three strata according to those who experience the event at none, some, or all follow-up times. For time-varying covariate effects, exclusively selecting subjects with response variation can yield highly efficient estimates. However, if interest lies in the association of a time-invariant covariate, or the joint associations of time-varying and time-invariant covariates with the outcome, then the optimal design is unknown. Therefore, we propose a class of two-wave two-phase ODS designs for longitudinal binary data. We split the second-phase sample selection into two waves, between which an interim design evaluation analysis is conducted. The interim design evaluation analysis uses first-wave data to conduct a simulation-based search for the optimal second-wave design that will improve the likelihood of study success. Although we focus on longitudinal binary response data, the proposed design is general and can be applied to other response distributions. We believe that the proposed designs can be useful in settings where (1) the expected second-phase sample size is fixed and one must tailor stratum-specific sampling probabilities to maximize estimation efficiency, or (2) relative sampling probabilities are fixed across sampling strata and one must tailor sample size to achieve a desired precision. We describe the class of designs, examine finite sampling operating characteristics, and apply the designs to an exemplar longitudinal cohort study, the Lung Health Study.
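The three-stratum sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cohort is simulated, and the function names (`stratum`, `sample_phase_two`), the event rate, and the sampling probabilities are all hypothetical choices made for illustration. The sketch classifies each subject by whether the event occurred at none, some, or all follow-up times, then selects subjects independently with stratum-specific probabilities.

```python
import random
from collections import Counter


def stratum(y):
    """Classify a binary response vector: event at none, some, or all follow-up times."""
    total = sum(y)
    if total == 0:
        return "none"
    if total == len(y):
        return "all"
    return "some"


def sample_phase_two(responses, probs, rng):
    """Select each subject independently with the probability assigned to its stratum."""
    return [
        subject_id
        for subject_id, y in responses.items()
        if rng.random() < probs[stratum(y)]
    ]


rng = random.Random(2021)

# Hypothetical phase-one cohort: 500 subjects, 4 follow-up visits each,
# with an assumed 30% event probability at every visit.
responses = {i: [int(rng.random() < 0.3) for _ in range(4)] for i in range(500)}
strata_counts = Counter(stratum(y) for y in responses.values())

# Assumed design: oversample subjects with response variation.
probs = {"none": 0.1, "some": 1.0, "all": 0.1}
selected = sample_phase_two(responses, probs, rng)

# Expected second-phase sample size implied by this design.
expected_n = sum(probs[s] * n for s, n in strata_counts.items())
print(dict(strata_counts), len(selected), round(expected_n, 1))
```

In setting (1) from the abstract, one would hold `expected_n` fixed and vary the entries of `probs`; in setting (2), one would keep the ratios among the entries of `probs` fixed and scale them to change the sample size.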
Pages: 1863-1876
Number of pages: 14