Simultaneous model selection and estimation for mean and association structures with clustered binary data

Cited by: 1
Authors
Gao, Xin [1]
Yi, Grace Y. [2]
Affiliations
[1] York Univ, Dept Math & Stat, Toronto, ON M3J 1P3, Canada
[2] Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON N2L 3G1, Canada
Keywords
association; clustered binary data; generalized estimating equation; logistic regression; variable selection
DOI
10.1002/sta4.21
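Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract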
This paper investigates the properties of penalized estimating equations when both the mean and association structures are modelled. To select variables for the mean and association structures sequentially, we propose a hierarchical penalized generalized estimating equations (HPGEE2) approach. The first set of penalized estimating equations is solved to select the significant mean parameters. Conditional on the selected mean model, the second set of penalized estimating equations is solved to select the significant association parameters. The hierarchical approach is designed to accommodate possible model constraints relating the inclusion of covariates in the mean and the association models. This two-step penalization strategy has the advantage of reducing the computational burden relative to solving the two sets of penalized equations simultaneously. HPGEE2 with a smoothly clipped absolute deviation (SCAD) penalty is shown to have the oracle property for the mean and association models. The asymptotic behavior of the penalized estimator under this hierarchical approach is established. An efficient two-stage penalized weighted least squares algorithm is developed to implement the proposed method. The empirical performance of the proposed HPGEE2 is demonstrated through Monte Carlo studies and the analysis of a clinical data set. Copyright (C) 2013 John Wiley & Sons, Ltd.
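For readers unfamiliar with the SCAD penalty named in the abstract, the following is a minimal sketch of its standard form (Fan and Li, 2001). It is not the HPGEE2 algorithm itself; the function names `scad_penalty` and `scad_derivative` are hypothetical, and the default `a = 3.7` is the value commonly suggested for SCAD rather than a setting confirmed by this record.

```python
def scad_penalty(beta, lam, a=3.7):
    """Standard SCAD penalty p_lam(|beta|); assumes lam > 0 and a > 2."""
    theta = abs(beta)
    if theta <= lam:
        # Near zero the penalty is linear, like the lasso.
        return lam * theta
    if theta <= a * lam:
        # Quadratic transition that gradually relaxes the penalty.
        return (2 * a * lam * theta - theta ** 2 - lam ** 2) / (2 * (a - 1))
    # Large coefficients receive a constant penalty.
    return lam ** 2 * (a + 1) / 2


def scad_derivative(beta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |beta|."""
    theta = abs(beta)
    if theta <= lam:
        return lam
    # Decreases linearly to zero at theta = a * lam, then stays at zero.
    return max(a * lam - theta, 0.0) / (a - 1)
```

The derivative is what typically enters a penalized estimating equation: it equals `lam` for small coefficients and vanishes once `|beta|` exceeds `a * lam`, so large effects are left essentially unpenalized.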
Pages: 102-118
Number of pages: 17