Variable selection via the composite likelihood method for multilevel longitudinal data with missing responses and covariates

Cited by: 1
Authors
Li, Haocheng [1]
Shu, Di [2]
He, Wenqing [3]
Yi, Grace Y. [2]
Affiliations
[1] Univ Calgary, Dept Math & Stat, Calgary, AB, Canada
[2] Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON, Canada
[3] Western Univ, Dept Stat & Actuarial Sci, London, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Composite likelihood; Longitudinal data; Missing data; Missing not at random; Multilevel structure; Variable selection; Model selection; Binary data; Drop-out
DOI
10.1016/j.csda.2019.01.011
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Longitudinal data with multilevel structures are commonly collected when subjects in clusters are followed over a period of time. Missing values and variable selection issues are common in such data. Ignoring the incompleteness of the data in the analysis may produce biased results. On the other hand, incorporating a large number of irrelevant covariates into inferential procedures may make computation and interpretation difficult. A unified penalized composite likelihood framework is developed to handle data with missingness and variable selection issues. The framework is flexible enough to handle the situation where responses and covariates are not missing simultaneously, under the assumption of missing not at random. The method is justified both rigorously with theoretical results and numerically with simulation studies. The method is also applied to the Waterloo Smoking Prevention Project data. © 2019 Elsevier B.V. All rights reserved.
Pages: 25-34
Number of pages: 10