Feature screening and variable selection for partially linear models with ultrahigh-dimensional longitudinal data

Cited by: 7
Authors
Liu, Jingyuan [1, 2]
Affiliations
[1] Xiamen Univ, Dept Stat, Sch Econ, Wang Yanan Inst Studies Econ, 422 Siming South Rd, Xiamen 361005, Peoples R China
[2] Xiamen Univ, Fujian Key Lab Stat Sci, 422 Siming South Rd, Xiamen 361005, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Partially linear model; Ultrahigh dimensionality; Longitudinal data; Partial residual two-stage approach; Sure screening property; LIKELIHOOD; QTL;
DOI
10.1016/j.neucom.2015.09.122
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper is concerned with longitudinal partially linear models (LPLM) with ultrahigh-dimensional covariates and predictors. As a flexible extension of linear regression models that allows a nonparametric intercept function to capture the overall trend over time, LPLM are expected to be promising statistical models for analyzing high-dimensional longitudinal data such as longitudinal genetic data and functional magnetic resonance imaging data. Feature screening and variable selection are indispensable for LPLM in the presence of ultrahigh-dimensional covariates such as genetic markers or all pixels in image data. This paper proposes a two-stage variable selection procedure for ultrahigh-dimensional LPLM, consisting of a quick screening stage and a post-screening refining stage. The proposed approach uses the partial residual method to deal with the nonparametric baseline function. We establish the sure screening property of the first-stage screening procedure. Simulation results demonstrate the validity of this two-stage method. We further illustrate the proposed methodology with an empirical analysis of a real data set collected in a longitudinal genetic study of soybean plants. (C) 2016 Elsevier B.V. All rights reserved.
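The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and every concrete choice is an assumption made for illustration: a Gaussian-kernel Nadaraya-Watson smoother for the baseline function, absolute marginal correlation as the screening statistic, a plain lasso as a stand-in for the refining stage (the abstract does not name the penalty), and pooled observations that ignore within-subject correlation. All function names and parameters (nw_smooth, partial_residual_screen, lasso_cd, h, keep, lam) are hypothetical.

```python
import numpy as np

def nw_smooth(t, v, h):
    """Nadaraya-Watson estimate of E[v | t] at the observed times (Gaussian kernel)."""
    w = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    w /= w.sum(axis=1, keepdims=True)  # each row of weights sums to one
    return w @ v

def partial_residual_screen(t, X, y, h=0.1, keep=20):
    """Stage 1 (assumed form): remove the time trend from y and from each
    column of X, then keep the covariates whose residuals have the largest
    absolute marginal correlation with the residual of y."""
    y_res = y - nw_smooth(t, y, h)
    X_res = X - nw_smooth(t, X, h)
    Xs = (X_res - X_res.mean(axis=0)) / X_res.std(axis=0)
    ys = (y_res - y_res.mean()) / y_res.std()
    score = np.abs(Xs.T @ ys) / len(ys)
    top = np.argsort(score)[::-1][:keep]
    return np.sort(top), X_res, y_res

def lasso_cd(X, y, lam, n_iter=200):
    """Stage 2 stand-in: lasso by coordinate descent, minimising
    (1 / 2n) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0) / n
    r = y.copy()  # residual for beta = 0
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * beta[j]      # drop column j from the current fit
            rho = X[:, j] @ r / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_ss[j]
            r -= X[:, j] * beta[j]      # put column j back with its new value
    return beta

# Synthetic example: 300 pooled observations, 1000 covariates, 3 active ones.
rng = np.random.default_rng(0)
n, p = 300, 1000
t = rng.uniform(0.0, 1.0, n)
X = rng.standard_normal((n, p))
y = (np.sin(2 * np.pi * t) + 3 * X[:, 0] - 2 * X[:, 1] + 2 * X[:, 2]
     + 0.5 * rng.standard_normal(n))

kept, X_res, y_res = partial_residual_screen(t, X, y, h=0.1, keep=20)
beta = lasso_cd(X_res[:, kept], y_res, lam=0.2)
selected = kept[beta != 0]
print("screened:", kept)
print("selected after refinement:", selected)
```

The screening stage only has to retain the active covariates with high probability, which is what a sure screening property formalises. The refining stage then works on the much smaller screened set, so a penalised fit becomes computationally cheap.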
Pages: 202-210
Number of pages: 9