Variable selection in high-dimensional partially linear additive models for composite quantile regression

Cited by: 60
Authors
Guo, Jie [1]
Tang, Manlai [2]
Tian, Maozai [1]
Zhu, Kai [3]
Affiliations
[1] Renmin Univ China, Ctr Appl Stat, Sch Stat, Beijing, Peoples R China
[2] Hong Kong Baptist Univ, Dept Math, Hong Kong, Hong Kong, Peoples R China
[3] Chinese Acad Sci, Natl Astron Observ, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adaptive Lasso; Composite quantile regression; High-dimension; Semiparametric additive partial linear model; Spline approximation; Variable selection;
DOI
10.1016/j.csda.2013.03.017
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
A new estimation procedure based on the composite quantile regression is proposed for the semiparametric additive partial linear models, of which the nonparametric components are approximated by polynomial splines. The proposed estimation method can simultaneously estimate both the parametric regression coefficients and nonparametric components without any specification of the error distributions. The proposed estimation method is empirically shown to be much more efficient than the popular least-squares-based estimation method for non-normal random errors, especially for Cauchy error, and almost as efficient for normal random errors. To achieve sparsity in high-dimensional and sparse additive partial linear models, of which the number of linear covariates is much larger than the sample size but that of significant covariates is small relative to the sample size, a variable selection procedure based on adaptive Lasso is proposed to conduct estimation and variable selection simultaneously. The procedure is shown to possess the oracle property, and is much superior to the adaptive Lasso penalized least-squares-based method regardless of the random error distributions. In particular, two kinds of weights in the penalty are considered, namely the composite quantile regression estimates and Lasso penalized composite quantile regression estimates. Both types of weights perform very well with the latter performing especially well in terms of precisely selecting significant variables. The simulation results are consistent with the theoretical properties. A real data example is used to illustrate the application of the proposed methods. (C) 2013 Elsevier B.V. All rights reserved.
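The abstract names two ingredients, a composite quantile loss and an adaptive Lasso penalty, without giving formulas. The sketch below is a minimal illustration of the standard textbook forms of both, not the authors' implementation. It assumes equally spaced quantile levels tau_k = k / (K + 1), one intercept per level with a shared slope vector, and penalty weights equal to the reciprocal of a pilot estimate. The function names, the `eps` guard, and the omission of the spline basis for the nonparametric components are all choices made here for illustration.

```python
def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def cqr_objective(y, X, beta, intercepts):
    """Composite quantile loss summed over K quantile levels.

    Assumption: levels are tau_k = k / (K + 1), with one intercept per
    level and a single slope vector beta shared across all levels.
    """
    K = len(intercepts)
    total = 0.0
    for k, b_k in enumerate(intercepts, start=1):
        tau = k / (K + 1)
        for y_i, x_i in zip(y, X):
            fitted = b_k + sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta))
            total += check_loss(y_i - fitted, tau)
    return total


def adaptive_lasso_penalty(beta, pilot_beta, lam, eps=1e-8):
    """lam * sum_j |beta_j| / (|pilot_beta_j| + eps).

    pilot_beta is an initial estimate; eps (an assumption of this sketch)
    guards against division by zero when a pilot coefficient is exactly 0.
    """
    return lam * sum(abs(b) / (abs(b0) + eps) for b, b0 in zip(beta, pilot_beta))
```

A penalized fit would minimize `cqr_objective(...) + adaptive_lasso_penalty(...)` over `beta` and `intercepts`. In the setting the abstract describes, the pilot estimate would be either the unpenalized composite quantile estimate or a Lasso-penalized one, and the design would also carry spline basis columns for the nonparametric components.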
Pages: 56-67
Number of pages: 12