A robust and efficient estimation and variable selection method for partially linear models with large-dimensional covariates

Cited by: 0
Authors
Hu Yang
Ning Li
Jing Yang
Affiliations
[1] Chongqing University, College of Mathematics and Statistics
[2] Hunan Normal University, Key Laboratory of High Performance Computing and Stochastic Information Processing (Ministry of Education of China), College of Mathematics and Statistics
Source
Statistical Papers | 2020 / Vol. 61
Keywords
Partially linear models; Robust estimation; Variable selection; Oracle property
DOI
Not available
Abstract
In this paper, a new robust and efficient estimation approach based on local modal regression is proposed for partially linear models with large-dimensional covariates. We show that the resulting estimators for both the parametric and nonparametric components are more efficient than the corresponding least squares estimators in the presence of outliers or heavy-tailed error distributions, and asymptotically as efficient as the least squares estimators when there are no outliers and the error distribution is normal. We also establish the asymptotic properties of the proposed estimators when the covariate dimension diverges at the rate of $o(\sqrt{n})$. To achieve sparsity and enhance interpretability, we develop a variable selection procedure based on the SCAD penalty to select significant parametric covariates and show that the method enjoys the oracle property under mild regularity conditions. Moreover, we propose a practical modified MEM algorithm for the proposed procedures. Monte Carlo simulations and a real data example are presented to illustrate the finite sample performance of the proposed estimators. Finally, based on the idea of the sure independence screening procedure proposed by Fan and Lv (J R Stat Soc 70:849–911, 2008), a robust two-step approach is introduced to deal with ultra-high dimensional data.
Pages: 1911–1937
Page count: 26
References
77 in total
[1] Akaike H (1973) Maximum likelihood identification of Gaussian autoregressive moving average models. Biometrika 60:255–265
[2] Breiman L (1995) Better subset regression using the nonnegative garrote. Technometrics 37:373–384
[3] Chen B, Yu Y, Zou H, Liang H (2012) Profiled adaptive Elastic-Net procedure for partially linear models with high-dimensional covariates. J Stat Plann Inference 142:1733–1745
[4] Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc 39:1–21
[5] Engle R, Granger CWJ, Rice J, Weiss A (1986) Semiparametric estimates of the relation between weather and electricity sales. J Am Stat Assoc 81:310–320
[6] Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
[7] Fan J, Lv J (2008) Sure independence screening for ultra-high-dimensional feature space. J R Stat Soc 70:849–911
[8] Fan J, Hu TC, Truong YK (1994) Robust nonparametric function estimation. Scand J Stat 21:433–446
[9] (2003) Kiplingers personal finance. J Stat Educ 57:104–123
[10] Li R, Liang H (2008) Variable selection in semiparametric regression modeling. Ann Stat 36:261–286