EnMSP: an elastic-net multi-step screening procedure for high-dimensional regression

Cited by: 0
Authors
Yushan Xue
Jie Ren
Bin Yang
Affiliations
[1] Central University of Finance and Economics, School of Statistics and Mathematics
[2] Cinda Securities Co., Ltd.
[3] Research Center for International Inspection and Quarantine Standards and Technical Regulations
Source
Statistics and Computing | 2024 / Vol. 34
Keywords
High-dimensional data; Correlated effects; Elastic-net; Iterative algorithm; EnMSP;
DOI
Not available
Abstract
To improve the estimation efficiency of high-dimensional regression problems, penalized regularization is routinely used. However, accurately estimating the model remains challenging, particularly in the presence of correlated effects, wherein irrelevant covariates exhibit strong correlation with relevant ones. This situation, referred to as correlated data, poses additional complexities for model estimation. In this paper, we propose the elastic-net multi-step screening procedure (EnMSP), an iterative algorithm designed to recover sparse linear models in the context of correlated data. EnMSP uses a small repeated penalty strategy to identify truly relevant covariates in a few iterations. Specifically, in each iteration, EnMSP enhances the adaptive lasso method by adding a weighted $l_2$ penalty, which improves the selection of relevant covariates. The method is shown to select the true model and achieve the $l_2$-norm error bound under certain conditions. The effectiveness of EnMSP is demonstrated through numerical comparisons and applications in financial data.
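The abstract describes the iteration only in words, so the following is a minimal illustrative sketch rather than the authors' algorithm. It assumes an objective of the form (1/2n)||y - Xb||^2 + sum_j a_j |b_j| + (1/2) sum_j c_j b_j^2, solved by plain coordinate descent, with both penalty weights refreshed as 1 / (|b_j| + eps) after each step. The function names, the weight rule, the values of `lam1`, `lam2`, `n_steps`, and `eps`, and the simulated data are all assumptions chosen for illustration; the paper may define these quite differently.

```python
import numpy as np

def soft_threshold(z, t):
    """Scalar soft-thresholding operator used by lasso-type updates."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def weighted_enet(X, y, l1_w, l2_w, n_sweeps=200, tol=1e-8):
    """Coordinate descent for a weighted elastic-net objective (assumed form):
    (1/2n)||y - Xb||^2 + sum_j l1_w[j]*|b_j| + 0.5 * sum_j l2_w[j]*b_j^2
    """
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()                      # residual y - X @ beta
    col_sq = (X ** 2).sum(axis=0) / n     # per-column curvature
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            old = beta[j]
            # correlation of column j with the partial residual
            rho = X[:, j] @ resid / n + col_sq[j] * old
            new = soft_threshold(rho, l1_w[j]) / (col_sq[j] + l2_w[j])
            if new != old:
                resid -= X[:, j] * (new - old)
                beta[j] = new
                max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break
    return beta

def multi_step_screen(X, y, lam1=0.05, lam2=0.05, n_steps=3, eps=1e-3):
    """Hypothetical multi-step loop: refit with small base penalties,
    reweighting both penalties from the previous estimate."""
    weights = np.ones(X.shape[1])
    for _ in range(n_steps):
        beta = weighted_enet(X, y, lam1 * weights, lam2 * weights)
        # small |beta_j| -> large weight -> heavier penalty next step
        weights = 1.0 / (np.abs(beta) + eps)
    return beta

# Simulated correlated design: column 3 is irrelevant but correlated with column 0.
rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.standard_normal((n, p))
X[:, 3] = 0.8 * X[:, 0] + 0.6 * rng.standard_normal(n)
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 2.0]
y = X @ beta_true + 0.5 * rng.standard_normal(n)

beta_hat = multi_step_screen(X, y)
support = np.flatnonzero(beta_hat)
print("selected covariates:", support)
```

In this sketch the reweighting is what performs the screening: a covariate whose estimate is near zero after one step receives a much larger penalty in the next, while covariates with large estimates are penalized only lightly.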