On constrained and regularized high-dimensional regression

Cited by: 55

Authors:

Shen, Xiaotong ^{[1]}

Pan, Wei ^{[2]}

Zhu, Yunzhang ^{[1]}

Zhou, Hui ^{[2]}

Affiliations:

[1] Univ Minnesota, Sch Stat, Minneapolis, MN 55455 USA

[2] Univ Minnesota, Div Biostat, Minneapolis, MN 55455 USA

Source:

ANNALS OF THE INSTITUTE OF STATISTICAL MATHEMATICS | 2013 / Vol. 65 / No. 5

Keywords:

Constrained regression; Parametric and nonparametric models; Nonconvex regularization; Difference convex programming; (p, n) versus fixed p-asymptotics; NONCONCAVE PENALIZED LIKELIHOOD; MODEL SELECTION; VARIABLE SELECTION; LASSO;

DOI:

10.1007/s10463-012-0396-3

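Chinese Library Classification (CLC) number:

O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];

Discipline classification code: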

020208 ; 070103 ; 0714 ;

Abstract:

High-dimensional feature selection has become increasingly crucial for seeking parsimonious models in estimation. For selection consistency, we derive one necessary and sufficient condition formulated on the notion of degree of separation. The minimal degree of separation is necessary for any method to be selection consistent. At a level slightly higher than the minimal degree of separation, selection consistency is achieved by a constrained $L_0$-method and its computational surrogate, the constrained truncated $L_1$-method. This permits up to exponentially many features in the sample size. In other words, these methods are optimal in feature selection against any selection method. In contrast, their regularization counterparts, the $L_0$-regularization and truncated $L_1$-regularization methods, achieve this under slightly stronger assumptions. More importantly, sharper parameter estimation/prediction is realized through such selection, leading to minimax parameter estimation. This would otherwise be impossible in the absence of a good selection method for high-dimensional analysis.
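
For orientation, the four methods named in the abstract can be sketched as optimization problems. This is a minimal sketch under assumptions not stated in this record: a linear model with response $Y$, design matrix $X$ and coefficient vector $\beta \in \mathbb{R}^p$, a squared-error loss, and tuning parameters $K$, $\lambda$ and $\tau$ whose names are chosen here for illustration. The paper itself should be consulted for the exact loss and scaling.

```latex
\begin{align*}
\text{constrained } L_0:\quad
  & \min_{\beta}\ \tfrac{1}{2}\,\|Y - X\beta\|_2^2
    \quad \text{s.t.}\quad \sum_{j=1}^{p} I(\beta_j \neq 0) \le K, \\
\text{constrained truncated } L_1:\quad
  & \min_{\beta}\ \tfrac{1}{2}\,\|Y - X\beta\|_2^2
    \quad \text{s.t.}\quad \sum_{j=1}^{p} \min\!\left(\tfrac{|\beta_j|}{\tau},\, 1\right) \le K, \\
L_0\text{-regularization}:\quad
  & \min_{\beta}\ \tfrac{1}{2}\,\|Y - X\beta\|_2^2
    + \lambda \sum_{j=1}^{p} I(\beta_j \neq 0), \\
\text{truncated } L_1\text{-regularization}:\quad
  & \min_{\beta}\ \tfrac{1}{2}\,\|Y - X\beta\|_2^2
    + \lambda \sum_{j=1}^{p} \min\!\left(\tfrac{|\beta_j|}{\tau},\, 1\right).
\end{align*}
```

Under this sketch, $\min(|\beta_j|/\tau, 1) \to I(\beta_j \neq 0)$ as $\tau \to 0^{+}$, which is the sense in which the truncated $L_1$ function can act as a computational surrogate for the $L_0$ function.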

References

Pages: 807-832

Number of pages: 26

25 entries in total

[1] Akaike H., 1973, 2 INT S INFORM THEOR, P267

[2] Extended Bayesian information criteria for model selection with large model spaces [J].