Elastic-net regularization in learning theory

Times cited: 204
Authors
De Mol, Christine [2 ,3 ]
De Vito, Ernesto [4 ,5 ]
Rosasco, Lorenzo [1 ,6 ]
Affiliations
[1] MIT, Ctr Biol & Computat Learning, Cambridge, MA 02139 USA
[2] Univ Libre Bruxelles, Dept Math, B-1050 Brussels, Belgium
[3] Univ Libre Bruxelles, ECARES, B-1050 Brussels, Belgium
[4] Univ Genoa, Dipartimento Sci Architettura, I-16123 Genoa, Italy
[5] Ist Nazl Fis Nucl, Sez Genova, I-16146 Genoa, Italy
[6] Univ Genoa, Dipartimento Informat & Sci Informaz, I-16146 Genoa, Italy
Keywords
Learning; Regularization; Sparsity; Elastic net; Adaptive estimation; Model selection; Vector; Algorithms; Regression; Lasso
DOI
10.1016/j.jco.2009.01.002
Chinese Library Classification
TP301 [Theory and Methods]
Subject Classification Code
081202
Abstract
Within the framework of statistical learning theory, we analyze in detail the so-called elastic-net regularization scheme proposed by Zou and Hastie [H. Zou, T. Hastie, Regularization and variable selection via the elastic net, J. R. Stat. Soc. Ser. B 67(2) (2005) 301-320] for the selection of groups of correlated variables. To investigate the statistical properties of this scheme, and in particular its consistency properties, we set up a suitable mathematical framework. Our setting is random-design regression, where we allow the response variable to be vector-valued and consider prediction functions that are linear combinations of elements (features) of an infinite-dimensional dictionary. Under the assumption that the regression function admits a sparse representation on the dictionary, we prove that there exists a particular "elastic-net representation" of the regression function such that, as the number of data points increases, the elastic-net estimator is consistent not only for prediction but also for variable/feature selection. Our results include finite-sample bounds and an adaptive scheme for selecting the regularization parameter. Moreover, using convex analysis tools, we derive an iterative thresholding algorithm for computing the elastic-net solution which differs from the optimization procedure originally proposed in the above-cited work. (C) 2009 Elsevier Inc. All rights reserved.
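The iterative thresholding algorithm mentioned in the abstract is not reproduced in this record. As a rough illustration of the general idea only, the sketch below implements a standard proximal-gradient (iterative soft-thresholding) scheme for the finite-dimensional elastic-net problem min_b (1/(2n))||Xb - y||^2 + lam1*||b||_1 + lam2*||b||_2^2. The (lam1, lam2) parametrization, step size, and stopping rule are generic assumptions, not the authors' exact formulation.

    import numpy as np

    def elastic_net_ista(X, y, lam1, lam2, n_iter=1000, tol=1e-8):
        """Proximal-gradient sketch for the elastic-net problem
        min_b (1/(2n))||Xb - y||^2 + lam1*||b||_1 + lam2*||b||_2^2.
        Illustrative only; not the paper's exact parametrization."""
        n, d = X.shape
        # Step size from the Lipschitz constant of the gradient
        # of the smooth least-squares term.
        L = np.linalg.norm(X, 2) ** 2 / n
        eta = 1.0 / L
        b = np.zeros(d)
        for _ in range(n_iter):
            # Gradient step on the least-squares term.
            v = b - eta * (X.T @ (X @ b - y)) / n
            # Proximal step: soft-thresholding (l1 term),
            # followed by shrinkage (l2 term).
            b_new = np.sign(v) * np.maximum(np.abs(v) - eta * lam1, 0.0)
            b_new /= 1.0 + 2.0 * eta * lam2
            if np.linalg.norm(b_new - b) <= tol * max(1.0, np.linalg.norm(b)):
                return b_new
            b = b_new
        return b

    # Tiny usage example with a sparse ground truth.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))
    b_true = np.zeros(50)
    b_true[:3] = [2.0, -1.5, 1.0]
    y = X @ b_true + 0.1 * rng.standard_normal(200)
    b_hat = elastic_net_ista(X, y, lam1=0.05, lam2=0.01)
    print("recovered support:", np.flatnonzero(np.abs(b_hat) > 1e-3))

The soft-thresholding step promotes sparsity (the l1 part of the penalty), while the subsequent shrinkage by 1/(1 + 2*eta*lam2) reflects the l2 part, which is what allows groups of correlated variables to be selected together rather than arbitrarily picking one representative.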
Pages: 201-230
Page count: 30