Bilevel Learning of the Group Lasso Structure

Cited by: 0
Authors
Frecon, Jordan [1 ]
Salzo, Saverio [1 ]
Pontil, Massimiliano [1 ,2 ]
Affiliations
[1] Ist Italiano Tecnol, Computat Stat & Machine Learning, Genoa, Italy
[2] UCL, Dept Comp Sci, London, England
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018) | 2018, Vol. 31
Keywords
OPTIMIZATION; SPARSITY; REGULARIZERS;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Regression with a group-sparsity penalty plays a central role in high-dimensional prediction problems. However, most existing methods require the group structure to be known a priori. In practice, this may be too strong an assumption, potentially hampering the effectiveness of the regularization method. To circumvent this issue, we present a method to estimate the group structure by means of a continuous bilevel optimization problem in which the data is split into training and validation sets. Our approach relies on an approximation scheme where the lower-level problem is replaced by a smooth dual forward-backward algorithm with Bregman distances. We provide guarantees on the convergence of the approximate procedure to the exact problem, and demonstrate the good behavior of the proposed method in synthetic experiments. Finally, a preliminary application to gene expression data is tackled with the purpose of unveiling functional groups.
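The group lasso penalty mentioned in the abstract sums the Euclidean norms of coefficient blocks, so its proximal operator shrinks each group as a unit and zeroes out entire groups at once. As a minimal illustration of that block soft-thresholding step (not the paper's bilevel algorithm; the function name and the toy groups are my own), one might write:

```python
import numpy as np

def group_soft_threshold(w, groups, tau):
    """Proximal operator of the group lasso penalty tau * sum_g ||w_g||_2.

    Each group's coefficients are shrunk toward zero together, and a
    group is set to zero entirely when its Euclidean norm is <= tau.
    """
    out = np.zeros_like(w, dtype=float)
    for g in groups:
        norm = np.linalg.norm(w[g])
        if norm > tau:
            # Shrink the whole block by a common factor.
            out[g] = (1.0 - tau / norm) * w[g]
    return out

# Example: two groups; the second falls below the threshold and vanishes.
w = np.array([3.0, 4.0, 0.1, -0.1])
groups = [[0, 1], [2, 3]]
print(group_soft_threshold(w, groups, tau=1.0))  # → [2.4 3.2 0.  0. ]
```

The sparsity pattern this operator produces depends entirely on the chosen partition into groups, which is precisely the structure the paper proposes to learn rather than fix in advance.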
Pages: 11