Sparse optimization on measures with over-parameterized gradient descent

Cited by: 27
Author
Chizat, Lenaic [1]
Affiliation
[1] Univ Paris Saclay, Lab Math Orsay, CNRS, F-91405 Orsay, France
Keywords
EMPIRICAL MEASURES; TRANSPORT; INEQUALITIES; CONVERGENCE; DISTANCE;
DOI
10.1007/s10107-021-01636-z
Chinese Library Classification (CLC)
TP31 [Computer software]
Discipline codes
081202; 0835
Abstract
Minimizing a convex function of a measure with a sparsity-inducing penalty is a typical problem arising, e.g., in sparse spikes deconvolution or two-layer neural network training. We show that this problem can be solved by discretizing the measure and running non-convex gradient descent on the positions and weights of the particles. For measures on a d-dimensional manifold and under some non-degeneracy assumptions, this leads to a global optimization algorithm with a complexity scaling as log(1/ε) in the desired accuracy ε, instead of ε^(-d) for convex methods. The key theoretical tools are a local convergence analysis in Wasserstein space and an analysis of a perturbed mirror descent in the space of measures. Our bounds involve quantities that are exponential in d, which is unavoidable under our assumptions.
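As a rough illustration of the particle approach described in the abstract (a minimal sketch under assumptions of our own, not the paper's algorithm or its Wasserstein/mirror-descent geometry), the NumPy snippet below treats a toy 1-D sparse deconvolution problem: a nonnegative measure is discretized into many weighted particles, the weights are reparameterized as w_i = r_i^2 so that the total-variation penalty becomes smooth, and plain gradient descent is run jointly on the positions x_i and the square-root weights r_i. The Gaussian kernel, step sizes, regularization strength, and ground-truth spikes are all arbitrary illustrative choices.

```python
import numpy as np

# Illustrative sketch of over-parameterized particle gradient descent for
# sparse deconvolution on [0, 1].  This is NOT the paper's exact algorithm;
# kernel, step sizes, regularization and ground-truth spikes are arbitrary.

SIGMA = 0.05                               # width of the observation kernel

def kernel(t, x):
    """Gaussian forward kernel phi(t, x), shape (len(t), len(x))."""
    return np.exp(-(t[:, None] - x[None, :]) ** 2 / (2 * SIGMA ** 2))

# Ground truth: a nonnegative measure made of 3 spikes, observed through phi.
t = np.linspace(0.0, 1.0, 100)             # measurement grid
s_true = np.array([0.25, 0.50, 0.80])      # true spike locations
a_true = np.array([1.0, 0.7, 0.5])         # true spike amplitudes (>= 0)
y = kernel(t, s_true) @ a_true             # noiseless observations

# Over-parameterization: m particles, with weights w_i = r_i**2 so that the
# total-variation penalty lam * sum_i w_i is a smooth function of r.
m, lam, n_iter = 50, 0.01, 5000
lr_r, lr_x = 0.2, 0.05                     # step sizes for weights / positions
x = np.linspace(0.0, 1.0, m)               # particle positions
r = np.full(m, 0.1)                        # sqrt-weights, small but nonzero

for _ in range(n_iter):
    Phi = kernel(t, x)                     # features at current positions
    w = r ** 2
    residual = Phi @ w - y
    # Objective: 0.5/n * ||Phi w - y||^2 + lam * sum_i w_i,  with w = r**2.
    grad_r = 2.0 * r * (Phi.T @ residual / len(t) + lam)
    # d phi / d x_i = phi * (t - x_i) / SIGMA**2 for the Gaussian kernel.
    dPhi_dx = Phi * (t[:, None] - x[None, :]) / SIGMA ** 2
    grad_x = w * (dPhi_dx.T @ residual) / len(t)
    r -= lr_r * grad_r
    x -= lr_x * grad_x

# Particles still carrying non-negligible mass should concentrate around the
# true spike locations; the rest are driven to zero by the penalty.
keep = r ** 2 > 1e-3
print("recovered positions:", np.round(x[keep], 3))
print("recovered weights  :", np.round(r[keep] ** 2, 3))
```

Under this reparameterization, gradient descent on r induces multiplicative updates on the weights w = r^2, which is one simple way to mimic the mirror-descent-like behaviour mentioned in the abstract; particles with negligible mass are thresholded out at the end.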
Pages: 487-532
Number of pages: 46