An outer–inner linearization method for non-convex and nondifferentiable composite regularization problems

被引:0
作者
Minh Pham
Xiaodong Lin
Andrzej Ruszczyński
Yu Du
机构
[1] San Francisco State University,Department of Decision Sciences
[2] Rutgers University,Department of Management Science and Information Systems
[3] University of Colorado Denver,Business School
来源
Journal of Global Optimization | 2021年 / 81卷
关键词
Nonsmooth optimization; Non-convex loss; Non-convex regularization; Machine learning; Statistics;
D O I
暂无
中图分类号
学科分类号
摘要
We propose a new outer–inner linearization method for non-convex statistical learning problems involving non-convex structural penalties and non-convex loss. Many important statistical problems fall in this category, including the robust M-estimators, generalized linear models, and different types of structured learning problems. Our method exploits the fact that many such loss and penalty functions can be represented as compositions of smooth concave functions and nonsmooth convex functions. It linearizes the outer concave functions and solves the resulting convex, but still nonsmooth, subproblems by a special alternating linearization method. We provide proof of convergence to a stationary point of the method under mild conditions. Finally, numerical examples involving non-convex structural penalties and non-convex loss functions demonstrate the efficacy and accuracy of the method.
引用
收藏
页码:179 / 202
页数:23
相关论文
共 100 条
  • [1] Attouch H(2013)Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods Math. Program. 137 91-129
  • [2] Bolte J(2016)Multimodal task-driven dictionary learning for image classification IEEE Trans. Image Process. 25 24-38
  • [3] Svaiter BF(2016)Mocca: mirrored convex/concave optimization for nonconvex composite functions J. Mach. Learn. Res. 5056 17-5006
  • [4] Bahrampour S(2016)An algorithm for constrained one-step inversion of spectral CT data Phys. Med. Biol. 61 3784-253
  • [5] Nasrabadi NM(2011)Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection Ann. Appl. Stat. 5 232-905
  • [6] Ray A(2008)Enhancing sparsity by reweighted l1 minimization J. Fourier Anal. Appl. 14 877-3517
  • [7] Jenkins WK(2015)On the learning behavior of adaptive networks-Part I: transient analysis IEEE Trans. Inf. Theory 61 3487-1117
  • [8] Barber R(2017)A selective linearization method for multiblock convex optimization SIAM J. Optim. 27 1102-1360
  • [9] Sidky E(2001)Variable selection via nonconcave penalized likelihood and its oracle properties J. Am. Stat. Assoc. 96 1348-441
  • [10] Barber RF(2008)Sparse inverse covariance estimation with the graphical lasso Biostatistics 9 432-77