An outer–inner linearization method for non-convex and nondifferentiable composite regularization problems

被引:0
作者
Minh Pham
Xiaodong Lin
Andrzej Ruszczyński
Yu Du
机构
[1] San Francisco State University,Department of Decision Sciences
[2] Rutgers University,Department of Management Science and Information Systems
[3] University of Colorado Denver,Business School
来源
Journal of Global Optimization | 2021年 / 81卷
关键词
Nonsmooth optimization; Non-convex loss; Non-convex regularization; Machine learning; Statistics;
D O I
暂无
中图分类号
学科分类号
摘要
We propose a new outer–inner linearization method for non-convex statistical learning problems involving non-convex structural penalties and non-convex loss. Many important statistical problems fall in this category, including the robust M-estimators, generalized linear models, and different types of structured learning problems. Our method exploits the fact that many such loss and penalty functions can be represented as compositions of smooth concave functions and nonsmooth convex functions. It linearizes the outer concave functions and solves the resulting convex, but still nonsmooth, subproblems by a special alternating linearization method. We provide proof of convergence to a stationary point of the method under mild conditions. Finally, numerical examples involving non-convex structural penalties and non-convex loss functions demonstrate the efficacy and accuracy of the method.
引用
收藏
页码:179 / 202
页数:23
相关论文
共 100 条
  • [11] Sidky EY(2016)A unified algorithmic framework for block-structured optimization involving big data: with applications in machine learning and signal processing IEEE Signal Process. Mag. 33 57-101
  • [12] Schmidt TG(1964)Robust estimation of a location parameter Ann. Math. Stat. 35 73-172
  • [13] Pan X(1999)Proximal decomposition via alternating linearization SIAM J. Optim. 9 153-3481
  • [14] Breheny P(2014)Alternating linearization for structured regularization problems J. Mach. Learn. Res. 15 3447-896
  • [15] Huang J(2015)Statistical consistency and asymptotic normality for high-dimensional robust m-estimators Ann. Stat. 45 866-307
  • [16] Candes E(2014)Iterative reweighted minimization methods for lp regularization unconstrained nonlinear programming Math. Program. 147 277-855
  • [17] Wakin M(2015)Incremental majorization-minimization optimization with application to large-scale machine learning SIAM J. Optim. 25 829-60
  • [18] Boyd M(2010)Online learning for matrix factorization and sparse coding J. Mach. Learn. Res. 11 19-2485
  • [19] Chen J(2013)Supervised feature selection in graphs with path coding penalties and network flows J. Mach. Learn. Res. 14 2449-1138
  • [20] Sayed AH(2011)Sparsenet: coordinate descent with non-convex penalties J. Am. Stat. Assoc. 106 1125-71