Convergence analysis of stochastic higher-order majorization-minimization algorithms

Times Cited: 1
Authors
Lupu, Daniela [1]
Necoara, Ion [2]
Affiliations
[1] University Politehnica of Bucharest, Automatic Control and Systems Engineering Department, Bucharest, Romania
[2] Romanian Academy, Gheorghe Mihoc - Caius Iacob Institute of Mathematical Statistics and Applied Mathematics, Bucharest 050711, Romania
Keywords
Finite-sum optimization; majorization-minimization; stochastic higher-order algorithms; minibatch; convergence rates; Newton method; optimization
DOI
10.1080/10556788.2023.2256447
Chinese Library Classification
TP31 [Computer software]
Subject Classification Codes
081202; 0835
Abstract
Majorization-minimization schemes are a broad class of iterative methods targeting general optimization problems, including nonconvex, nonsmooth and stochastic ones. These algorithms successively minimize a sequence of upper bounds of the objective function, so that the objective value decreases along the iterations. We present a stochastic higher-order algorithmic framework for minimizing the average of a very large number of sufficiently smooth functions. Our stochastic framework is based on the notion of stochastic higher-order upper bound approximations of the finite-sum objective function and on minibatching. We derive convergence results for nonconvex and convex optimization problems when the higher-order approximation of the objective function yields an error that is p times differentiable and has a Lipschitz continuous pth derivative. More precisely, for general nonconvex problems we present asymptotic stationary point guarantees, and under the Kurdyka-Lojasiewicz property we derive local convergence rates ranging from sublinear to linear. For convex problems with a uniformly convex objective function, we derive local (super)linear convergence results for our algorithm. Compared to existing stochastic (first-order) methods, our algorithm adapts to the problem's curvature and allows using any batch size. Preliminary numerical tests support the effectiveness of our algorithmic framework.
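As an illustration of the framework described in the abstract, the following sketch (ours, not the authors' code) shows what a p = 2 instance of such a stochastic majorization-minimization step can look like on a toy l2-regularized logistic regression problem: at each iteration a minibatch is sampled, a second-order Taylor model of the minibatch loss is formed, a cubic term (M/6)*||d||^3 turns it into an upper bound, and the resulting surrogate is minimized inexactly. The toy data, the constants M and batch_size, and the use of SciPy's L-BFGS-B solver for the surrogate subproblem are illustrative assumptions.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy finite-sum problem: l2-regularized logistic regression, f(x) = (1/N) sum_i f_i(x).
N, dim, reg = 1000, 20, 1e-3
A = rng.standard_normal((N, dim))
b = rng.choice([-1.0, 1.0], size=N)

def loss_grad_hess(x, idx):
    """Loss, gradient and Hessian of the minibatch indexed by `idx`."""
    Ai, bi = A[idx], b[idx]
    z = bi * (Ai @ x)
    s = 0.5 * (1.0 - np.tanh(0.5 * z))           # sigmoid(-z), numerically stable
    f = np.mean(np.logaddexp(0.0, -z)) + 0.5 * reg * (x @ x)
    g = -(Ai.T @ (bi * s)) / len(idx) + reg * x
    H = (Ai.T * (s * (1.0 - s))) @ Ai / len(idx) + reg * np.eye(dim)
    return f, g, H

def mm_step(x, M=10.0, batch_size=64):
    """One stochastic second-order MM step: build the cubic-regularized Taylor
    model of the minibatch loss at x and minimize it (inexactly)."""
    idx = rng.choice(N, size=batch_size, replace=False)
    f, g, H = loss_grad_hess(x, idx)

    def surrogate(d):
        # Second-order Taylor model plus cubic regularization (upper bound for large enough M).
        return f + g @ d + 0.5 * d @ H @ d + (M / 6.0) * np.linalg.norm(d) ** 3

    d = minimize(surrogate, np.zeros(dim), method="L-BFGS-B").x
    return x + d

x = np.zeros(dim)
for _ in range(50):
    x = mm_step(x)

f_full, g_full, _ = loss_grad_hess(x, np.arange(N))
print(f"final objective = {f_full:.4f}, ||grad|| = {np.linalg.norm(g_full):.2e}")

With an exact minibatch Hessian this step is a subsampled cubic-regularized Newton step; the paper's framework covers general pth-order upper bounds and arbitrary batch sizes, which this sketch does not reproduce.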
Pages: 384-413
Number of pages: 30