GGD: Grafting Gradient Descent

Times cited: 0
Authors
Feng, Yanjing [1 ]
Zhou, Yongdao [1 ]
Affiliations
[1] Nankai Univ, Sch Stat & Data Sci, NITFID, Tianjin 300071, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
stochastic optimization; importance sampling; minibatching; variance reduction; adaptive stepsize method; OPTIMIZATION;
DOI
Not available
CLC number
TP [automation technology, computer technology];
Discipline code
0812;
Abstract
Simple random sampling has been widely used in traditional stochastic optimization algorithms. Although a gradient sampled by simple random sampling is a descent direction in expectation, it may have relatively high variance, which causes the descent curve to wiggle and slows down the optimization process. In this paper, we propose a novel stochastic optimization method, grafting gradient descent (GGD), which grafts together minibatching and importance sampling, and we provide convergence results for GGD. We show that the grafting gradient possesses a doubly robust property, which ensures that GGD performs at least as well as the worse of SGD with importance sampling and mini-batch SGD. Combined with advanced variance reduction techniques such as stochastic variance reduced gradient (SVRG) and adaptive stepsize methods such as Adam, composite GGD-based methods and their theoretical bounds are also provided. Real data studies show that GGD achieves an intermediate performance between SGD with importance sampling and mini-batch SGD, and outperforms the original SGD method. The proposed GGD is therefore a better and more robust stochastic optimization framework in practice.
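To make the two ingredients concrete, the sketch below shows one step of importance-sampled mini-batch SGD on a least-squares objective: examples are drawn with probabilities proportional to their per-example gradient norms, and each sampled gradient is reweighted by 1/(n·p_i) so the estimate stays unbiased. This is only an illustration of the components the abstract says GGD combines, not the authors' grafting algorithm; the function name and the norm-proportional sampling rule are assumptions for the example.

```python
import numpy as np

def is_minibatch_sgd_step(w, X, y, lr=0.1, batch=4, rng=None):
    """One SGD step on 0.5 * mean((X w - y)^2), combining minibatching
    with importance sampling. Sampling probabilities p_i are proportional
    to per-example gradient norms; dividing each sampled gradient by
    n * p_i keeps the mini-batch estimate unbiased for the full gradient.
    (Illustrative sketch only, not the GGD algorithm itself.)"""
    rng = rng or np.random.default_rng(0)
    n = len(y)
    resid = X @ w - y                         # per-example residuals
    grads = resid[:, None] * X                # gradient of 0.5*(x_i.w - y_i)^2 per example
    norms = np.linalg.norm(grads, axis=1) + 1e-12
    p = norms / norms.sum()                   # importance-sampling distribution
    idx = rng.choice(n, size=batch, p=p)      # sample a mini-batch with probs p
    g = (grads[idx] / (n * p[idx, None])).mean(axis=0)  # unbiased gradient estimate
    return w - lr * g
```

With uniform p this reduces to plain mini-batch SGD, and with batch=1 to SGD with importance sampling, which is why a method interpolating between the two can inherit the better variance behavior of each.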
Pages: 87