Large-scale Stochastic Optimization of NDCG Surrogates for Deep Learning with Provable Convergence

Times Cited: 0
Authors
Qiu, Zi-Hao [1 ]
Hu, Quanqi [2 ]
Zhong, Yongjian [2 ]
Zhang, Lijun [1 ]
Yang, Tianbao [2 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Univ Iowa, Iowa City, IA 52242 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162 | 2022
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
NDCG (Normalized Discounted Cumulative Gain) is a widely used ranking metric in information retrieval and machine learning. However, efficient and provable stochastic methods for maximizing NDCG are still lacking, especially for deep models. In this paper, we propose a principled approach to optimize NDCG and its top-K variant. First, we formulate a novel compositional optimization problem for optimizing the NDCG surrogate, and a novel bilevel compositional optimization problem for optimizing the top-K NDCG surrogate. Then, we develop efficient stochastic algorithms with provable convergence guarantees for the non-convex objectives. Unlike existing NDCG optimization methods, the per-iteration complexity of our algorithms scales with the mini-batch size instead of the total number of items. To improve effectiveness for deep learning, we further propose practical strategies that use an initial warm-up and a stop-gradient operator. Experimental results on multiple datasets demonstrate that our methods outperform prior ranking approaches in terms of NDCG. To the best of our knowledge, this is the first time that stochastic algorithms with a provable convergence guarantee have been proposed for optimizing NDCG. Our proposed methods are implemented in the LibAUC library at https://libauc.org/.
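To make the optimization target concrete, the sketch below shows one common way to build a differentiable NDCG surrogate: the discrete rank of each item is replaced by a sigmoid relaxation of pairwise score comparisons, so the surrogate can be maximized with ordinary stochastic gradients. This is only an illustrative sketch, not the paper's SONG/K-SONG algorithms and not the LibAUC API; the function names (`dcg`, `smooth_ndcg_surrogate`) and the `temperature` parameter are assumptions introduced here for clarity.

```python
import torch


def dcg(relevance: torch.Tensor, ranks: torch.Tensor) -> torch.Tensor:
    """Discounted cumulative gain for one query, given (possibly fractional) ranks."""
    gains = torch.pow(2.0, relevance) - 1.0
    discounts = torch.log2(1.0 + ranks)
    return (gains / discounts).sum()


def smooth_ndcg_surrogate(scores: torch.Tensor,
                          relevance: torch.Tensor,
                          temperature: float = 1.0) -> torch.Tensor:
    """Differentiable NDCG surrogate for one query.

    scores:    (n,) model scores for the candidate items of a query
    relevance: (n,) graded relevance labels (e.g. 0, 1, 2, ...)
    """
    # Smooth the discrete rank r(i) = 1 + #{j != i : s_j > s_i}
    # with a sigmoid relaxation of each pairwise comparison.
    diff = scores.unsqueeze(0) - scores.unsqueeze(1)           # diff[i, j] = s_j - s_i
    pairwise = torch.sigmoid(diff / temperature)
    soft_ranks = 1.0 + pairwise.sum(dim=1) - torch.diagonal(pairwise)

    # Ideal DCG uses the true relevance order with exact integer ranks 1..n.
    n = relevance.numel()
    ideal_relevance, _ = torch.sort(relevance, descending=True)
    ideal_ranks = torch.arange(1, n + 1, dtype=scores.dtype, device=scores.device)
    idcg = dcg(ideal_relevance, ideal_ranks)

    return dcg(relevance, soft_ranks) / idcg.clamp_min(1e-12)


if __name__ == "__main__":
    torch.manual_seed(0)
    scores = torch.randn(8, requires_grad=True)        # stand-in for model outputs
    relevance = torch.randint(0, 3, (8,)).float()       # graded labels in {0, 1, 2}
    loss = -smooth_ndcg_surrogate(scores, relevance)    # maximize NDCG <=> minimize its negative
    loss.backward()
    print(f"surrogate NDCG = {-loss.item():.4f}, grad shape = {tuple(scores.grad.shape)}")
```

Note that the rank relaxation sums over all items of a query, which is exactly the cost the paper addresses: its compositional (and, for top-K, bilevel compositional) formulation lets the per-iteration cost scale with the mini-batch size rather than with this full sum, which the naive sketch above does not attempt.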
Pages: 31
Related Papers
50 records in total
  • [41] On Efficient Training of Large-Scale Deep Learning Models
    Shen, Li
    Sun, Yan
    Yu, Zhiyuan
    Ding, Liang
    Tian, Xinmei
    Tao, Dacheng
    ACM COMPUTING SURVEYS, 2025, 57 (03)
  • [42] Stochastic Sequential Minimal Optimization for Large-Scale Linear SVM
    Peng, Shili
    Hu, Qinghua
    Dang, Jianwu
    Peng, Zhichao
    NEURAL INFORMATION PROCESSING, ICONIP 2017, PT I, 2017, 10634 : 279 - 288
  • [43] HammingMesh: A Network Topology for Large-Scale Deep Learning
    Hoefler, Torsten
    Bonato, Tommaso
    De Sensi, Daniele
    Di Girolamo, Salvatore
    Li, Shigang
    Heddes, Marco
    Goel, Deepak
    Castro, Miguel
    Scott, Steve
    COMMUNICATIONS OF THE ACM, 2024, 67 (12) : 97 - 105
  • [44] A Class of Parallel Doubly Stochastic Algorithms for Large-Scale Learning
    Mokhtari, Aryan
    Koppel, Alec
    Takac, Martin
    Ribeiro, Alejandro
    JOURNAL OF MACHINE LEARNING RESEARCH, 2020, 21
  • [45] Adaptive Powerball Stochastic Conjugate Gradient for Large-Scale Learning
    Yang, Zhuang
    IEEE TRANSACTIONS ON BIG DATA, 2023, 9 (06) : 1598 - 1606
  • [46] Painless Stochastic Conjugate Gradient for Large-Scale Machine Learning
    Yang, Zhuang
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (10) : 14645 - 14658
  • [47] Hybrid systems: Convergence and stability analysis of stochastic large-scale approximation schemes
    Ladde, GS
    DYNAMIC SYSTEMS AND APPLICATIONS, 2004, 13 (3-4) : 487 - 511
  • [48] GECCO 2023 Tutorial: Large-Scale Optimization and Learning
    Omidvar, Nabi
    Sun, Yuan
    Li, Xiaodong
    PROCEEDINGS OF THE 2023 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION, GECCO 2023 COMPANION, 2023, : 1477 - 1502
  • [49] THE EFFECT OF LARGE-SCALE CONVERGENCE ON THE GENERATION AND MAINTENANCE OF DEEP MOIST CONVECTION
    CROOK, NA
    MONCRIEFF, MW
    JOURNAL OF THE ATMOSPHERIC SCIENCES, 1988, 45 (23) : 3606 - 3624
  • [50] A global convergence analysis of an algorithm for large-scale nonlinear optimization problems
    Boggs, PT
    Kearsley, AJ
    Tolle, JW
    SIAM JOURNAL ON OPTIMIZATION, 1999, 9 (04) : 833 - 862