Large-scale Stochastic Optimization of NDCG Surrogates for Deep Learning with Provable Convergence

Cited by: 0
Authors
Qiu, Zi-Hao [1 ]
Hu, Quanqi [2 ]
Zhong, Yongjian [2 ]
Zhang, Lijun [1 ]
Yang, Tianbao [2 ]
Affiliations
[1] Nanjing University, National Key Laboratory for Novel Software Technology, Nanjing, China
[2] University of Iowa, Iowa City, IA 52242, USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
Keywords: (none)
DOI: not available
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
NDCG (Normalized Discounted Cumulative Gain) is a widely used ranking metric in information retrieval and machine learning. However, efficient and provable stochastic methods for maximizing NDCG are still lacking, especially for deep models. In this paper, we propose a principled approach to optimizing NDCG and its top-K variant. First, we formulate a novel compositional optimization problem for the NDCG surrogate and a novel bilevel compositional optimization problem for the top-K NDCG surrogate. Then, we develop efficient stochastic algorithms with provable convergence guarantees for these non-convex objectives. Unlike existing NDCG optimization methods, the per-iteration complexity of our algorithms scales with the mini-batch size rather than the total number of items. To improve effectiveness for deep learning, we further propose practical strategies: an initial warm-up phase and a stop-gradient operator. Experimental results on multiple datasets demonstrate that our methods outperform prior ranking approaches in terms of NDCG. To the best of our knowledge, this is the first work to propose stochastic algorithms that optimize NDCG with a provable convergence guarantee. Our methods are implemented in the LibAUC library at https://libauc.org/.
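To make the abstract's "compositional optimization problem" concrete, here is one standard way such an NDCG surrogate is written. The notation below (scoring model h_w, query q with item set S_q, relevance labels y_i, normalizer Z_q, and a smooth pairwise surrogate ℓ such as the squared hinge) is assumed for illustration, not quoted from the paper:

```latex
% Smoothed rank of item x_i: the 0/1 indicator is replaced by \ell.
g_q(\mathbf{w}; x_i) = \sum_{x' \in S_q} \ell\big(h_\mathbf{w}(x') - h_\mathbf{w}(x_i)\big)
  \approx \sum_{x' \in S_q} \mathbb{I}\big[h_\mathbf{w}(x') \ge h_\mathbf{w}(x_i)\big]

% Maximizing the NDCG surrogate is then a two-level composition: an outer
% function of an inner average that must be estimated stochastically.
\min_{\mathbf{w}} \; -\frac{1}{|S|} \sum_{q} \sum_{i \in S_q}
  \frac{2^{y_i} - 1}{Z_q \, \log_2\!\big(1 + g_q(\mathbf{w}; x_i)\big)}
```

And a minimal PyTorch sketch of a generic sigmoid-smoothed NDCG loss (in the spirit of ApproxNDCG-style surrogates, not the paper's SONG/K-SONG algorithms; the function name, temperature tau, and the batch-level ideal DCG are illustrative assumptions). It shows why the per-iteration cost depends only on the mini-batch size B, and one place a stop-gradient can enter:

```python
import torch

def smoothed_ndcg_loss(scores: torch.Tensor, labels: torch.Tensor,
                       tau: float = 1.0) -> torch.Tensor:
    """Negative smoothed NDCG over one query's mini-batch of B items.

    scores: shape (B,), model outputs. labels: shape (B,), graded relevance.
    Every op below is O(B^2) in the batch size, independent of the total
    number of items in the collection.
    """
    # Smooth rank: rank(i) ~ 1 + sum_{j != i} sigmoid((s_j - s_i) / tau).
    diff = scores.unsqueeze(1) - scores.unsqueeze(0)   # diff[j, i] = s_j - s_i
    pair = torch.sigmoid(diff / tau)
    approx_rank = 1.0 + pair.sum(dim=0) - pair.diagonal()  # drop i == j term

    gains = torch.pow(2.0, labels) - 1.0
    dcg = (gains / torch.log2(1.0 + approx_rank)).sum()

    # Ideal DCG over the batch, held constant via .detach() -- loosely
    # mirroring the abstract's stop-gradient idea (an assumption here,
    # not the paper's exact construction).
    ideal, _ = torch.sort(gains, descending=True)
    positions = torch.arange(2.0, ideal.numel() + 2.0, device=scores.device)
    idcg = (ideal / torch.log2(positions)).sum().detach()

    return -dcg / idcg.clamp_min(1e-8)   # maximize NDCG <=> minimize -NDCG
```

Each training step then costs O(B^2) in the sampled batch size, independent of the corpus size, which is the scaling property the abstract emphasizes; whereas this sketch plugs in a naive batch estimate of the smoothed rank, the paper's algorithms obtain their convergence guarantee by estimating the inner compositional function more carefully across iterations.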
Pages: 31
Related Papers (50 in total)
  • [1] Optimal large-scale stochastic optimization of NDCG surrogates for deep learning
    Qiu, Zi-Hao
    Hu, Quanqi
    Zhong, Yongjian
    Tu, Wei-Wei
    Zhang, Lijun
    Yang, Tianbao
    MACHINE LEARNING, 2025, 114 (02)
  • [2] Improved Powered Stochastic Optimization Algorithms for Large-Scale Machine Learning
    Yang, Zhuang
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [3] Powered stochastic optimization with hypergradient descent for large-scale learning systems
    Yang, Zhuang
    Li, Xiaotian
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 238
  • [4] Adaptive step size rules for stochastic optimization in large-scale learning
    Yang, Zhuang
    Ma, Li
    STATISTICS AND COMPUTING, 2023, 33 (02)
  • [5] MEAN-NORMALIZED STOCHASTIC GRADIENT FOR LARGE-SCALE DEEP LEARNING
    Wiesler, Simon
    Richard, Alexander
    Schlueter, Ralf
    Ney, Hermann
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [6] Stability and convergence of large-scale stochastic approximation procedures
    Ladde, G. S.
    Lawrence, Bonita A.
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 1995, 26 (03): 595-618
  • [7] Doubly Stochastic Algorithms for Large-Scale Optimization
    Koppel, Alec
    Mokhtari, Aryan
    Ribeiro, Alejandro
    2016 50TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, 2016: 1705-1709
  • [8] Stochastic Optimization for Large-scale Optimal Transport
    Genevay, Aude
    Cuturi, Marco
    Peyre, Gabriel
    Bach, Francis
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [9] Designing Reconfigurable Large-Scale Deep Learning Systems Using Stochastic Computing
    Ren, Ao
    Li, Zhe
    Wang, Yanzhi
    Qiu, Qinru
    Yuan, Bo
    2016 IEEE INTERNATIONAL CONFERENCE ON REBOOTING COMPUTING (ICRC), 2016