Large-scale Stochastic Optimization of NDCG Surrogates for Deep Learning with Provable Convergence

Times Cited: 0
Authors
Qiu, Zi-Hao [1 ]
Hu, Quanqi [2 ]
Zhong, Yongjian [2 ]
Zhang, Lijun [1 ]
Yang, Tianbao [2 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Univ Iowa, Iowa City, IA 52242 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162 | 2022
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
NDCG (Normalized Discounted Cumulative Gain) is a widely used ranking metric in information retrieval and machine learning. However, efficient and provable stochastic methods for maximizing NDCG are still lacking, especially for deep models. In this paper, we propose a principled approach to optimize NDCG and its top-K variant. First, we formulate a novel compositional optimization problem for optimizing the NDCG surrogate, and a novel bilevel compositional optimization problem for optimizing the top-K NDCG surrogate. Then, we develop efficient stochastic algorithms with provable convergence guarantees for the non-convex objectives. Unlike existing NDCG optimization methods, the per-iteration complexity of our algorithms scales with the mini-batch size instead of the total number of items. To improve effectiveness for deep learning, we further propose practical strategies that use an initial warm-up and a stop-gradient operator. Experimental results on multiple datasets demonstrate that our methods outperform prior ranking approaches in terms of NDCG. To the best of our knowledge, this is the first time that stochastic algorithms with a provable convergence guarantee have been proposed for optimizing NDCG. Our proposed methods are implemented in the LibAUC library at https://libauc.org/.
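To make the optimization target concrete, the sketch below shows one common way to build a differentiable NDCG surrogate: the discrete rank of each item is replaced by a sigmoid relaxation of pairwise score comparisons, so the surrogate can be maximized with ordinary stochastic gradients. This is only an illustrative sketch, not the paper's SONG/K-SONG algorithms and not the LibAUC API; the function names (`dcg`, `smooth_ndcg_surrogate`) and the `temperature` parameter are assumptions introduced here for clarity.

```python
import torch


def dcg(relevance: torch.Tensor, ranks: torch.Tensor) -> torch.Tensor:
    """Discounted cumulative gain for one query, given (possibly fractional) ranks."""
    gains = torch.pow(2.0, relevance) - 1.0
    discounts = torch.log2(1.0 + ranks)
    return (gains / discounts).sum()


def smooth_ndcg_surrogate(scores: torch.Tensor,
                          relevance: torch.Tensor,
                          temperature: float = 1.0) -> torch.Tensor:
    """Differentiable NDCG surrogate for one query.

    scores:    (n,) model scores for the candidate items of a query
    relevance: (n,) graded relevance labels (e.g. 0, 1, 2, ...)
    """
    # Smooth the discrete rank r(i) = 1 + #{j != i : s_j > s_i}
    # with a sigmoid relaxation of each pairwise comparison.
    diff = scores.unsqueeze(0) - scores.unsqueeze(1)           # diff[i, j] = s_j - s_i
    pairwise = torch.sigmoid(diff / temperature)
    soft_ranks = 1.0 + pairwise.sum(dim=1) - torch.diagonal(pairwise)

    # Ideal DCG uses the true relevance order with exact integer ranks 1..n.
    n = relevance.numel()
    ideal_relevance, _ = torch.sort(relevance, descending=True)
    ideal_ranks = torch.arange(1, n + 1, dtype=scores.dtype, device=scores.device)
    idcg = dcg(ideal_relevance, ideal_ranks)

    return dcg(relevance, soft_ranks) / idcg.clamp_min(1e-12)


if __name__ == "__main__":
    torch.manual_seed(0)
    scores = torch.randn(8, requires_grad=True)        # stand-in for model outputs
    relevance = torch.randint(0, 3, (8,)).float()       # graded labels in {0, 1, 2}
    loss = -smooth_ndcg_surrogate(scores, relevance)    # maximize NDCG <=> minimize its negative
    loss.backward()
    print(f"surrogate NDCG = {-loss.item():.4f}, grad shape = {tuple(scores.grad.shape)}")
```

Note that the rank relaxation sums over all items of a query, which is exactly the cost the paper addresses: its compositional (and, for top-K, bilevel compositional) formulation lets the per-iteration cost scale with the mini-batch size rather than with this full sum, which the naive sketch above does not attempt.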
Pages: 31
Related Papers
50 records in total
  • [41] On Efficient Training of Large-Scale Deep Learning Models
    Shen, Li
    Sun, Yan
    Yu, Zhiyuan
    Ding, Liang
    Tian, Xinmei
    Tao, Dacheng
    ACM COMPUTING SURVEYS, 2025, 57 (03)
  • [42] Stochastic Sequential Minimal Optimization for Large-Scale Linear SVM
    Peng, Shili
    Hu, Qinghua
    Dang, Jianwu
    Peng, Zhichao
    NEURAL INFORMATION PROCESSING, ICONIP 2017, PT I, 2017, 10634 : 279 - 288
  • [43] HammingMesh: A Network Topology for Large-Scale Deep Learning
    Hoefler, Torsten
    Bonato, Tommaso
    De Sensi, Daniele
    Di Girolamo, Salvatore
    Li, Shigang
    Heddes, Marco
    Goel, Deepak
    Castro, Miguel
    Scott, Steve
    COMMUNICATIONS OF THE ACM, 2024, 67 (12) : 97 - 105
  • [44] A Class of Parallel Doubly Stochastic Algorithms for Large-Scale Learning
    Mokhtari, Aryan
    Koppel, Alec
    Takac, Martin
    Ribeiro, Alejandro
    JOURNAL OF MACHINE LEARNING RESEARCH, 2020, 21
  • [45] Adaptive Powerball Stochastic Conjugate Gradient for Large-Scale Learning
    Yang, Zhuang
    IEEE TRANSACTIONS ON BIG DATA, 2023, 9 (06) : 1598 - 1606
  • [46] Painless Stochastic Conjugate Gradient for Large-Scale Machine Learning
    Yang, Zhuang
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (10) : 14645 - 14658
  • [47] Hybrid systems: Convergence and stability analysis of stochastic large-scale approximation schemes
    Ladde, GS
    DYNAMIC SYSTEMS AND APPLICATIONS, 2004, 13 (3-4) : 487 - 511
  • [48] GECCO 2023 Tutorial: Large-Scale Optimization and Learning
    Omidvar, Nabi
    Sun, Yuan
    Li, Xiaodong
    PROCEEDINGS OF THE 2023 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION, GECCO 2023 COMPANION, 2023, : 1477 - 1502
  • [49] THE EFFECT OF LARGE-SCALE CONVERGENCE ON THE GENERATION AND MAINTENANCE OF DEEP MOIST CONVECTION
    CROOK, NA
    MONCRIEFF, MW
    JOURNAL OF THE ATMOSPHERIC SCIENCES, 1988, 45 (23) : 3606 - 3624
  • [50] A global convergence analysis of an algorithm for large-scale nonlinear optimization problems
    Boggs, PT
    Kearsley, AJ
    Tolle, JW
    SIAM JOURNAL ON OPTIMIZATION, 1999, 9 (04) : 833 - 862