Large-scale Stochastic Optimization of NDCG Surrogates for Deep Learning with Provable Convergence

Cited by: 0
Authors
Qiu, Zi-Hao [1 ]
Hu, Quanqi [2 ]
Zhong, Yongjian [2 ]
Zhang, Lijun [1 ]
Yang, Tianbao [2 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Univ Iowa, Iowa City, IA 52242 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162 | 2022
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
NDCG, namely Normalized Discounted Cumulative Gain, is a widely used ranking metric in information retrieval and machine learning. However, efficient and provable stochastic methods for maximizing NDCG are still lacking, especially for deep models. In this paper, we propose a principled approach to optimize NDCG and its top-K variant. First, we formulate a novel compositional optimization problem for optimizing the NDCG surrogate, and a novel bilevel compositional optimization problem for optimizing the top-K NDCG surrogate. Then, we develop efficient stochastic algorithms with provable convergence guarantees for the non-convex objectives. Unlike existing NDCG optimization methods, the per-iteration complexity of our algorithms scales with the mini-batch size instead of the total number of items. To improve effectiveness for deep learning, we further propose practical strategies using initial warm-up and a stop-gradient operator. Experimental results on multiple datasets demonstrate that our methods outperform prior ranking approaches in terms of NDCG. To the best of our knowledge, this is the first time that stochastic algorithms have been proposed to optimize NDCG with a provable convergence guarantee. Our proposed methods are implemented in the LibAUC library at https://libauc.org/.
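The metric the paper targets can be computed directly when full sorting is affordable; the paper's contribution is a differentiable surrogate whose stochastic optimization avoids this cost. As a point of reference only, here is a minimal NumPy sketch of NDCG@k (function names are ours; this is the plain metric, not the paper's surrogate or algorithm):

```python
import numpy as np

def dcg_at_k(relevances, k):
    """Discounted Cumulative Gain over the top-k positions."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))  # log2(i + 1) for position i = 1..k
    return np.sum((2.0 ** rel - 1.0) / discounts)

def ndcg_at_k(true_relevances, predicted_scores, k):
    """NDCG@k: DCG of the predicted ranking, normalized by the ideal DCG."""
    order = np.argsort(predicted_scores)[::-1]       # rank items by descending score
    ranked_rel = np.asarray(true_relevances)[order]
    ideal_rel = np.sort(true_relevances)[::-1]       # best possible ordering
    ideal = dcg_at_k(ideal_rel, k)
    return dcg_at_k(ranked_rel, k) / ideal if ideal > 0 else 0.0

# A score vector that ranks items in relevance order attains NDCG = 1.
rels = [3, 2, 3, 0, 1]
print(ndcg_at_k(rels, [5, 4, 4.5, 1, 2], k=5))  # -> 1.0
```

Note the `argsort` step is non-differentiable, which is precisely why surrogate losses are needed for gradient-based training.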
Pages: 31
Related Papers
50 records total
  • [11] Large-scale Deep Learning at Baidu
    Yu, Kai
    PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13), 2013, : 2211 - 2211
  • [12] Large-Scale Stochastic Learning using GPUs
    Parnell, Thomas
    Dunner, Celestine
    Atasu, Kubilay
    Sifalakis, Manolis
    Pozidis, Haris
    2017 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2017, : 419 - 428
  • [13] Variance Counterbalancing for Stochastic Large-scale Learning
    Lagari, Pola Lydia
    Tsoukalas, Lefteri H.
    Lagaris, Isaac E.
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2020, 29 (05)
  • [14] Provable Stochastic Algorithm for Large-Scale Fully-Connected Tensor Network Decomposition
    Zheng, Wen-Jie
    Zhao, Xi-Le
    Zheng, Yu-Bang
    Huang, Ting-Zhu
    JOURNAL OF SCIENTIFIC COMPUTING, 2024, 98 (01)
  • [16] Stability and Convergence of Large-scale Stochastic-approximation Procedures
    Ladde, G. S.
    Lawrence, B. A.
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 1995, 26 (03) : 595 - 618
  • [17] Embedding Optimization for Training Large-scale Deep Learning Recommendation Systems with EMBark
    Liu, Shijie
    Zheng, Nan
    Kang, Hui
    Simmons, Xavier
    Zhang, Junjie
    Langer, Matthias
    Zhu, Wenjing
    Lee, Minseok
    Wang, Zehuan
    PROCEEDINGS OF THE EIGHTEENTH ACM CONFERENCE ON RECOMMENDER SYSTEMS, RECSYS 2024, 2024, : 622 - 632
  • [18] Large-scale transport simulation by deep learning
    Pan, Jie
    Nature Computational Science, 2021, 1 : 306 - 306
  • [19] Learning Deep Representation with Large-scale Attributes
    Ouyang, Wanli
    Li, Hongyang
    Zeng, Xingyu
    Wang, Xiaogang
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 1895 - 1903
  • [20] Large-scale Pollen Recognition with Deep Learning
    de Geus, Andre R.
    Barcelos, Celia A. Z.
    Batista, Marcos A.
    da Silva, Sergio F.
    2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019