Large-scale Stochastic Optimization of NDCG Surrogates for Deep Learning with Provable Convergence

Cited by: 0
Authors
Qiu, Zi-Hao [1]
Hu, Quanqi [2 ]
Zhong, Yongjian [2 ]
Zhang, Lijun [1 ]
Yang, Tianbao [2 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Univ Iowa, Iowa City, IA 52242 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162 | 2022
Keywords
(none)
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
NDCG, namely Normalized Discounted Cumulative Gain, is a widely used ranking metric in information retrieval and machine learning. However, efficient and provable stochastic methods for maximizing NDCG are still lacking, especially for deep models. In this paper, we propose a principled approach to optimizing NDCG and its top-K variant. First, we formulate a novel compositional optimization problem for optimizing the NDCG surrogate, and a novel bilevel compositional optimization problem for optimizing the top-K NDCG surrogate. Then, we develop efficient stochastic algorithms with provable convergence guarantees for the non-convex objectives. Unlike existing NDCG optimization methods, the per-iteration complexity of our algorithms scales with the mini-batch size instead of the total number of items. To improve effectiveness for deep learning, we further propose practical strategies that use an initial warm-up and a stop-gradient operator. Experimental results on multiple datasets demonstrate that our methods outperform prior ranking approaches in terms of NDCG. To the best of our knowledge, this is the first work to propose stochastic algorithms that optimize NDCG with a provable convergence guarantee. Our proposed methods are implemented in the LibAUC library at https://libauc.org/.
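For context, the standard definition of NDCG that the abstract refers to is sketched below in LaTeX (standard notation, not copied from the paper):

```latex
% NDCG for a single query with N items, relevance labels y_i,
% and model scores s_i = h_w(x_i); r(i) is the rank of item i:
%   r(i) = 1 + \sum_{j \neq i} \mathbb{1}[s_j > s_i]
\mathrm{DCG}(\mathbf{w}) = \sum_{i=1}^{N} \frac{2^{y_i} - 1}{\log_2\bigl(1 + r(i)\bigr)},
\qquad
\mathrm{NDCG}(\mathbf{w}) = \frac{\mathrm{DCG}(\mathbf{w})}{\mathrm{IDCG}}
```

Here IDCG is the DCG of the ideal ordering (items sorted by relevance), and the top-K variant restricts the sum to the K highest-ranked items. Since r(i) is itself a sum over all N items, a smooth NDCG surrogate has the compositional form f(g(w)) with g an inner average over items; a naive mini-batch gradient of such a composition is biased, which is the difficulty the compositional and bilevel formulations in the abstract address.

The PyTorch sketch below shows a common sigmoid-rank NDCG surrogate of this kind. It is a minimal illustration of the objective family, not the authors' stochastic compositional algorithm; the function name smoothed_ndcg_loss and the temperature parameter tau are illustrative assumptions:

```python
import torch

def smoothed_ndcg_loss(scores: torch.Tensor, labels: torch.Tensor,
                       tau: float = 1.0) -> torch.Tensor:
    """Negative smoothed NDCG for one query (illustrative sketch only).

    The hard rank r(i) = 1 + sum_{j != i} 1[s_j > s_i] is replaced by a
    sigmoid approximation; this is NOT the exact surrogate of the paper.
    """
    # diff[i, j] = s_j - s_i, so sigmoid(diff / tau) smooths 1[s_j > s_i].
    diff = scores.unsqueeze(0) - scores.unsqueeze(1)
    # The sum over j includes j = i, contributing sigmoid(0) = 0.5;
    # the 0.5 offset below recovers r(i) ~ 1 + sum_{j != i} sigmoid(.).
    ranks = 0.5 + torch.sigmoid(diff / tau).sum(dim=1)
    gains = torch.pow(2.0, labels.to(scores.dtype)) - 1.0
    dcg = (gains / torch.log2(1.0 + ranks)).sum()
    # Ideal DCG: gains sorted in decreasing order at ranks 1..N (constant
    # with respect to the scores, so no gradient flows through it).
    ideal_gains, _ = torch.sort(gains, descending=True)
    positions = torch.arange(1, labels.numel() + 1,
                             dtype=scores.dtype, device=scores.device)
    idcg = (ideal_gains / torch.log2(1.0 + positions)).sum()
    return -dcg / idcg  # minimize the negative surrogate


# Usage sketch: scores over all items of one query from a model.
# loss = smoothed_ndcg_loss(model(features), labels); loss.backward()
```

Note that this sketch sums over all N items per step; the abstract's key point is precisely that the proposed algorithms avoid this, with per-iteration cost scaling in the mini-batch size.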
Pages: 31
Related papers
(50 records in total)
  • [31] Distributed Computational Framework for Large-Scale Stochastic Convex Optimization
    Rostampour, Vahab
    Keviczky, Tamas
    ENERGIES, 2021, 14 (01)
  • [32] LARGE-SCALE NONCONVEX STOCHASTIC OPTIMIZATION BY DOUBLY STOCHASTIC SUCCESSIVE CONVEX APPROXIMATION
    Mokhtari, Aryan
    Koppel, Alec
    Scutari, Gesualdo
    Ribeiro, Alejandro
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4701 - 4705
  • [33] New scenario decomposition method for large-scale stochastic optimization
    Mulvey, John M.
    Ruszczynski, Andrzej
    OPERATIONS RESEARCH, 1995, 43 (03) : 477 - 490
  • [34] Provable Super-Convergence With a Large Cyclical Learning Rate
    Oymak, Samet
    IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 1645 - 1649
  • [36] A STOCHASTIC QUASI-NEWTON METHOD FOR LARGE-SCALE OPTIMIZATION
    Byrd, R. H.
    Hansen, S. L.
    Nocedal, J.
    Singer, Y.
    SIAM JOURNAL ON OPTIMIZATION, 2016, 26 (02) : 1008 - 1031
  • [37] Deep learning for the large-scale cancer data analysis
    Tsuji, Shingo
    Aburatani, Hiroyuki
    CANCER RESEARCH, 2015, 75 (22)
  • [38] Deep Reinforcement Learning for Large-Scale Epidemic Control
    Libin, Pieter J. K.
    Moonens, Arno
    Verstraeten, Timothy
    Perez-Sanjines, Fabian
    Hens, Niel
    Lemey, Philippe
    Nowe, Ann
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: APPLIED DATA SCIENCE AND DEMO TRACK, ECML PKDD 2020, PT V, 2021, 12461 : 155 - 170
  • [39] Deep learning large-scale drug discovery and repurposing
    Yu, Min
    Li, Weiming
    Yu, Yunru
    Zhao, Yu
    Xiao, Lizhi
    Lauschke, Volker M.
    Cheng, Yiyu
    Zhang, Xingcai
    Wang, Yi
    NATURE COMPUTATIONAL SCIENCE, 2024, 4 (08): : 600 - 614
  • [40] HammingMesh: A Network Topology for Large-Scale Deep Learning
    Hoefler, Torsten
    Bonato, Tommaso
    De Sensi, Daniele
    Di Girolamo, Salvatore
    Li, Shigang
    Heddes, Marco
    Belk, Jon
    Goel, Deepak
    Castro, Miguel
    Scott, Steve
    SC22: INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2022