Large-scale Stochastic Optimization of NDCG Surrogates for Deep Learning with Provable Convergence

Cited by: 0
Authors
Qiu, Zi-Hao [1]
Hu, Quanqi [2]
Zhong, Yongjian [2]
Zhang, Lijun [1]
Yang, Tianbao [2]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Univ Iowa, Iowa City, IA 52242 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162 | 2022
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
NDCG, namely Normalized Discounted Cumulative Gain, is a widely used ranking metric in information retrieval and machine learning. However, efficient and provable stochastic methods for maximizing NDCG are still lacking, especially for deep models. In this paper, we propose a principled approach to optimize NDCG and its top-K variant. First, we formulate a novel compositional optimization problem for optimizing the NDCG surrogate, and a novel bilevel compositional optimization problem for optimizing the top-K NDCG surrogate. Then, we develop efficient stochastic algorithms with provable convergence guarantees for the non-convex objectives. Unlike existing NDCG optimization methods, the per-iteration complexity of our algorithms scales with the mini-batch size instead of the total number of items. To improve effectiveness for deep learning, we further propose practical strategies based on an initial warm-up and a stop-gradient operator. Experimental results on multiple datasets demonstrate that our methods outperform prior ranking approaches in terms of NDCG. To the best of our knowledge, this is the first time that stochastic algorithms with provable convergence guarantees have been proposed for optimizing NDCG. Our methods are implemented in the LibAUC library at https://libauc.org/.
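To make the metric concrete: NDCG@K is the discounted cumulative gain of the top-K items under the model's predicted ordering, normalized by the DCG of the ideal (relevance-sorted) ordering. Below is a minimal NumPy sketch of the metric itself; it is not the paper's surrogate nor the LibAUC API, and the function names are illustrative.

```python
import numpy as np

def dcg_at_k(relevances, k):
    """DCG@K = sum_{i=1..K} (2^rel_i - 1) / log2(i + 1)."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))  # log2(i + 1) for positions i = 1..K
    return np.sum((2.0 ** rel - 1.0) / discounts)

def ndcg_at_k(scores, relevances, k):
    """NDCG@K: DCG of items sorted by predicted score, divided by the ideal DCG."""
    order = np.argsort(scores)[::-1]                # rank items by model score, descending
    dcg = dcg_at_k(np.asarray(relevances)[order], k)
    ideal = dcg_at_k(np.sort(relevances)[::-1], k)  # DCG of the best possible ordering
    return dcg / ideal if ideal > 0 else 0.0

# Example: the model under-scores item 0 (relevance 2), so NDCG@3 < 1.
print(ndcg_at_k(scores=[0.2, 0.9, 0.5], relevances=[2, 0, 1], k=3))  # ~0.587
```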
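The argsort above is what makes NDCG non-differentiable in the scores and motivates the smooth surrogates the abstract refers to. As a rough illustration of the general idea only, the sketch below replaces each item's hard rank with a sigmoid-based soft rank; this is a generic surrogate under assumed conventions, not the paper's exact compositional or bilevel formulation, and soft_ndcg_loss and tau are hypothetical names. Note that this naive version forms all pairwise score differences for a query, whereas the paper's algorithms keep per-iteration cost proportional to the mini-batch size.

```python
import torch

def soft_ndcg_loss(scores, relevances, tau=1.0):
    """Negative smoothed NDCG for one query: each item's hard rank is replaced
    by a sigmoid-based soft rank so the objective is differentiable in scores."""
    relevances = relevances.float()
    n = scores.numel()
    # soft_rank_i = 1 + sum_{j != i} sigmoid((s_j - s_i) / tau)
    diff = (scores.unsqueeze(1) - scores.unsqueeze(0)) / tau  # diff[j, i] = s_j - s_i
    pair = torch.sigmoid(diff)
    pair = pair - torch.diag(torch.diagonal(pair))  # drop the j == i term (sigmoid(0) = 0.5)
    soft_rank = 1.0 + pair.sum(dim=0)
    gains = torch.pow(2.0, relevances) - 1.0
    dcg = (gains / torch.log2(1.0 + soft_rank)).sum()
    # The ideal DCG is constant w.r.t. the scores, so it only rescales the loss.
    ideal_rank = torch.arange(1, n + 1, dtype=scores.dtype)
    ideal = (gains.sort(descending=True).values / torch.log2(1.0 + ideal_rank)).sum()
    return -dcg / ideal

# Usage: gradients flow through the soft ranks into the scores.
scores = torch.randn(5, requires_grad=True)
labels = torch.tensor([2.0, 0.0, 1.0, 0.0, 3.0])
soft_ndcg_loss(scores, labels).backward()
print(scores.grad)
```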
Pages: 31