Large-scale Stochastic Optimization of NDCG Surrogates for Deep Learning with Provable Convergence

Cited by: 0
Authors
Qiu, Zi-Hao [1 ]
Hu, Quanqi [2 ]
Zhong, Yongjian [2 ]
Zhang, Lijun [1 ]
Yang, Tianbao [2 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Univ Iowa, Iowa City, IA 52242 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162 | 2022
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
NDCG, namely Normalized Discounted Cumulative Gain, is a widely used ranking metric in information retrieval and machine learning. However, efficient and provable stochastic methods for maximizing NDCG are still lacking, especially for deep models. In this paper, we propose a principled approach to optimize NDCG and its top-K variant. First, we formulate a novel compositional optimization problem for optimizing the NDCG surrogate, and a novel bilevel compositional optimization problem for optimizing the top-K NDCG surrogate. Then, we develop efficient stochastic algorithms with provable convergence guarantees for the non-convex objectives. Unlike existing NDCG optimization methods, the per-iteration complexity of our algorithms scales with the mini-batch size instead of the total number of items. To improve effectiveness for deep learning, we further propose practical strategies using initial warm-up and a stop-gradient operator. Experimental results on multiple datasets demonstrate that our methods outperform prior ranking approaches in terms of NDCG. To the best of our knowledge, this is the first time that stochastic algorithms have been proposed to optimize NDCG with a provable convergence guarantee. Our proposed methods are implemented in the LibAUC library at https://libauc.org/.
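The metric the paper targets can be computed directly when full sorting is affordable; the paper's contribution is a differentiable surrogate whose stochastic optimization avoids this cost. As a point of reference only, here is a minimal NumPy sketch of NDCG@k (function names are ours; this is the plain metric, not the paper's surrogate or algorithm):

```python
import numpy as np

def dcg_at_k(relevances, k):
    """Discounted Cumulative Gain over the top-k positions."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))  # log2(i + 1) for position i = 1..k
    return np.sum((2.0 ** rel - 1.0) / discounts)

def ndcg_at_k(true_relevances, predicted_scores, k):
    """NDCG@k: DCG of the predicted ranking, normalized by the ideal DCG."""
    order = np.argsort(predicted_scores)[::-1]       # rank items by descending score
    ranked_rel = np.asarray(true_relevances)[order]
    ideal_rel = np.sort(true_relevances)[::-1]       # best possible ordering
    ideal = dcg_at_k(ideal_rel, k)
    return dcg_at_k(ranked_rel, k) / ideal if ideal > 0 else 0.0

# A score vector that ranks items in relevance order attains NDCG = 1.
rels = [3, 2, 3, 0, 1]
print(ndcg_at_k(rels, [5, 4, 4.5, 1, 2], k=5))  # -> 1.0
```

Note the `argsort` step is non-differentiable, which is precisely why surrogate losses are needed for gradient-based training.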
Pages: 31
Related Papers
50 records total
  • [11] Large-scale Deep Learning at Baidu
    Yu, Kai
    PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13), 2013, : 2211 - 2211
  • [12] Large-Scale Stochastic Learning using GPUs
    Parnell, Thomas
    Dunner, Celestine
    Atasu, Kubilay
    Sifalakis, Manolis
    Pozidis, Haris
    2017 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2017, : 419 - 428
  • [13] Variance Counterbalancing for Stochastic Large-scale Learning
    Lagari, Pola Lydia
    Tsoukalas, Lefteri H.
    Lagaris, Isaac E.
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2020, 29 (05)
  • [14] Provable Stochastic Algorithm for Large-Scale Fully-Connected Tensor Network Decomposition
    Zheng, Wen-Jie
    Zhao, Xi-Le
    Zheng, Yu-Bang
    Huang, Ting-Zhu
    JOURNAL OF SCIENTIFIC COMPUTING, 2024, 98 (01)
  • [16] Stability and Convergence of Large-scale Stochastic-approximation Procedures
    Ladde, G. S.
    Lawrence, B. A.
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 1995, 26 (03) : 595 - 618
  • [17] Embedding Optimization for Training Large-scale Deep Learning Recommendation Systems with EMBark
    Liu, Shijie
    Zheng, Nan
    Kang, Hui
    Simmons, Xavier
    Zhang, Junjie
    Langer, Matthias
    Zhu, Wenjing
    Lee, Minseok
    Wang, Zehuan
    PROCEEDINGS OF THE EIGHTEENTH ACM CONFERENCE ON RECOMMENDER SYSTEMS, RECSYS 2024, 2024, : 622 - 632
  • [18] Large-scale transport simulation by deep learning
    Pan, Jie
    Nature Computational Science, 2021, 1 : 306 - 306
  • [19] Learning Deep Representation with Large-scale Attributes
    Ouyang, Wanli
    Li, Hongyang
    Zeng, Xingyu
    Wang, Xiaogang
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 1895 - 1903
  • [20] Large-scale Pollen Recognition with Deep Learning
    de Geus, Andre R.
    Barcelos, Celia A. Z.
    Batista, Marcos A.
    da Silva, Sergio F.
    2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019