Learning Rates for Nonconvex Pairwise Learning

Cited by: 2
Authors
Li, Shaojie [1 ]
Liu, Yong [1 ]
Affiliations
[1] Renmin Univ China, Gaoling Sch Artificial Intelligence, Beijing 100872, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Convergence; Stability analysis; Measurement; Training; Statistics; Sociology; Optimization; Generalization performance; learning rates; nonconvex optimization; pairwise learning; EMPIRICAL RISK; ALGORITHMS; STABILITY; RANKING; MINIMIZATION;
DOI
10.1109/TPAMI.2023.3259324
CLC number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Pairwise learning is receiving increasing attention because it covers many important machine learning tasks, e.g., metric learning, AUC maximization, and ranking. Investigating the generalization behavior of pairwise learning is therefore of great significance. However, existing generalization analyses focus mainly on convex objective functions, leaving nonconvex pairwise learning far less explored. Moreover, the current learning rates for pairwise learning are mostly of a slower order. Motivated by these problems, we study the generalization performance of nonconvex pairwise learning and provide improved learning rates. Specifically, we develop uniform convergence results for the gradients of pairwise learning under different assumptions, based on which we characterize the empirical risk minimizer, gradient descent, and stochastic gradient descent. We first establish learning rates for these algorithms in a general nonconvex setting, where the analysis sheds light on the trade-off between optimization and generalization and on the role of early stopping. We then derive faster learning rates of order O(1/n) for nonconvex pairwise learning under a gradient dominance curvature condition, where n is the sample size. Provided that the optimal population risk is small, we further improve the learning rates to O(1/n^2), which, to the best of our knowledge, are the first O(1/n^2) rates for pairwise learning.
Pages: 9996-10011
Number of pages: 16
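
To make the pairwise stochastic gradient descent setting described in the abstract concrete, here is a minimal sketch (not the authors' code) of SGD for pairwise learning, using a nonconvex sigmoid-based AUC surrogate loss as a running example. The function name pairwise_sgd, the step size, and the toy data are illustrative assumptions, not the paper's setup; the key point is that each iteration samples a single pair of examples and takes a gradient step on the loss of that pair, which is the algorithmic scheme the paper's uniform gradient convergence bounds analyze.

```python
# Minimal sketch of SGD for pairwise learning with a nonconvex surrogate loss
#   l(w; z_pos, z_neg) = (1 - sigmoid(w^T (x_pos - x_neg)))^2.
# Assumed/illustrative: NumPy, pairwise_sgd, step size, and the toy data below.
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def pairwise_sgd(X, y, n_steps=20_000, eta=0.5, seed=0):
    """SGD for pairwise learning: each step samples one positive/negative pair
    and takes a gradient step on the loss of that single pair."""
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == -1)
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        i, j = rng.choice(pos), rng.choice(neg)
        diff = X[i] - X[j]                              # pairwise feature difference
        s = sigmoid(w @ diff)                           # predicted ranking score for the pair
        grad = -2.0 * s * (1.0 - s) ** 2 * diff         # gradient of (1 - s)^2 w.r.t. w
        w -= eta * grad
    return w

# Toy usage: two Gaussian classes; report the empirical AUC of the learned scorer.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.5, 1.0, (200, 5)), rng.normal(-0.5, 1.0, (200, 5))])
y = np.concatenate([np.ones(200), -np.ones(200)])
w = pairwise_sgd(X, y)
scores = X @ w
auc = np.mean(scores[y == 1][:, None] > scores[y == -1][None, :])
print(f"empirical AUC: {auc:.3f}")
```

Replacing the per-pair surrogate with any smooth, possibly nonconvex loss on a pair of examples recovers the general setting studied in the paper; the gradient-dominance and small-optimal-risk regimes discussed in the abstract are conditions on that population objective, not on this particular surrogate.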