Learning Rates for Nonconvex Pairwise Learning

Cited by: 2
Authors
Li, Shaojie [1 ]
Liu, Yong [1 ]
Affiliations
[1] Renmin Univ China, Gaoling Sch Artificial Intelligence, Beijing 100872, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Convergence; Stability analysis; Measurement; Training; Statistics; Sociology; Optimization; Generalization performance; learning rates; nonconvex optimization; pairwise learning; EMPIRICAL RISK; ALGORITHMS; STABILITY; RANKING; MINIMIZATION;
DOI
10.1109/TPAMI.2023.3259324
CLC number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Pairwise learning is receiving increasing attention because it covers many important machine learning tasks, e.g., metric learning, AUC maximization, and ranking. Investigating the generalization behavior of pairwise learning is therefore of great significance. However, existing generalization analyses mainly focus on convex objective functions, leaving nonconvex pairwise learning far less explored. Moreover, the current learning rates for pairwise learning are mostly of slower order. Motivated by these problems, we study the generalization performance of nonconvex pairwise learning and provide improved learning rates. Specifically, we develop uniform convergence results for the gradients of pairwise learning under different assumptions, based on which we characterize the empirical risk minimizer, gradient descent, and stochastic gradient descent. We first establish learning rates for these algorithms in a general nonconvex setting, where the analysis sheds light on the trade-off between optimization and generalization and on the role of early stopping. We then derive faster learning rates of order O(1/n) for nonconvex pairwise learning under a gradient dominance curvature condition, where n is the sample size. Provided that the optimal population risk is small, we further improve the learning rates to O(1/n^2), which, to the best of our knowledge, are the first O(1/n^2) rates for pairwise learning.
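For context, the abstract concerns stochastic gradient descent applied to a pairwise empirical risk, i.e., a risk averaged over pairs of training examples rather than single examples. The snippet below is a minimal sketch of that setting, not the paper's algorithm or analysis: the sigmoid AUC-style surrogate loss, the uniform pair-sampling scheme, the decaying step size, and all function names are illustrative assumptions.

# Minimal sketch (not the paper's exact algorithm): SGD on a pairwise empirical
# risk R_n(w) = (1 / (n(n-1))) * sum_{i != j} f(w; z_i, z_j).
# The pairwise loss below is a smooth, generally nonconvex AUC-style surrogate;
# the uniform pair-sampling scheme and step-size schedule are illustrative.
import numpy as np

def pairwise_loss_grad(w, xi, yi, xj, yj):
    """Gradient of a smooth sigmoid AUC surrogate f(w; z_i, z_j).

    For a positive/negative pair the loss is sigmoid(-(w^T x_i - w^T x_j)),
    which is nonconvex in w. Pairs with equal labels contribute zero.
    """
    if yi == yj:
        return np.zeros_like(w)
    if yi < yj:                            # make (xi, yi) the positive example
        xi, xj = xj, xi
    margin = w @ (xi - xj)
    s = 1.0 / (1.0 + np.exp(margin))       # sigmoid(-margin)
    return -s * (1.0 - s) * (xi - xj)      # d/dw sigmoid(-margin)

def pairwise_sgd(X, y, steps=10_000, eta=0.1, seed=0):
    """Run SGD with one uniformly sampled pair per step (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, steps + 1):
        i, j = rng.choice(n, size=2, replace=False)
        g = pairwise_loss_grad(w, X[i], y[i], X[j], y[j])
        w -= (eta / np.sqrt(t)) * g        # decaying step size; `steps` acts as early stopping
    return w

The number of SGD iterations plays the role of the early-stopping parameter discussed in the abstract's optimization/generalization trade-off.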
Pages: 9996-10011
Number of pages: 16