SALR: Sharpness-Aware Learning Rate Scheduler for Improved Generalization

Cited by: 0
Authors
Yue, Xubo [1 ]
Nouiehed, Maher [2 ]
Al Kontar, Raed [1 ]
Affiliations
[1] Univ Michigan, Dept Ind & Operat Engn, Ann Arbor, MI 48109 USA
[2] Amer Univ Beirut, Dept Ind Engn & Management, Beirut 1072020, Lebanon
Funding
National Science Foundation (NSF);
Keywords
Schedules; Deep learning; Neural networks; Convergence; Bayes methods; Training; Stochastic processes; generalization; learning rate schedule; sharpness;
DOI
10.1109/TNNLS.2023.3263393
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In an effort to improve generalization in deep learning and automate the process of learning rate scheduling, we propose SALR: a sharpness-aware learning rate update technique designed to recover flat minimizers. Our method dynamically updates the learning rate of gradient-based optimizers based on the local sharpness of the loss function. This allows optimizers to automatically increase learning rates at sharp valleys to increase the chance of escaping them. We demonstrate the effectiveness of SALR when adopted by various algorithms over a broad range of networks. Our experiments indicate that SALR improves generalization, converges faster, and drives solutions to significantly flatter regions.
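The abstract's core idea, scaling the learning rate with local sharpness so that sharp valleys are escaped more easily, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy loss, the random-probing sharpness proxy, and the running-average normalization are all illustrative assumptions, standing in for the paper's actual sharpness measure and update rule.

```python
import numpy as np

def loss(w):
    # Toy 1-D non-convex loss (illustration only, not from the paper).
    return np.sin(3.0 * w) * np.exp(-0.1 * w**2) + 0.05 * w**2

def grad(w, eps=1e-5):
    # Central-difference numerical gradient of the toy loss.
    return (loss(w + eps) - loss(w - eps)) / (2 * eps)

def sharpness(w, rho=0.05, n_probe=8, rng=None):
    # Local sharpness proxy: worst-case loss increase within a radius-rho
    # ball, estimated by random probing (a crude stand-in for ascent-based
    # sharpness estimates).
    rng = rng or np.random.default_rng(0)
    perturbs = rng.uniform(-rho, rho, n_probe)
    return max(0.0, max(loss(w + p) for p in perturbs) - loss(w))

def sharpness_scaled_sgd(w0, base_lr=0.1, steps=200, seed=0):
    rng = np.random.default_rng(seed)
    w = w0
    avg_sharp = sharpness(w, rng=rng) + 1e-12  # running sharpness average
    for _ in range(steps):
        s = sharpness(w, rng=rng)
        avg_sharp = 0.9 * avg_sharp + 0.1 * s
        # Raise the learning rate where the loss is locally sharp and lower
        # it where the loss is flat, relative to the running average.
        lr = base_lr * (s + 1e-12) / (avg_sharp + 1e-12)
        w = w - lr * grad(w)
    return w

w_final = sharpness_scaled_sgd(w0=2.0)
```

The key design point mirrored here is the relative scaling: the step size is large only when the current sharpness exceeds its recent history, which is what lets the optimizer jump out of sharp valleys while settling calmly into flat ones.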
Pages: 12518-12527
Page count: 10
Related Papers
50 total
  • [21] LETFORMER: LIGHTWEIGHT TRANSFORMER PRE-TRAINING WITH SHARPNESS-AWARE OPTIMIZATION FOR EFFICIENT ENCRYPTED TRAFFIC ANALYSIS
    Meng, Zhiyan
    Liu, Dan
    Meng, Jintao
    International Journal of Innovative Computing, Information and Control, 2025, 21 (02): 359 - 371
  • [22] Multi-Dataset Co-Training with Sharpness-Aware Optimization for Audio Anti-spoofing
    Shim, Hye-jin
    Jung, Jee-weon
    Kinnunen, Tomi
    INTERSPEECH 2023, 2023, : 3804 - 3808
  • [23] VI-PANN: Harnessing Transfer Learning and Uncertainty-Aware Variational Inference for Improved Generalization in Audio Pattern Recognition
    Fischer, John
    Orescanin, Marko
    Eckstrand, Eric
    IEEE ACCESS, 2024, 12 : 33347 - 33360
  • [24] Convolutional Neural Network With Automatic Learning Rate Scheduler for Fault Classification
    Wen, Long
    Gao, Liang
    Li, Xinyu
    Zeng, Bing
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2022, 71 : 13 - 13
  • [25] SCHEDTUNE: A Heterogeneity-Aware GPU Scheduler for Deep Learning
    Albahar, Hadeel
    Dongare, Shruti
    Du, Yanlin
    Zhao, Nannan
    Paul, Arnab K.
    Butt, Ali R.
    2022 22ND IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2022), 2022, : 695 - 705
  • [26] Incremental PID Controller-Based Learning Rate Scheduler for Stochastic Gradient Descent
    Wang, Zenghui
    Zhang, Jun
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (05) : 7060 - 7071
  • [27] An Energy and Temperature Aware Deep Reinforcement Learning Workflow Scheduler in Cloud Computing
    Mangalampalli, S. Sudheer
    Karri, Ganesh Reddy
    Ch, Pradeep Reddy
    Pokkuluri, Kiran Sree
    Chakrabarti, Prasun
    Chakrabarti, Tulika
    IEEE ACCESS, 2024, 12 : 163424 - 163443
  • [28] Feature Stylization and Domain-aware Contrastive Learning for Domain Generalization
    Jeon, Seogkyu
    Hong, Kibeom
    Lee, Pilhyeon
    Lee, Jewook
    Byun, Hyeran
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 22 - 31
  • [29] A New Reinforcement Learning Based Learning Rate Scheduler for Convolutional Neural Network in Fault Classification
    Wen, Long
    Li, Xinyu
    Gao, Liang
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2021, 68 (12) : 12890 - 12900
  • [30] Fed-ensemble: Ensemble Models in Federated Learning for Improved Generalization and Uncertainty Quantification
    Shi, Naichen
    Lai, Fan
    Al Kontar, Raed
    Chowdhury, Mosharaf
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2024, 21 (03) : 2792 - 2803