SALR: Sharpness-Aware Learning Rate Scheduler for Improved Generalization

Cited by: 0
Authors
Yue, Xubo [1 ]
Nouiehed, Maher [2 ]
Al Kontar, Raed [1 ]
Affiliations
[1] Univ Michigan, Dept Ind & Operat Engn, Ann Arbor, MI 48109 USA
[2] Amer Univ Beirut, Dept Ind Engn & Management, Beirut 1072020, Lebanon
Funding
National Science Foundation (NSF);
Keywords
Schedules; Deep learning; Neural networks; Convergence; Bayes methods; Training; Stochastic processes; generalization; learning rate schedule; sharpness;
DOI
10.1109/TNNLS.2023.3263393
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In an effort to improve generalization in deep learning and automate the process of learning rate scheduling, we propose SALR: a sharpness-aware learning rate update technique designed to recover flat minimizers. Our method dynamically updates the learning rate of gradient-based optimizers based on the local sharpness of the loss function. This allows optimizers to automatically increase learning rates at sharp valleys to increase the chance of escaping them. We demonstrate the effectiveness of SALR when adopted by various algorithms over a broad range of networks. Our experiments indicate that SALR improves generalization, converges faster, and drives solutions to significantly flatter regions.
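The abstract's core idea, scaling the learning rate with local sharpness so that sharp valleys are escaped more easily, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy loss, the random-probing sharpness proxy, and the running-average normalization are all illustrative assumptions, standing in for the paper's actual sharpness measure and update rule.

```python
import numpy as np

def loss(w):
    # Toy 1-D non-convex loss (illustration only, not from the paper).
    return np.sin(3.0 * w) * np.exp(-0.1 * w**2) + 0.05 * w**2

def grad(w, eps=1e-5):
    # Central-difference numerical gradient of the toy loss.
    return (loss(w + eps) - loss(w - eps)) / (2 * eps)

def sharpness(w, rho=0.05, n_probe=8, rng=None):
    # Local sharpness proxy: worst-case loss increase within a radius-rho
    # ball, estimated by random probing (a crude stand-in for ascent-based
    # sharpness estimates).
    rng = rng or np.random.default_rng(0)
    perturbs = rng.uniform(-rho, rho, n_probe)
    return max(0.0, max(loss(w + p) for p in perturbs) - loss(w))

def sharpness_scaled_sgd(w0, base_lr=0.1, steps=200, seed=0):
    rng = np.random.default_rng(seed)
    w = w0
    avg_sharp = sharpness(w, rng=rng) + 1e-12  # running sharpness average
    for _ in range(steps):
        s = sharpness(w, rng=rng)
        avg_sharp = 0.9 * avg_sharp + 0.1 * s
        # Raise the learning rate where the loss is locally sharp and lower
        # it where the loss is flat, relative to the running average.
        lr = base_lr * (s + 1e-12) / (avg_sharp + 1e-12)
        w = w - lr * grad(w)
    return w

w_final = sharpness_scaled_sgd(w0=2.0)
```

The key design point mirrored here is the relative scaling: the step size is large only when the current sharpness exceeds its recent history, which is what lets the optimizer jump out of sharp valleys while settling calmly into flat ones.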
Pages: 12518-12527
Page count: 10
Related Papers
50 total
  • [21] LETFORMER: LIGHTWEIGHT TRANSFORMER PRE-TRAINING WITH SHARPNESS-AWARE OPTIMIZATION FOR EFFICIENT ENCRYPTED TRAFFIC ANALYSIS
    Meng, Zhiyan
    Liu, Dan
    Meng, Jintao
    International Journal of Innovative Computing, Information and Control, 2025, 21 (02): 359 - 371
  • [22] Multi-Dataset Co-Training with Sharpness-Aware Optimization for Audio Anti-spoofing
    Shim, Hye-jin
    Jung, Jee-weon
    Kinnunen, Tomi
    INTERSPEECH 2023, 2023, : 3804 - 3808
  • [23] VI-PANN: Harnessing Transfer Learning and Uncertainty-Aware Variational Inference for Improved Generalization in Audio Pattern Recognition
    Fischer, John
    Orescanin, Marko
    Eckstrand, Eric
    IEEE ACCESS, 2024, 12 : 33347 - 33360
  • [24] Convolutional Neural Network With Automatic Learning Rate Scheduler for Fault Classification
    Wen, Long
    Gao, Liang
    Li, Xinyu
    Zeng, Bing
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2022, 71 : 13 - 13
  • [25] SCHEDTUNE: A Heterogeneity-Aware GPU Scheduler for Deep Learning
    Albahar, Hadeel
    Dongare, Shruti
    Du, Yanlin
    Zhao, Nannan
    Paul, Arnab K.
    Butt, Ali R.
    2022 22ND IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2022), 2022, : 695 - 705
  • [26] Incremental PID Controller-Based Learning Rate Scheduler for Stochastic Gradient Descent
    Wang, Zenghui
    Zhang, Jun
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (05) : 7060 - 7071
  • [27] An Energy and Temperature Aware Deep Reinforcement Learning Workflow Scheduler in Cloud Computing
    Mangalampalli, S. Sudheer
    Karri, Ganesh Reddy
    Ch, Pradeep Reddy
    Pokkuluri, Kiran Sree
    Chakrabarti, Prasun
    Chakrabarti, Tulika
    IEEE ACCESS, 2024, 12 : 163424 - 163443
  • [28] Feature Stylization and Domain-aware Contrastive Learning for Domain Generalization
    Jeon, Seogkyu
    Hong, Kibeom
    Lee, Pilhyeon
    Lee, Jewook
    Byun, Hyeran
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 22 - 31
  • [29] A New Reinforcement Learning Based Learning Rate Scheduler for Convolutional Neural Network in Fault Classification
    Wen, Long
    Li, Xinyu
    Gao, Liang
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2021, 68 (12) : 12890 - 12900
  • [30] Fed-ensemble: Ensemble Models in Federated Learning for Improved Generalization and Uncertainty Quantification
    Shi, Naichen
    Lai, Fan
    Al Kontar, Raed
    Chowdhury, Mosharaf
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2024, 21 (03) : 2792 - 2803