SALR: Sharpness-Aware Learning Rate Scheduler for Improved Generalization

Cited by: 0
Authors
Yue, Xubo [1]
Nouiehed, Maher [2]
Al Kontar, Raed [1]
Affiliations
[1] Univ Michigan, Dept Ind & Operat Engn, Ann Arbor, MI 48109 USA
[2] Amer Univ Beirut, Dept Ind Engn & Management, Beirut 1072020, Lebanon
Funding
U.S. National Science Foundation
Keywords
Schedules; Deep learning; Neural networks; Convergence; Bayes methods; Training; Stochastic processes; generalization; learning rate schedule; sharpness
DOI
10.1109/TNNLS.2023.3263393
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In an effort to improve generalization in deep learning and automate the process of learning rate scheduling, we propose SALR: a sharpness-aware learning rate update technique designed to recover flat minimizers. Our method dynamically updates the learning rate of gradient-based optimizers based on the local sharpness of the loss function. This allows optimizers to automatically increase learning rates at sharp valleys to increase the chance of escaping them. We demonstrate the effectiveness of SALR when adopted by various algorithms over a broad range of networks. Our experiments indicate that SALR improves generalization, converges faster, and drives solutions to significantly flatter regions.
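The abstract's core idea, raising the learning rate where the loss is locally sharp, can be illustrated with a minimal sketch. This is not the authors' algorithm: the finite-difference curvature proxy, the median normalization, and the names `sharpness_proxy` and `sharpness_scaled_lr` are all assumptions made for illustration on a one-dimensional toy loss.

```python
import statistics


def sharpness_proxy(grad, x, eps=1e-3):
    """Hypothetical sharpness measure: finite-difference curvature estimate."""
    return abs(grad(x + eps) - grad(x - eps)) / (2 * eps)


def sharpness_scaled_lr(base_lr, current, history):
    """Scale base_lr by current sharpness relative to the median seen so far.

    Assumed rule for illustration only; the record above does not give
    the paper's actual update formula.
    """
    med = statistics.median(history)
    return base_lr if med == 0 else base_lr * current / med


def sharp_grad(x):
    return 10.0 * x  # gradient of the toy loss 5 * x**2 (high curvature)


def flat_grad(x):
    return 0.1 * x  # gradient of the toy loss 0.05 * x**2 (low curvature)


# Sharpness values observed earlier in a hypothetical training run.
history = [
    sharpness_proxy(flat_grad, 1.0),
    sharpness_proxy(flat_grad, 0.5),
    sharpness_proxy(sharp_grad, 1.0),
]

base_lr = 0.01
lr_flat = sharpness_scaled_lr(base_lr, sharpness_proxy(flat_grad, 0.2), history)
lr_sharp = sharpness_scaled_lr(base_lr, sharpness_proxy(sharp_grad, 0.2), history)
print(f"flat region lr: {lr_flat:.4f}, sharp region lr: {lr_sharp:.4f}")
```

Under these assumptions the step size stays near `base_lr` in the flat region and grows by roughly two orders of magnitude in the sharp one, which is the qualitative behavior the abstract describes for escaping sharp valleys.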
Pages: 12518-12527
Page count: 10
Related papers
50 in total
  • [31] Actor-Critic Learning Based QoS-Aware Scheduler for Reconfigurable Wireless Networks
    Mollahasani, Shahram
    Erol-Kantarci, Melike
    Hirab, Mahdi
    Dehghan, Hoda
    Wilson, Rodney
    IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2022, 9 (01): 45-54
  • [32] Style Augmentation and Domain-Aware Parametric Contrastive Learning for Domain Generalization
    Li, Mingkang
    Zhang, Jiali
    Zhang, Wen
    Gong, Lu
    Zhang, Zili
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT IV, KSEM 2023, 2023, 14120: 211-224
  • [33] GAN Supervised Seismic Data Reconstruction: An Enhanced Learning for Improved Generalization
    Goyes-Penafiel, Paul
    Suarez-Rodriguez, Leon
    Correa, Claudia V.
    Arguello, Henry
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [34] Self-Organized Mutual Information Maximization Learning for Improved Generalization Performance
    Kamimura, Ryotaro
    2015 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2015): BIG DATA ANALYTICS FOR HUMAN-CENTRIC SYSTEMS, 2015: 1613-1618
  • [35] A Reinforcement Learning Approach for Scheduling Problems with Improved Generalization through Order Swapping
    Vivekanandan, Deepak
    Wirth, Samuel
    Karlbauer, Patrick
    Klarmann, Noah
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2023, 5 (02): 418-430
  • [36] The Improved Training Algorithm of Deep Learning with Self-Adaptive Learning Rate
    Ongart, Sutit
    Jearanaitanakij, Kietikul
    Sangthong, Jirapat
    2018 18TH INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES (ISCIT), 2018: 463-466
  • [37] Rapid Reconstruction of Acoustic Holograms Using Improved IU-Net With Adaptive Learning Rate Scheduling
    Cui, Mingzhe
    Li, Yang
    Wang, Xuewei
    Wang, Jia
    Rong, Haoxuan
    IEEE ACCESS, 2024, 12: 178199-178208
  • [38] Fast Specific Absorption Rate Aware Beamforming for Downlink SWIPT via Deep Learning
    Zhang, Juping
    Zheng, Gan
    Krikidis, Ioannis
    Zhang, Rui
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2020, 69 (12): 16178-16182
  • [39] A novel framework using 3D-CNN and BiLSTM model with dynamic learning rate scheduler for visual speech recognition
    Chandrabanshi, Vishnu
    Domnic, S.
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (6-7): 5433-5448
  • [40] Texture aware autoencoder pre-training and pairwise learning refinement for improved iris recognition
    Manashi Chakraborty
    Aritri Chakraborty
    Prabir Kumar Biswas
    Pabitra Mitra
    Multimedia Tools and Applications, 2023, 82: 25381-25401