Dynamic batch size tuning based on stopping criterion for neural network training

Cited by: 17
Authors
Takase, Tomoumi [1 ]
Affiliation
[1] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
Keywords
Neural network; Hyper-parameter tuning; Deep learning; Early stopping; Batch size; Optimization
DOI
10.1016/j.neucom.2020.11.054
CLC Classification Code
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
In neural network training, the choice of minibatch size affects not only the computational cost but also the training performance, because it determines the loss function optimized at each step. An approach that increases the batch size during training according to a linear or step function is generally known to improve the generalization performance of a network; however, it requires the schedule for increasing the batch size to be specified in advance. In this study, we propose a more flexible method that temporarily switches to a small batch size to destabilize the loss function whenever the change in the training loss satisfies a predefined stopping criterion. Repeating this destabilization step allows the parameters to escape local minima and converge to a robust minimum, thereby improving generalization performance. We experimentally demonstrate the superiority of the proposed method on several benchmark datasets and neural network models. (C) 2020 Elsevier B.V. All rights reserved.
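A minimal sketch of the idea described in the abstract, assuming a simple threshold on the change in training loss as the stopping criterion; the toy model, batch sizes, threshold, and length of the small-batch phase are illustrative assumptions rather than the paper's exact algorithm or settings:

    # Sketch: temporarily shrink the batch size when training loss stops improving,
    # then restore the normal batch size after a fixed number of steps.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy linear-regression data and model (stand-in for a neural network).
    X = rng.normal(size=(1000, 10))
    true_w = rng.normal(size=10)
    y = X @ true_w + 0.1 * rng.normal(size=1000)
    w = np.zeros(10)

    def sgd_step(w, batch_size, lr=0.05):
        # One SGD step on a randomly sampled minibatch of the given size.
        idx = rng.choice(len(X), size=batch_size, replace=False)
        xb, yb = X[idx], y[idx]
        grad = 2.0 * xb.T @ (xb @ w - yb) / batch_size
        w = w - lr * grad
        loss = float(np.mean((X @ w - y) ** 2))  # full training loss, for monitoring
        return w, loss

    normal_batch, small_batch = 128, 8   # assumed batch sizes
    criterion_eps = 1e-4                 # assumed stopping-criterion threshold
    destabilize_steps = 20               # assumed length of the small-batch phase

    batch_size = normal_batch
    prev_loss, remaining = None, 0
    for step in range(2000):
        w, loss = sgd_step(w, batch_size)
        if remaining > 0:                # currently in the destabilization phase
            remaining -= 1
            if remaining == 0:
                batch_size = normal_batch    # restore the normal batch size
        elif prev_loss is not None and abs(prev_loss - loss) < criterion_eps:
            # Training loss has (nearly) stopped improving: temporarily switch to a
            # small batch to destabilize the loss and escape the current minimum.
            batch_size = small_batch
            remaining = destabilize_steps
        prev_loss = loss

    print(f"final training loss: {loss:.6f}")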
Pages: 1 - 11
Number of pages: 11
Related Papers
50 records in total
  • [1] Dynamic batch size tuning based on stopping criterion for neural network training
    Takase, Tomoumi
    Neurocomputing, 2021, 429 : 1 - 11
  • [2] Surpassing early stopping: A novel correlation-based stopping criterion for neural networks
    Miseta, Tamas
    Fodor, Attila
    Vathy-Fogarassy, Agnes
    NEUROCOMPUTING, 2024, 567
  • [3] A self tuning controller for multicomponent batch distillation with soft sensor inference based on a neural network
    Fileti, AMF
    Pedrosa, LS
    Pereira, JAFR
    COMPUTERS & CHEMICAL ENGINEERING, 1999, 23 : S261 - S264
  • [4] Fast and accurate variable batch size convolution neural network training on large scale distributed systems
    Hu, Zhongzhe
    Xiao, Junmin
    Sun, Ninghui
    Tan, Guangming
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2022, 34 (21)
  • [5] An Attitude Control Method of Aircraft Based on Neural Network Batch Training Strategy
    Li, Yitong
    Wang, Xiaodong
    Zhang, Huiping
    Liu, Xiaodong
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 1788 - 1793
  • [6] Neural network-based optimal control of a batch crystallizer
    Paengjuntuek, Woranee
    Thanasinthana, Linda
    Arpornwichanop, Amornchai
    NEUROCOMPUTING, 2012, 83 : 158 - 164
  • [7] Evaluation of the logarithmic-sensitivity index as a neural network stopping criterion for rare outcomes
    Ennett, CM
    Frize, M
    Scales, N
    ITAB 2003: 4TH INTERNATIONAL IEEE EMBS SPECIAL TOPIC CONFERENCE ON INFORMATION TECHNOLOGY APPLICATIONS IN BIOMEDICINE, CONFERENCE PROCEEDINGS: NEW SOLUTIONS FOR NEW CHALLENGES, 2003, : 207 - 210
  • [8] Bayesian optimization-derived batch size and learning rate scheduling in deep neural network training for head and neck tumor segmentation
    Douglas, Zachariah
    Wang, Haifeng
    2022 IEEE 10TH INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2022), 2022, : 48 - 54
  • [9] Neural Network Training with Safe Regularization in the Null Space of Batch Activations
    Kissel, Matthias
    Gottwald, Martin
    Diepold, Klaus
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT II, 2020, 12397 : 217 - 228
  • [10] Convolutional Neural Network Training with Dynamic Epoch Ordering
    Plana Rius, Ferran
    Angulo Bahon, Cecilio
    Casas, Marc
    Mirats Tur, Josep Maria
    ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT, 2019, 319 : 105 - 114