Dynamic batch size tuning based on stopping criterion for neural network training

Cited by: 17
Authors:
Takase, Tomoumi [1]
Affiliation:
[1] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
Keywords:
Neural network; Hyper-parameter tuning; Deep learning; Early stopping; Batch size; Optimization;
DOI:
10.1016/j.neucom.2020.11.054
CLC number:
TP18 [Artificial Intelligence Theory];
Discipline codes:
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In neural network training, the choice of minibatch size affects not only the computational cost but also the training performance, because it determines how the loss function is estimated. An approach that increases the batch size during training according to a linear or step schedule is generally known to improve the generalization performance of a network. However, it requires the schedule for increasing the batch size to be specified beforehand. In this study, we propose a more flexible method that temporarily switches to a small batch size to destabilize the loss function whenever the change in training loss satisfies a predefined stopping criterion. Repeating this destabilization step allows the parameters to escape local minima and converge to a robust minimum, thereby improving generalization performance. We experimentally demonstrate the superiority of the proposed method on several benchmark datasets and neural network models. (C) 2020 Elsevier B.V. All rights reserved.
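The abstract only outlines the idea; the paper's exact stopping criterion and batch-size values are not reproduced here. Below is a minimal sketch, assuming a PyTorch-style training loop, of how a stall in the training loss (an early-stopping-like criterion) could trigger a temporary switch to a small batch size before the original batch size is restored. The model, data, threshold, patience, and batch sizes are illustrative placeholders, not the paper's settings.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy data and model; placeholders standing in for a real benchmark setup.
X, y = torch.randn(2048, 20), torch.randint(0, 2, (2048,))
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

BASE_BATCH, SMALL_BATCH = 256, 16     # illustrative sizes, not the paper's values
THRESHOLD, DESTAB_EPOCHS = 1e-3, 2    # assumed stopping criterion and destabilization length

def run_epoch(batch_size):
    """Train for one epoch with the given batch size and return the mean training loss."""
    loader = DataLoader(TensorDataset(X, y), batch_size=batch_size, shuffle=True)
    total, count = 0.0, 0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
        total += loss.item() * xb.size(0)
        count += xb.size(0)
    return total / count

prev_loss, destab_left = float("inf"), 0
for epoch in range(30):
    # Use the small batch size while a destabilization phase is active.
    batch_size = SMALL_BATCH if destab_left > 0 else BASE_BATCH
    epoch_loss = run_epoch(batch_size)
    destab_left = max(destab_left - 1, 0)
    # Stopping-criterion check: if the loss has nearly stopped improving,
    # start a temporary small-batch phase to destabilize the loss function.
    if destab_left == 0 and prev_loss - epoch_loss < THRESHOLD:
        destab_left = DESTAB_EPOCHS
    prev_loss = epoch_loss
    print(f"epoch {epoch:2d}  batch {batch_size:3d}  loss {epoch_loss:.4f}")
```

The point of the sketch is the control flow: the batch size changes in response to observed loss behavior rather than following a schedule fixed before training, which is the flexibility the abstract contrasts with linear or step batch-size schedules.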
Pages: 1 - 11
Page count: 11
Related Papers
50 in total
  • [21] Auto-tuning Spark Configurations Based on Neural Network
    Gu, Jing
    Li, Ying
    Tang, Hongyan
    Wu, Zhonghai
    2018 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2018,
  • [22] Effect of Hidden Neuron Size on Different Training Algorithm in Neural Network
    Kumar, Arvind
    Sodhi, Sartaj Singh
COMMUNICATIONS IN MATHEMATICS AND APPLICATIONS, 2020, 13 (01): 351 - 365
  • [23] Comparison of Early Stopping Criteria for Neural-Network-Based Subpixel Classification
    Shao, Yang
    Taff, Gregory N.
    Walsh, Stephen J.
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2011, 8 (01) : 113 - 117
  • [24] Tuning parameters of deep neural network training algorithms pays off: a computational study
    Coppola, Corrado
    Papa, Lorenzo
    Boresta, Marco
    Amerini, Irene
    Palagi, Laura
    TOP, 2024, 32 (03) : 579 - 620
  • [25] Neural network based estimation of a semi-batch polymerisation reactor
    Yang, SH
    Chung, PWH
    Brooks, BW
    COMPUTERS & CHEMICAL ENGINEERING, 1999, 23 : S443 - S446
  • [26] Optimize neural network based in-loop filters through iterative training
    Wang, Liqiang
    Xu, Xiaozhong
    Liu, Shan
    2022 PICTURE CODING SYMPOSIUM (PCS), 2022, : 367 - 371
  • [27] Enhancing Distributed Neural Network Training Through Node-Based Communications
    Moreno-Alvarez, Sergio
    Paoletti, Mercedes E.
    Cavallaro, Gabriele
    Haut, Juan M.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 35 (12) : 17893 - 17907
  • [28] FPGA-based Acceleration of Neural Network Training
    Sang, Ruoyu
    Liu, Qiang
    Zhang, Qijun
    2016 IEEE MTT-S INTERNATIONAL CONFERENCE ON NUMERICAL ELECTROMAGNETIC AND MULTIPHYSICS MODELING AND OPTIMIZATION (NEMO), 2016,
  • [29] An Adaptive Memory Multi-Batch L-BFGS Algorithm for Neural Network Training
    Zocco, Federico
    McLoone, Sean
IFAC PAPERSONLINE, 2020, 53 (02): 8199 - 8204
  • [30] Agent based Adaptive Firefly Back-propagation Neural Network Training Method for Dynamic Systems
    Nandy, Sudarshan
    Karmakar, Manoj
    Sarkar, Partha Pratim
    Das, Achintya
    Abraham, Ajith
    Paul, Diptarup
    2012 12TH INTERNATIONAL CONFERENCE ON HYBRID INTELLIGENT SYSTEMS (HIS), 2012, : 449 - 454