Dynamic batch size tuning based on stopping criterion for neural network training

Cited by: 17
Authors
Takase, Tomoumi [1 ]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
Keywords
Neural network; Hyper-parameter tuning; Deep learning; Early stopping; Batch size; Optimization;
DOI
10.1016/j.neucom.2020.11.054
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In neural network training, the choice of minibatch size affects not only the computational cost but also the training performance, because the minibatch defines the loss function evaluated at each update step. An approach that increases the batch size during training according to a linear or step function is generally known to improve the generalization performance of a network. However, it requires the batch-size schedule to be specified in advance. In this study, we propose a more flexible method that temporarily switches to a small batch size to destabilize the loss function whenever the change in training loss satisfies a predefined stopping criterion. Repeating this destabilization step allows the parameters to avoid being trapped in local minima and to converge to a robust minimum, thereby improving generalization performance. We experimentally demonstrate the superiority of the proposed method on several benchmark datasets and neural network models. (C) 2020 Elsevier B.V. All rights reserved.
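To make the idea in the abstract concrete, the following is a minimal, self-contained Python sketch of a batch-size controller in the spirit of the method described above. It is not the authors' implementation; the class name and all defaults (BatchSizeController, nominal_batch, small_batch, patience, min_delta, destabilize_epochs) are illustrative assumptions. When the improvement in training loss over a patience window falls below a threshold (an early-stopping-style criterion), the controller temporarily switches to a small batch size for a few epochs to destabilize the loss, then restores the nominal size.

# Illustrative sketch only, not the paper's code; all names and defaults are assumptions.
from collections import deque


class BatchSizeController:
    """Chooses the minibatch size for the next training epoch.

    When the improvement in training loss over the last `patience` epochs
    falls below `min_delta` (an early-stopping-style stopping criterion),
    the controller temporarily switches to a small "destabilizing" batch
    size for `destabilize_epochs` epochs, then returns to the nominal size.
    """

    def __init__(self, nominal_batch=256, small_batch=16,
                 patience=5, min_delta=1e-3, destabilize_epochs=3):
        self.nominal_batch = nominal_batch
        self.small_batch = small_batch
        self.patience = patience
        self.min_delta = min_delta
        self.destabilize_epochs = destabilize_epochs
        self._recent_losses = deque(maxlen=patience + 1)
        self._destabilize_left = 0

    def next_batch_size(self, train_loss):
        """Record this epoch's training loss and return the batch size to use next."""
        self._recent_losses.append(train_loss)

        if self._destabilize_left > 0:           # still inside a destabilization phase
            self._destabilize_left -= 1
            return self.small_batch

        if len(self._recent_losses) > self.patience:
            improvement = self._recent_losses[0] - self._recent_losses[-1]
            if improvement < self.min_delta:     # stopping criterion met: loss has stagnated
                # Use the small batch for `destabilize_epochs` epochs in total
                # (this one plus `destabilize_epochs - 1` more).
                self._destabilize_left = self.destabilize_epochs - 1
                self._recent_losses.clear()      # restart monitoring after the perturbation
                return self.small_batch

        return self.nominal_batch


if __name__ == "__main__":
    # Toy usage: feed a stagnating loss curve and watch the controller react.
    controller = BatchSizeController()
    fake_losses = [1.0, 0.8, 0.70, 0.65, 0.64, 0.64, 0.64, 0.64, 0.64, 0.64, 0.64, 0.64]
    for epoch, loss in enumerate(fake_losses):
        print(f"epoch {epoch}: loss={loss:.4f} -> next batch size "
              f"{controller.next_batch_size(loss)}")

In practice the returned value would be used to rebuild the data loader for the following epoch, and the window size, threshold, and destabilization length would be tuned per task.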
Pages: 1 - 11
Page count: 11
Related papers
50 entries in total
  • [31] Expert judgement-based tuning of the system reliability neural network. Brandowski, A.; Hoang Nguyen; Frackowiak, Wojciech. POLISH MARITIME RESEARCH, 2014, 21(01): 28-34.
  • [32] Tuning the structure and parameters of a neural network by a new network model based on genetic algorithms. Li, Xiangmei. International Journal of Digital Content Technology and its Applications, 2012, 6(11): 29-36.
  • [33] Event-based Neural Network for ECG Classification with Delta Encoding and Early Stopping. Jobst, Matthias; Liu, Chen; Partzsch, Johannes; Yan, Yexin; Kappel, David; Gonzalez, Hector A.; Ji, Yue; Vogginger, Bernhard; Mayr, Christian. 2020 6TH INTERNATIONAL CONFERENCE ON EVENT-BASED CONTROL, COMMUNICATION, AND SIGNAL PROCESSING (EBCCSP), 2020.
  • [34] AntiDoteX: Attention-Based Dynamic Optimization for Neural Network Runtime Efficiency. Yu, Fuxun; Xu, Zirui; Liu, Chenchen; Stamoulis, Dimitrios; Wang, Di; Wang, Yanzhi; Chen, Xiang. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41(11): 4694-4707.
  • [35] Hyperparameters optimization for neural network training using Fractal Decomposition-based Algorithm. Souquet, Leo; Shvai, Nadiya; Llanza, Arcadi; Nakib, Amir. 2020 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2020.
  • [36] Effective neural network training with adaptive learning rate based on training loss. Takase, Tomoumi; Oyama, Satoshi; Kurihara, Masahito. NEURAL NETWORKS, 2018, 101: 68-78.
  • [37] Dynamic digital watermark technique based on neural network. Gu Tao; Li Xu. INDEPENDENT COMPONENT ANALYSES, WAVELETS, UNSUPERVISED NANO-BIOMIMETIC SENSORS, AND NEURAL NETWORKS VI, 2008, 6979.
  • [38] Dynamic System Modeling Based on Recurrent Neural Network. Cao, Wenjie; Zhang, Cheng; Xiong, Zhenzhen; Wang, Ting; Chen, Junchao; Zhang, Bengong. PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021), 2021: 37-41.
  • [39] Blind Estimation of Spreading Sequence Based on Neural Network by A Novel Information Criterion. Lin, Guo; Ming, Lv; Bin, Tang. ICCEE 2008: PROCEEDINGS OF THE 2008 INTERNATIONAL CONFERENCE ON COMPUTER AND ELECTRICAL ENGINEERING, 2008: 78-81.
  • [40] Differential Evolution-based Neural Network Training Incorporating a Centroid-based Strategy and Dynamic Opposition-based Learning. Mousavirad, Seyed Jalaleddin; Oliva, Diego; Hinojosa, Salvador; Schaefer, Gerald. 2021 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC 2021), 2021: 1233-1240.