Dynamic batch size tuning based on stopping criterion for neural network training

Cited by: 17
Authors
Takase, Tomoumi [1 ]
Affiliation
[1] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
Keywords
Neural network; Hyper-parameter tuning; Deep learning; Early stopping; Batch size; Optimization
DOI
10.1016/j.neucom.2020.11.054
CLC Classification Code
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
In neural network training, the choice of minibatch size affects not only the computational cost but also the training performance, because it determines the loss function optimized at each step. An approach that increases the batch size during training according to a linear or step function is generally known to improve the generalization performance of a network; however, it requires the schedule for increasing the batch size to be specified in advance. In this study, we propose a more flexible method that temporarily switches to a small batch size to destabilize the loss function whenever the change in the training loss satisfies a predefined stopping criterion. Repeating this destabilization step allows the parameters to escape local minima and converge to a robust minimum, thereby improving generalization performance. We experimentally demonstrate the superiority of the proposed method on several benchmark datasets and neural network models. (C) 2020 Elsevier B.V. All rights reserved.
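A minimal sketch of the idea described in the abstract, assuming a simple threshold on the change in training loss as the stopping criterion; the toy model, batch sizes, threshold, and length of the small-batch phase are illustrative assumptions rather than the paper's exact algorithm or settings:

    # Sketch: temporarily shrink the batch size when training loss stops improving,
    # then restore the normal batch size after a fixed number of steps.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy linear-regression data and model (stand-in for a neural network).
    X = rng.normal(size=(1000, 10))
    true_w = rng.normal(size=10)
    y = X @ true_w + 0.1 * rng.normal(size=1000)
    w = np.zeros(10)

    def sgd_step(w, batch_size, lr=0.05):
        # One SGD step on a randomly sampled minibatch of the given size.
        idx = rng.choice(len(X), size=batch_size, replace=False)
        xb, yb = X[idx], y[idx]
        grad = 2.0 * xb.T @ (xb @ w - yb) / batch_size
        w = w - lr * grad
        loss = float(np.mean((X @ w - y) ** 2))  # full training loss, for monitoring
        return w, loss

    normal_batch, small_batch = 128, 8   # assumed batch sizes
    criterion_eps = 1e-4                 # assumed stopping-criterion threshold
    destabilize_steps = 20               # assumed length of the small-batch phase

    batch_size = normal_batch
    prev_loss, remaining = None, 0
    for step in range(2000):
        w, loss = sgd_step(w, batch_size)
        if remaining > 0:                # currently in the destabilization phase
            remaining -= 1
            if remaining == 0:
                batch_size = normal_batch    # restore the normal batch size
        elif prev_loss is not None and abs(prev_loss - loss) < criterion_eps:
            # Training loss has (nearly) stopped improving: temporarily switch to a
            # small batch to destabilize the loss and escape the current minimum.
            batch_size = small_batch
            remaining = destabilize_steps
        prev_loss = loss

    print(f"final training loss: {loss:.6f}")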
Pages: 1 - 11
Number of pages: 11
Related Papers
50 records in total
  • [1] Dynamic batch size tuning based on stopping criterion for neural network training
    Takase, Tomoumi
    Neurocomputing, 2021, 429 : 1 - 11
  • [2] Surpassing early stopping: A novel correlation-based stopping criterion for neural networks
    Miseta, Tamas
    Fodor, Attila
    Vathy-Fogarassy, Agnes
    NEUROCOMPUTING, 2024, 567
  • [3] A self tuning controller for multicomponent batch distillation with soft sensor inference based on a neural network
    Fileti, AMF
    Pedrosa, LS
    Pereira, JAFR
    COMPUTERS & CHEMICAL ENGINEERING, 1999, 23 : S261 - S264
  • [4] Fast and accurate variable batch size convolution neural network training on large scale distributed systems
    Hu, Zhongzhe
    Xiao, Junmin
    Sun, Ninghui
    Tan, Guangming
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2022, 34 (21)
  • [5] An Attitude Control Method of Aircraft Based on Neural Network Batch Training Strategy
    Li, Yitong
    Wang, Xiaodong
    Zhang, Huiping
    Liu, Xiaodong
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 1788 - 1793
  • [6] Neural network-based optimal control of a batch crystallizer
    Paengjuntuek, Woranee
    Thanasinthana, Linda
    Arpornwichanop, Amornchai
    NEUROCOMPUTING, 2012, 83 : 158 - 164
  • [7] Evaluation of the logarithmic-sensitivity index as a neural network stopping criterion for rare outcomes
    Ennett, CM
    Frize, M
    Scales, N
    ITAB 2003: 4TH INTERNATIONAL IEEE EMBS SPECIAL TOPIC CONFERENCE ON INFORMATION TECHNOLOGY APPLICATIONS IN BIOMEDICINE, CONFERENCE PROCEEDINGS: NEW SOLUTIONS FOR NEW CHALLENGES, 2003, : 207 - 210
  • [8] Bayesian optimization-derived batch size and learning rate scheduling in deep neural network training for head and neck tumor segmentation
    Douglas, Zachariah
    Wang, Haifeng
    2022 IEEE 10TH INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2022), 2022, : 48 - 54
  • [9] Neural Network Training with Safe Regularization in the Null Space of Batch Activations
    Kissel, Matthias
    Gottwald, Martin
    Diepold, Klaus
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT II, 2020, 12397 : 217 - 228
  • [10] Convolutional Neural Network Training with Dynamic Epoch Ordering
    Plana Rius, Ferran
    Angulo Bahon, Cecilio
    Casas, Marc
    Mirats Tur, Josep Maria
    ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT, 2019, 319 : 105 - 114