Dynamic batch size tuning based on stopping criterion for neural network training

Cited by: 17
Author
Takase, Tomoumi [1 ]
Affiliation
[1] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
Keywords
Neural network; Hyper-parameter tuning; Deep learning; Early stopping; Batch size; Optimization;
DOI
10.1016/j.neucom.2020.11.054
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In neural network training, the choice of minibatch size affects not only the computational cost but also the training performance, since it shapes the loss function being optimized. An approach that increases the batch size during training according to a linear or step function is generally known to improve the generalization performance of a network. However, it requires the batch-size schedule to be specified in advance. In this study, we propose a more flexible method that temporarily switches to a small batch size to destabilize the loss function whenever the change in training loss satisfies a predefined stopping criterion. Repeating this destabilization step allows the parameters to escape local minima and converge to a robust minimum, thereby improving generalization performance. We experimentally demonstrate the superiority of the proposed method on several benchmark datasets and neural network models. (C) 2020 Elsevier B.V. All rights reserved.
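The abstract's idea can be sketched as a small scheduler that watches the training loss and, when a plateau-style stopping criterion fires, temporarily returns a small batch size before restoring the base one. This is a minimal illustrative sketch, not the paper's implementation: the class name, the relative-improvement criterion, and all thresholds below are assumptions, since the abstract does not specify the exact criterion.

```python
class DynamicBatchSizeScheduler:
    """Illustrative sketch: when training loss stops improving over a
    window of epochs, temporarily shrink the batch size to destabilize
    the loss, then restore the base batch size. All names and default
    values here are assumptions, not the paper's exact method."""

    def __init__(self, base_batch_size=256, small_batch_size=16,
                 patience=3, min_rel_improvement=1e-3, destabilize_steps=2):
        self.base = base_batch_size
        self.small = small_batch_size
        self.patience = patience                  # epochs in the watch window
        self.min_rel_improvement = min_rel_improvement
        self.destabilize_steps = destabilize_steps
        self.history = []                         # recent training losses
        self.remaining_small_steps = 0

    def step(self, train_loss):
        """Record this epoch's training loss; return the batch size to use next."""
        if self.remaining_small_steps > 0:        # still destabilizing
            self.remaining_small_steps -= 1
            return self.small
        self.history.append(train_loss)
        if len(self.history) > self.patience:
            self.history.pop(0)
            oldest, newest = self.history[0], self.history[-1]
            # Stopping criterion: relative improvement over the window is tiny.
            if (oldest - newest) / max(abs(oldest), 1e-12) < self.min_rel_improvement:
                self.history.clear()              # reset after triggering
                self.remaining_small_steps = self.destabilize_steps - 1
                return self.small
        return self.base
```

The returned value would be passed to the data loader each epoch; while the loss keeps improving the scheduler keeps the base batch size, and each time the criterion fires it emits the small batch size for `destabilize_steps` epochs.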
Pages: 1 - 11
Page count: 11
Related papers
50 records
  • [41] Improving and appraising on training algorithm of neural network in soft measurement of dynamic flow
    Wang, YQ
    Tang, Y
    Jiang, WL
    Wang, HY
    Proceedings of the Sixth International Conference on Fluid Power Transmission and Control, 2005, : 32 - 36
  • [42] Efficient Mini-Batch Training on Memristor Neural Network Integrating Gradient Calculation and Weight Update
    Yamamori, Satoshi
    Hiromoto, Masayuki
    Sato, Takashi
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2018, E101A (07) : 1092 - 1100
  • [43] Developing a Robust Prediction Interval Based Criterion for Neural Network Model Selection
    Khosravi, Abbas
    Nahavandi, Saeid
    Creighton, Doug
    NEURAL INFORMATION PROCESSING: MODELS AND APPLICATIONS, PT II, 2010, 6444 : 727 - 734
  • [44] A Convolutional Neural Network with Hyperparameter Tuning for Packet Payload-Based Network Intrusion Detection
    Boulaiche, Ammar
    Haddad, Sofiane
    Lemouari, Ali
    SYMMETRY-BASEL, 2024, 16 (09):
  • [45] Edge FPGA-based Onsite Neural Network Training
    Chen, Ruiqi
    Zhang, Haoyang
    Li, Yu
    Zhang, Runzhou
    Li, Guoyu
    Yu, Jun
    Wang, Kun
    2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS, 2023,
  • [46] TRAINING A NEURAL NETWORK FOR MOMENT BASED IMAGE EDGE DETECTION
    Wang Hong-yu
    Li Hong-dong
    Ye Xiu-qing
    Gu Wei-kang
    Journal of Zhejiang University Science, 2000, (04) : 41 - 44
  • [47] Evolutionary Based Weight Decaying Method for Neural Network Training
    Tsoulos, Ioannis G.
    Tzallas, Alexandros
    Tsalikakis, Dimitris
    NEURAL PROCESSING LETTERS, 2018, 47 (02) : 463 - 473
  • [48] On Neural Network Training Algorithm Based on the Unscented Kalman Filter
    Li Hongli
    Wang Jiang
    Che Yanqiu
    Wang Haiyang
    Chen Yingyuan
    PROCEEDINGS OF THE 29TH CHINESE CONTROL CONFERENCE, 2010, : 1447 - 1450
  • [49] Training a neural network for moment based image edge detection
    Hong-yu, Wang
    Hong-dong, Li
    Xiu-qing, Ye
    Wei-kang, Gu
    Journal of Zhejiang University-SCIENCE A, 2000, (01) : 398 - 401
  • [50] Neural network based self-tuning control for overhead crane systems
    Burananda, A
    Ngamwiwit, J
    Pamudomsup, S
    Benjanarasuth, T
    Komine, N
    SICE 2002: PROCEEDINGS OF THE 41ST SICE ANNUAL CONFERENCE, VOLS 1-5, 2002, : 1944 - 1947