Dynamic batch size tuning based on stopping criterion for neural network training

Cited by: 17
Authors
Takase, Tomoumi [1 ]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
Keywords
Neural network; Hyper-parameter tuning; Deep learning; Early stopping; Batch size; Optimization;
DOI
10.1016/j.neucom.2020.11.054
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In neural network training, the choice of minibatch size affects not only the computational cost but also the training performance, because the minibatch defines the loss function evaluated at each update step. An approach that increases the batch size during training according to a linear or step function is generally known to improve the generalization performance of a network. However, it requires the batch-size schedule to be specified in advance. In this study, we propose a more flexible method that temporarily switches to a small batch size to destabilize the loss function whenever the change in training loss satisfies a predefined stopping criterion. Repeating this destabilization step allows the parameters to avoid being trapped in local minima and to converge to a robust minimum, thereby improving generalization performance. We experimentally demonstrate the superiority of the proposed method on several benchmark datasets and neural network models. (C) 2020 Elsevier B.V. All rights reserved.
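To make the idea in the abstract concrete, the following is a minimal, self-contained Python sketch of a batch-size controller in the spirit of the method described above. It is not the authors' implementation; the class name and all defaults (BatchSizeController, nominal_batch, small_batch, patience, min_delta, destabilize_epochs) are illustrative assumptions. When the improvement in training loss over a patience window falls below a threshold (an early-stopping-style criterion), the controller temporarily switches to a small batch size for a few epochs to destabilize the loss, then restores the nominal size.

# Illustrative sketch only, not the paper's code; all names and defaults are assumptions.
from collections import deque


class BatchSizeController:
    """Chooses the minibatch size for the next training epoch.

    When the improvement in training loss over the last `patience` epochs
    falls below `min_delta` (an early-stopping-style stopping criterion),
    the controller temporarily switches to a small "destabilizing" batch
    size for `destabilize_epochs` epochs, then returns to the nominal size.
    """

    def __init__(self, nominal_batch=256, small_batch=16,
                 patience=5, min_delta=1e-3, destabilize_epochs=3):
        self.nominal_batch = nominal_batch
        self.small_batch = small_batch
        self.patience = patience
        self.min_delta = min_delta
        self.destabilize_epochs = destabilize_epochs
        self._recent_losses = deque(maxlen=patience + 1)
        self._destabilize_left = 0

    def next_batch_size(self, train_loss):
        """Record this epoch's training loss and return the batch size to use next."""
        self._recent_losses.append(train_loss)

        if self._destabilize_left > 0:           # still inside a destabilization phase
            self._destabilize_left -= 1
            return self.small_batch

        if len(self._recent_losses) > self.patience:
            improvement = self._recent_losses[0] - self._recent_losses[-1]
            if improvement < self.min_delta:     # stopping criterion met: loss has stagnated
                # Use the small batch for `destabilize_epochs` epochs in total
                # (this one plus `destabilize_epochs - 1` more).
                self._destabilize_left = self.destabilize_epochs - 1
                self._recent_losses.clear()      # restart monitoring after the perturbation
                return self.small_batch

        return self.nominal_batch


if __name__ == "__main__":
    # Toy usage: feed a stagnating loss curve and watch the controller react.
    controller = BatchSizeController()
    fake_losses = [1.0, 0.8, 0.70, 0.65, 0.64, 0.64, 0.64, 0.64, 0.64, 0.64, 0.64, 0.64]
    for epoch, loss in enumerate(fake_losses):
        print(f"epoch {epoch}: loss={loss:.4f} -> next batch size "
              f"{controller.next_batch_size(loss)}")

In practice the returned value would be used to rebuild the data loader for the following epoch, and the window size, threshold, and destabilization length would be tuned per task.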
Pages: 1 - 11
Page count: 11
Related papers
50 entries in total
  • [31] Expert judgement-based tuning of the system reliability neural network. Brandowski, A.; Hoang Nguyen; Frackowiak, Wojciech. POLISH MARITIME RESEARCH, 2014, 21(01): 28-34.
  • [32] Tuning the structure and parameters of a neural network by a new network model based on genetic algorithms. Li, Xiangmei. International Journal of Digital Content Technology and its Applications, 2012, 6(11): 29-36.
  • [33] Event-based Neural Network for ECG Classification with Delta Encoding and Early Stopping. Jobst, Matthias; Liu, Chen; Partzsch, Johannes; Yan, Yexin; Kappel, David; Gonzalez, Hector A.; Ji, Yue; Vogginger, Bernhard; Mayr, Christian. 2020 6TH INTERNATIONAL CONFERENCE ON EVENT-BASED CONTROL, COMMUNICATION, AND SIGNAL PROCESSING (EBCCSP), 2020.
  • [34] AntiDoteX: Attention-Based Dynamic Optimization for Neural Network Runtime Efficiency. Yu, Fuxun; Xu, Zirui; Liu, Chenchen; Stamoulis, Dimitrios; Wang, Di; Wang, Yanzhi; Chen, Xiang. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41(11): 4694-4707.
  • [35] Hyperparameters optimization for neural network training using Fractal Decomposition-based Algorithm. Souquet, Leo; Shvai, Nadiya; Llanza, Arcadi; Nakib, Amir. 2020 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2020.
  • [36] Effective neural network training with adaptive learning rate based on training loss. Takase, Tomoumi; Oyama, Satoshi; Kurihara, Masahito. NEURAL NETWORKS, 2018, 101: 68-78.
  • [37] Dynamic digital watermark technique based on neural network. Gu Tao; Li Xu. INDEPENDENT COMPONENT ANALYSES, WAVELETS, UNSUPERVISED NANO-BIOMIMETIC SENSORS, AND NEURAL NETWORKS VI, 2008, 6979.
  • [38] Dynamic System Modeling Based on Recurrent Neural Network. Cao, Wenjie; Zhang, Cheng; Xiong, Zhenzhen; Wang, Ting; Chen, Junchao; Zhang, Bengong. PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021), 2021: 37-41.
  • [39] Blind Estimation of Spreading Sequence Based on Neural Network by A Novel Information Criterion. Lin, Guo; Ming, Lv; Bin, Tang. ICCEE 2008: PROCEEDINGS OF THE 2008 INTERNATIONAL CONFERENCE ON COMPUTER AND ELECTRICAL ENGINEERING, 2008: 78-81.
  • [40] Differential Evolution-based Neural Network Training Incorporating a Centroid-based Strategy and Dynamic Opposition-based Learning. Mousavirad, Seyed Jalaleddin; Oliva, Diego; Hinojosa, Salvador; Schaefer, Gerald. 2021 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC 2021), 2021: 1233-1240.