Dynamic batch size tuning based on stopping criterion for neural network training

Cited by: 17
Authors:
Takase, Tomoumi [1]
Affiliation:
[1] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
Keywords:
Neural network; Hyper-parameter tuning; Deep learning; Early stopping; Batch size; Optimization;
DOI:
10.1016/j.neucom.2020.11.054
CLC number:
TP18 [Artificial Intelligence Theory];
Discipline codes:
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In neural network training, the choice of minibatch size affects not only the computational cost but also the training performance, because it determines how the loss function is estimated. An approach that increases the batch size during training according to a linear or step schedule is generally known to improve the generalization performance of a network. However, it requires the schedule for increasing the batch size to be specified beforehand. In this study, we propose a more flexible method that temporarily switches to a small batch size to destabilize the loss function whenever the change in training loss satisfies a predefined stopping criterion. Repeating this destabilization step allows the parameters to escape local minima and converge to a robust minimum, thereby improving generalization performance. We experimentally demonstrate the superiority of the proposed method on several benchmark datasets and neural network models. (C) 2020 Elsevier B.V. All rights reserved.
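The abstract only outlines the idea; the paper's exact stopping criterion and batch-size values are not reproduced here. Below is a minimal sketch, assuming a PyTorch-style training loop, of how a stall in the training loss (an early-stopping-like criterion) could trigger a temporary switch to a small batch size before the original batch size is restored. The model, data, threshold, patience, and batch sizes are illustrative placeholders, not the paper's settings.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy data and model; placeholders standing in for a real benchmark setup.
X, y = torch.randn(2048, 20), torch.randint(0, 2, (2048,))
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

BASE_BATCH, SMALL_BATCH = 256, 16     # illustrative sizes, not the paper's values
THRESHOLD, DESTAB_EPOCHS = 1e-3, 2    # assumed stopping criterion and destabilization length

def run_epoch(batch_size):
    """Train for one epoch with the given batch size and return the mean training loss."""
    loader = DataLoader(TensorDataset(X, y), batch_size=batch_size, shuffle=True)
    total, count = 0.0, 0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
        total += loss.item() * xb.size(0)
        count += xb.size(0)
    return total / count

prev_loss, destab_left = float("inf"), 0
for epoch in range(30):
    # Use the small batch size while a destabilization phase is active.
    batch_size = SMALL_BATCH if destab_left > 0 else BASE_BATCH
    epoch_loss = run_epoch(batch_size)
    destab_left = max(destab_left - 1, 0)
    # Stopping-criterion check: if the loss has nearly stopped improving,
    # start a temporary small-batch phase to destabilize the loss function.
    if destab_left == 0 and prev_loss - epoch_loss < THRESHOLD:
        destab_left = DESTAB_EPOCHS
    prev_loss = epoch_loss
    print(f"epoch {epoch:2d}  batch {batch_size:3d}  loss {epoch_loss:.4f}")
```

The point of the sketch is the control flow: the batch size changes in response to observed loss behavior rather than following a schedule fixed before training, which is the flexibility the abstract contrasts with linear or step batch-size schedules.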
Pages: 1 - 11
Page count: 11
Related Papers
50 in total
  • [21] Auto-tuning Spark Configurations Based on Neural Network
    Gu, Jing
    Li, Ying
    Tang, Hongyan
    Wu, Zhonghai
    2018 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2018,
  • [22] Effect of Hidden Neuron Size on Different Training Algorithm in Neural Network
    Kumar, Arvind
    Sodhi, Sartaj Singh
COMMUNICATIONS IN MATHEMATICS AND APPLICATIONS, 2020, 13 (01): 351 - 365
  • [23] Comparison of Early Stopping Criteria for Neural-Network-Based Subpixel Classification
    Shao, Yang
    Taff, Gregory N.
    Walsh, Stephen J.
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2011, 8 (01) : 113 - 117
  • [24] Tuning parameters of deep neural network training algorithms pays off: a computational study
    Coppola, Corrado
    Papa, Lorenzo
    Boresta, Marco
    Amerini, Irene
    Palagi, Laura
    TOP, 2024, 32 (03) : 579 - 620
  • [25] Neural network based estimation of a semi-batch polymerisation reactor
    Yang, SH
    Chung, PWH
    Brooks, BW
    COMPUTERS & CHEMICAL ENGINEERING, 1999, 23 : S443 - S446
  • [26] Optimize neural network based in-loop filters through iterative training
    Wang, Liqiang
    Xu, Xiaozhong
    Liu, Shan
    2022 PICTURE CODING SYMPOSIUM (PCS), 2022, : 367 - 371
  • [27] Enhancing Distributed Neural Network Training Through Node-Based Communications
    Moreno-Alvarez, Sergio
    Paoletti, Mercedes E.
    Cavallaro, Gabriele
    Haut, Juan M.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 35 (12) : 17893 - 17907
  • [28] FPGA-based Acceleration of Neural Network Training
    Sang, Ruoyu
    Liu, Qiang
    Zhang, Qijun
    2016 IEEE MTT-S INTERNATIONAL CONFERENCE ON NUMERICAL ELECTROMAGNETIC AND MULTIPHYSICS MODELING AND OPTIMIZATION (NEMO), 2016,
  • [29] An Adaptive Memory Multi-Batch L-BFGS Algorithm for Neural Network Training
    Zocco, Federico
    McLoone, Sean
IFAC PAPERSONLINE, 2020, 53 (02): 8199 - 8204
  • [30] Agent based Adaptive Firefly Back-propagation Neural Network Training Method for Dynamic Systems
    Nandy, Sudarshan
    Karmakar, Manoj
    Sarkar, Partha Pratim
    Das, Achintya
    Abraham, Ajith
    Paul, Diptarup
    2012 12TH INTERNATIONAL CONFERENCE ON HYBRID INTELLIGENT SYSTEMS (HIS), 2012, : 449 - 454