A Hessian-Free Gradient Flow (HFGF) method for the optimisation of deep learning neural networks

Cited by: 3
Authors
Zhang, Sushen [1]
Chen, Ruijuan [2]
Du, Wenyu [3]
Yuan, Ye [2]
Vassiliadis, Vassilios S. [4]
Affiliations
[1] Univ Cambridge, Dept Chem Engn & Biotechnol, Cambridge, England
[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan, Peoples R China
[3] Westlake Univ, Sch Engn, Inst Adv Technol, Westlake Inst Adv Study, Hangzhou, Zhejiang, Peoples R China
[4] Cambridge Simulat Solut LTD, Cambridge, England
Keywords
Optimisation; Neural Networks; Deep learning; Truncated Newton; Gradient Flow; Hessian-free;
DOI
10.1016/j.compchemeng.2020.107008
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
This paper presents a novel optimisation method, termed Hessian-free Gradient Flow, for the optimisation of deep neural networks. The algorithm combines design characteristics of the Truncated Newton, Conjugate Gradient and Gradient Flow methods. It employs a finite difference approximation scheme to make the algorithm Hessian-free and uses the Armijo conditions to determine the descent condition. The method is first tested on standard test functions of high dimensionality. Performance on these test functions demonstrates the algorithm's potential for large-scale optimisation problems. The algorithm is then tested on classification and regression tasks using real-world datasets, and performance comparable to conventional optimisers is obtained in both cases. (C) 2020 Published by Elsevier Ltd.
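The abstract names two ingredients: a finite difference scheme that avoids forming the Hessian, and an Armijo condition for accepting descent steps. The minimal Python sketch below illustrates both on a small quadratic. It is not the authors' HFGF implementation; the function names, the forward-difference formula, the parameters (`eps`, `c1`, `shrink`) and the test function are all assumptions chosen for illustration.

```python
# Illustrative sketch only; NOT the HFGF algorithm from the paper.
# All names and parameter values here are hypothetical.

def hessian_vector_product(grad, x, v, eps=1e-6):
    """Approximate H(x) v by a forward difference of the gradient:
    H(x) v ~ (grad(x + eps * v) - grad(x)) / eps, so H is never formed."""
    g0 = grad(x)
    g1 = grad([xi + eps * vi for xi, vi in zip(x, v)])
    return [(a - b) / eps for a, b in zip(g1, g0)]


def armijo_step(f, grad, x, direction, alpha=1.0, c1=1e-4, shrink=0.5,
                max_backtracks=50):
    """Backtrack until the Armijo sufficient-decrease condition holds:
    f(x + alpha * d) <= f(x) + c1 * alpha * grad(x)^T d."""
    fx = f(x)
    slope = sum(gi * di for gi, di in zip(grad(x), direction))
    for _ in range(max_backtracks):
        trial = [xi + alpha * di for xi, di in zip(x, direction)]
        if f(trial) <= fx + c1 * alpha * slope:
            return alpha, trial
        alpha *= shrink
    return 0.0, list(x)  # no acceptable step found; stay at x


# Assumed test problem: f(x) = x0^2 + 5 * x1^2, whose Hessian is diag(2, 10).
def f(x):
    return x[0] ** 2 + 5.0 * x[1] ** 2


def grad_f(x):
    return [2.0 * x[0], 10.0 * x[1]]


x0 = [1.0, 1.0]
hv = hessian_vector_product(grad_f, x0, [1.0, 0.0])   # close to [2.0, 0.0]
steepest = [-g for g in grad_f(x0)]                   # descent direction
alpha, x1 = armijo_step(f, grad_f, x0, steepest)
```

In a truncated Newton setting, a product such as `hessian_vector_product` would typically be called inside a conjugate gradient loop, so only gradient evaluations are needed; how the paper combines this with gradient flow is not specified in the abstract and is not reproduced here.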
Pages: 8