A Hessian-Free Gradient Flow (HFGF) method for the optimisation of deep learning neural networks

Cited by: 3

Authors:

Zhang, Sushen [1]

Chen, Ruijuan [2]

Du, Wenyu [3]

Yuan, Ye [2]

Vassiliadis, Vassilios S. [4]

Affiliations:

[1] Univ Cambridge, Dept Chem Engn & Biotechnol, Cambridge, England

[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan, Peoples R China

[3] Westlake Univ, Sch Engn, Inst Adv Technol, Westlake Inst Adv Study, Hangzhou, Zhejiang, Peoples R China

[4] Cambridge Simulat Solut LTD, Cambridge, England

Source:

COMPUTERS & CHEMICAL ENGINEERING | 2020 / Vol. 141

Keywords:

Optimisation; Neural Networks; Deep learning; Truncated Newton; Gradient Flow; Hessian-free

DOI:

10.1016/j.compchemeng.2020.107008

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

This paper presents a novel optimisation method, termed Hessian-free Gradient Flow, for the optimisation of deep neural networks. The algorithm combines design characteristics of the Truncated Newton, Conjugate Gradient and Gradient Flow methods. It employs a finite difference approximation scheme to make the algorithm Hessian-free and uses the Armijo conditions to determine the descent condition. The method is first tested on standard test functions of high dimensionality. Performance on these test functions demonstrates the potential of the algorithm for large-scale optimisation problems. The algorithm is then tested on classification and regression tasks using real-world datasets. Performance comparable to that of conventional optimisers is obtained in both cases. (C) 2020 Published by Elsevier Ltd.
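The abstract names two standard building blocks: a finite-difference scheme that avoids forming the Hessian, and an Armijo descent test. The following is a minimal Python sketch of those two generic techniques only, not the authors' HFGF algorithm. The function names, the step size `eps`, the constants `c1` and `shrink`, and the toy quadratic are all assumptions chosen for illustration.

```python
import numpy as np


def hessian_vector_product(grad, x, v, eps=1e-6):
    """Approximate H(x) @ v by a forward difference of gradients.

    Only two gradient evaluations are needed; the Hessian is never formed.
    (eps is an assumed, illustrative step size.)
    """
    return (grad(x + eps * v) - grad(x)) / eps


def armijo_step(f, grad, x, d, alpha=1.0, c1=1e-4, shrink=0.5, max_halvings=50):
    """Backtrack until f(x + alpha*d) <= f(x) + c1 * alpha * grad(x).d holds."""
    fx = f(x)
    slope = grad(x) @ d  # directional derivative; negative for a descent direction
    for _ in range(max_halvings):
        if f(x + alpha * d) <= fx + c1 * alpha * slope:
            return alpha
        alpha *= shrink
    return 0.0  # no acceptable step found within the budget


# Toy problem (assumed for illustration): f(x) = 0.5 * x^T A x, so H = A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])


def f(x):
    return 0.5 * x @ A @ x


def grad(x):
    return A @ x


x = np.array([1.0, -2.0])
v = np.array([0.5, 1.0])
d = -grad(x)  # steepest-descent direction

hv = hessian_vector_product(grad, x, v)
step = armijo_step(f, grad, x, d)
```

On this quadratic the finite-difference product can be checked against `A @ v` directly, and the accepted step can be checked to reduce `f`. In a truncated Newton setting, a product of this kind would typically be supplied to a conjugate gradient solver in place of an explicit Hessian.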

References

Pages: 8

41 references in total

