A Hessian-Free Gradient Flow (HFGF) method for the optimisation of deep learning neural networks

Cited by: 3
Authors
Zhang, Sushen [1]
Chen, Ruijuan [2]
Du, Wenyu [3]
Yuan, Ye [2]
Vassiliadis, Vassilios S. [4]
Affiliations
[1] Univ Cambridge, Dept Chem Engn & Biotechnol, Cambridge, England
[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan, Peoples R China
[3] Westlake Univ, Sch Engn, Inst Adv Technol, Westlake Inst Adv Study, Hangzhou, Zhejiang, Peoples R China
[4] Cambridge Simulat Solut LTD, Cambridge, England
Keywords
Optimisation; Neural Networks; Deep learning; Truncated Newton; Gradient Flow; Hessian-free;
DOI
10.1016/j.compchemeng.2020.107008
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
This paper presents a novel optimisation method, termed Hessian-free Gradient Flow, for training deep neural networks. The algorithm combines design characteristics of the Truncated Newton, Conjugate Gradient and Gradient Flow methods. It employs a finite difference approximation scheme to make the algorithm Hessian-free and uses the Armijo conditions to test for descent. The method is first tested on standard high-dimensional test functions, where its performance demonstrates its potential for large-scale optimisation problems. The algorithm is then tested on classification and regression tasks using real-world datasets, and achieves performance comparable to conventional optimisers in both cases. (C) 2020 Published by Elsevier Ltd.
Pages: 8
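
The abstract names two ingredients: a finite difference approximation that avoids forming the Hessian, and an Armijo test for descent. The sketch below illustrates these two generic techniques only. It is not the paper's algorithm: the abstract does not specify the difference scheme, step sizes or constants, so the forward difference, the values of `eps`, `c1` and `shrink`, and every function name here are assumptions chosen for illustration.

```python
from typing import Callable, List

Vector = List[float]


def fd_hessian_vector(grad: Callable[[Vector], Vector], x: Vector, v: Vector,
                      eps: float = 1e-6) -> Vector:
    """Approximate H(x) v with a forward difference of two gradients.

    The Hessian itself is never built; only gradient evaluations are used.
    (Assumed scheme: forward difference with a fixed step eps.)
    """
    g0 = grad(x)
    g1 = grad([xi + eps * vi for xi, vi in zip(x, v)])
    return [(b - a) / eps for a, b in zip(g0, g1)]


def armijo_holds(f: Callable[[Vector], float], x: Vector, g: Vector, d: Vector,
                 alpha: float, c1: float = 1e-4) -> bool:
    """Armijo sufficient-decrease test: f(x + alpha d) <= f(x) + c1 alpha g.d"""
    slope = sum(gi * di for gi, di in zip(g, d))
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    return f(x_new) <= f(x) + c1 * alpha * slope


def backtrack(f: Callable[[Vector], float], x: Vector, g: Vector, d: Vector,
              alpha: float = 1.0, shrink: float = 0.5, max_steps: int = 30) -> float:
    """Shrink alpha until the Armijo test passes; return 0.0 if it never does."""
    for _ in range(max_steps):
        if armijo_holds(f, x, g, d, alpha):
            return alpha
        alpha *= shrink
    return 0.0


# Hypothetical test problem: a 2-D quadratic with Hessian diag(1, 10).
def f(x: Vector) -> float:
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)


def grad(x: Vector) -> Vector:
    return [x[0], 10.0 * x[1]]


if __name__ == "__main__":
    x0 = [1.0, 2.0]
    g0 = grad(x0)
    print(fd_hessian_vector(grad, x0, [1.0, 1.0]))        # close to [1, 10]
    print(backtrack(f, x0, g0, [-gi for gi in g0]))       # accepted step length
```

In a truncated Newton setting, a routine like `fd_hessian_vector` would typically supply the matrix-vector products needed by a Conjugate Gradient inner loop, and the Armijo test would gate the resulting step. How the paper couples these with the Gradient Flow formulation is not stated in the abstract.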