A novel neural network training framework with data assimilation

Cited by: 0
Authors
Chong Chen
Yixuan Dou
Jie Chen
Yaru Xue
Affiliations
[1] China University of Petroleum-Beijing, College of Information Science and Engineering
Source
The Journal of Supercomputing | 2022 / Vol. 78
Keywords
Neural networks; Training algorithm; Data assimilation; EnKF; ESMDA
DOI
Not available
Abstract
In recent years, the prosperity of deep learning has revolutionized Artificial Neural Networks. However, the dependence on gradients and the offline training mechanism of existing learning algorithms prevent Artificial Neural Networks from further improvement. In this study, a gradient-free training framework based on data assimilation is proposed to avoid the calculation of gradients. In data assimilation algorithms, the error covariance between forecasts and observations is used to optimize the states. Feedforward Neural Networks are trained by gradient descent and by two data assimilation algorithms, the Ensemble Kalman Filter (EnKF) and the Ensemble Smoother with Multiple Data Assimilation (ESMDA), respectively. ESMDA trains Feedforward Neural Networks over a pre-defined number of iterations by updating the parameters (i.e., the states) using all available observations, which can be regarded as offline learning. EnKF optimizes Feedforward Neural Networks whenever a new observation becomes available, which can be regarded as real-time learning. Two synthetic cases, the regression of a Sine function and of a Mexican Hat function, are conducted to validate the effectiveness of the proposed framework. Quantitative comparisons using the root mean square error and the coefficient of determination show that the proposed framework achieves better performance than gradient descent. Furthermore, the uncertainty of the parameters is quantified, showing a reduction in uncertainty over the ESMDA iterations. The proposed framework thus offers an alternative for real-time/offline training of existing Artificial Neural Networks (e.g., Convolutional Neural Networks, Recurrent Neural Networks) without dependence on gradients, while conducting uncertainty analysis at the same time.
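The two training modes described in the abstract can be made concrete with a short sketch. Below is a minimal, illustrative Python implementation of the ensemble parameter update the paper describes, applied in both ESMDA fashion (offline: all observations assimilated several times with inflated observation error) and EnKF fashion (online: one observation at a time). The network size, ensemble size, noise level, number of assimilation passes, and all names (`fnn`, `ensemble_update`, `esmda_train`, `enkf_train`) are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN = 16
N_PARAMS = 3 * HIDDEN + 1  # w1 (1x16) + b1 (16) + w2 (16x1) + b2 (1)

def fnn(theta, x):
    """Tiny 1-16-1 tanh feedforward net; theta is a flat parameter vector."""
    w1 = theta[:HIDDEN].reshape(1, HIDDEN)
    b1 = theta[HIDDEN:2 * HIDDEN]
    w2 = theta[2 * HIDDEN:3 * HIDDEN].reshape(HIDDEN, 1)
    b2 = theta[3 * HIDDEN]
    return (np.tanh(x @ w1 + b1) @ w2 + b2).ravel()

def ensemble_update(theta, y_pred, d, r_var):
    """One Kalman-type ensemble update of the parameters (the 'states').

    theta: (n_ens, n_params) parameter ensemble
    y_pred: (n_ens, n_obs) forecasts; d: (n_obs,) observations
    r_var: observation-error variance (already inflated for ESMDA)
    """
    n_ens = theta.shape[0]
    A = theta - theta.mean(0)                       # parameter anomalies
    B = y_pred - y_pred.mean(0)                     # forecast anomalies
    c_ty = A.T @ B / (n_ens - 1)                    # cross-covariance C_theta,y
    c_yy = B.T @ B / (n_ens - 1)                    # forecast covariance C_yy
    gain = c_ty @ np.linalg.inv(c_yy + r_var * np.eye(len(d)))
    d_pert = d + rng.normal(0.0, np.sqrt(r_var), size=y_pred.shape)
    return theta + (d_pert - y_pred) @ gain.T

def esmda_train(x, d, n_ens=100, n_assim=4, sigma=0.05):
    """Offline ESMDA: assimilate all observations n_assim times,
    inflating the observation error by alpha = n_assim each pass."""
    theta = rng.normal(0.0, 0.5, size=(n_ens, N_PARAMS))  # prior ensemble
    for _ in range(n_assim):
        y_pred = np.stack([fnn(t, x) for t in theta])
        theta = ensemble_update(theta, y_pred, d, n_assim * sigma**2)
    return theta

def enkf_train(x, d, n_ens=100, sigma=0.05):
    """Online EnKF: assimilate each observation as it 'arrives'."""
    theta = rng.normal(0.0, 0.5, size=(n_ens, N_PARAMS))
    for i in range(len(d)):                               # sequential updates
        xi, di = x[i:i + 1], d[i:i + 1]
        y_pred = np.stack([fnn(t, xi) for t in theta])
        theta = ensemble_update(theta, y_pred, di, sigma**2)
    return theta

# Sine regression case (synthetic data; assumed setup)
x = np.linspace(-np.pi, np.pi, 50).reshape(-1, 1)
d = np.sin(x).ravel() + rng.normal(0.0, 0.05, size=50)
theta = esmda_train(x, d)
mean_pred = np.stack([fnn(t, x) for t in theta]).mean(0)
rmse = np.sqrt(np.mean((mean_pred - np.sin(x).ravel()) ** 2))
print(f"ESMDA ensemble-mean RMSE: {rmse:.4f}")
```

With equal inflation factors alpha_i = n_assim, the standard ESMDA condition sum_i (1/alpha_i) = 1 is satisfied. The spread of the final parameter ensemble provides the uncertainty estimate mentioned in the abstract, and it shrinks with each assimilation pass.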
Pages: 19020-19045 (25 pages)