Neural Network Training Loss Optimization Utilizing the Sliding Innovation Filter

Cited by: 14
Authors
Alsadi, Naseem [1 ]
Hilal, Waleed [1 ]
Surucu, Onur [1 ]
Giuliano, Alessandro [1 ]
Gadsden, Stephen A. [1 ]
Yawney, John [2 ]
AlShabi, Mohammad [3 ]
Affiliations
[1] McMaster Univ, 1280 Main St W, Hamilton, ON L8S 4L8, Canada
[2] Adastra Corp, 8500 Leslie St 600, Thornhill, ON L3T 7M8, Canada
[3] Univ Sharjah, Sharjah, U Arab Emirates
Source
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR MULTI-DOMAIN OPERATIONS APPLICATIONS IV | 2022 / Vol. 12113
Keywords
Optimization; Estimation Theory; Kalman Filter; Gradient Descent; Smooth Variable Structure Filter; Sliding Modes; State-Space Methods; Deep Learning; Machine Learning; Neural Networks; EXTENDED KALMAN FILTER; BACKPROPAGATION;
DOI
10.1117/12.2619029
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Artificial feedforward neural networks (ANNs) have traditionally been trained by backpropagation with gradient descent, optimizing the network's weights and parameters during the training phase to minimize the out-of-sample error at test time. However, gradient descent (GD) has been shown to be slow and computationally inefficient compared with studies implementing the extended Kalman filter (EKF) and unscented Kalman filter (UKF) as optimizers for ANNs. In this paper, a new method of training ANNs is proposed that utilizes the sliding innovation filter (SIF). The SIF, introduced by Gadsden et al., has been demonstrated to be a more robust predictor-corrector than the Kalman filters, particularly in ill-conditioned situations or in the presence of modelling uncertainties. We propose implementing the SIF as an optimizer for training ANNs. The proposed ANN is trained with the SIF to predict the Mackey-Glass chaotic series, and the results demonstrate that the proposed method improves computation time compared to current estimation strategies for training ANNs while achieving accuracy comparable to a UKF-trained neural network.
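The record does not include the paper's implementation, but the training scheme the abstract describes (treating the flattened network weights as the state of a filter, the training target as a measurement, and applying an SIF corrector to the weights) can be illustrated with a minimal sketch. The Python example below is a hypothetical illustration, not the authors' code: the network size, the boundary-layer width delta, and the Mackey-Glass parameters are all illustrative assumptions, and the gain K = pinv(H) * sat(|e|/delta) follows the general SIF corrector form for a scalar measurement.

import numpy as np

def mackey_glass(n, tau=17, beta=0.2, gamma=0.1, p=10, dt=1.0, x0=1.2):
    """Generate a Mackey-Glass series by simple Euler integration."""
    x = np.full(n + tau, x0)                     # constant initial history
    for t in range(tau, n + tau - 1):
        x[t + 1] = x[t] + dt * (beta * x[t - tau] / (1 + x[t - tau] ** p)
                                - gamma * x[t])
    return x[tau:]

# Tiny feedforward network: y = w2 @ tanh(W1 @ u + b1) + b2, with all
# weights flattened into one state vector theta estimated by the filter.
n_in, n_hid = 4, 8
sizes = [n_hid * n_in, n_hid, n_hid, 1]          # W1, b1, w2, b2
splits = np.cumsum(sizes)[:-1]

def forward(theta, u):
    W1, b1, w2, b2 = np.split(theta, splits)
    h = np.tanh(W1.reshape(n_hid, n_in) @ u + b1)
    return float(w2 @ h + b2), h

def jacobian(theta, u, h):
    """dy/dtheta for the scalar output: the 1 x n measurement matrix H."""
    W1, b1, w2, b2 = np.split(theta, splits)
    dh = (1 - h ** 2) * w2                       # backprop through tanh
    return np.concatenate([np.outer(dh, u).ravel(), dh, h, [1.0]])

rng = np.random.default_rng(0)
theta = 0.1 * rng.standard_normal(sum(sizes))    # initial weight estimate
series = mackey_glass(600)
delta = 0.5                                      # SIF boundary layer width (hand-tuned here)

for t in range(n_in, len(series) - 1):
    u = series[t - n_in:t]                       # past samples as input
    z = series[t + 1]                            # next sample as measurement
    y, h = forward(theta, u)                     # prediction step
    H = jacobian(theta, u, h)
    e = z - y                                    # innovation
    # SIF-style gain: pseudoinverse of H scaled by the saturated innovation,
    # which drives the estimate toward a boundary layer of width delta.
    K = np.linalg.pinv(H[None, :]) * np.clip(abs(e) / delta, 0.0, 1.0)
    theta = theta + (K * e).ravel()              # corrector step on the weights

y_hat, _ = forward(theta, series[-n_in - 1:-1])
print(f"final one-step prediction error: {series[-1] - y_hat:+.4f}")

In this sketch the saturation term plays the role of the SIF's switching gain: large innovations produce full corrections, while innovations inside the boundary layer are scaled down, which is the robustness mechanism the abstract credits to the SIF.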
Pages: 13