Linear Quadratic Gaussian using Kalman Network and Reinforcement Learning for Discrete-Time System

Cited by: 2

Authors:

Putri, Adi Novitarini [1]

Machbub, Carmadi [1]

Mahayana, Dimitri [1]

Hidayat, Egi M. Idris [1]

Affiliation:

[1] Inst Teknol Bandung, Sch Elect Engn & Informat, Control & Comp Syst Res Grp, Jl Ganesha 10, Bandung 40132, Indonesia

Source:

2022 12TH INTERNATIONAL CONFERENCE ON SYSTEM ENGINEERING AND TECHNOLOGY (ICSET 2022) | 2022

Keywords:

Linear Quadratic Gaussian (LQG); optimal control; reinforcement learning; Kalman filter; estimator; OPTIMAL TRACKING CONTROL;

DOI:

10.1109/ICSET57543.2022.10010824

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

A problem arises when implementing a control scheme on a system: the limited number of sensors means that not all state information from the plant can be obtained. In addition, the measurement process often contains noise. Implementing an optimal control scheme directly therefore becomes difficult. The state estimation method proposed in this study uses an Artificial Neural Network (ANN) as a function approximator. Meanwhile, optimal control implementations have also developed rapidly, yet some existing optimal control solutions can only be carried out offline and depend on a dynamic model of the system to be controlled. The combination of KalmanNet and the Value Iteration (VI) algorithm is used to take over the role of the Linear Quadratic Gaussian (LQG) method in designing a high-performance controller. Tests were carried out on a cart-pole system in the discrete-time domain. With this combination of methods, the closed-loop system is driven toward 0 and the evolution of the estimator gain is smoother, with a norm of 2.33S4E-04.

Citation

Pages: 54-60

Page count: 7

28 references in total

[1] A Direct Derivation of Optimal Linear Filter Using Maximum Principle
Athans, M
Tse, E
[J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1967, AC12 (06) : 690 - +
[2] Bertsekas D., 1996, NEURO DYNAMIC PROGRA
[3] Adaptive Dynamic Programming for Stochastic Systems With State and Control Dependent Noise
Bian, Tao
Jiang, Yu
Jiang, Zhong-Ping
[J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2016, 61 (12) : 4170 - 4175
[4] Brammer K., 1989, KALMAN BUCY FILTERS
[5] Bryson AE, 2018, Applied Optimal Control: Optimization, Estimation and Control
[6] Cahuantzi R., 2021, arXiv, DOI 10.48550/arXiv.2107.02248
[7] Gnecco G, 2017, NEURAL COMPUT, V29, P2203, DOI 10.1162/NECO_a_00976
[8] Graves A, 2012, STUD COMPUT INTELL, V385, P1, DOI [10.1007/978-3-642-24797-2, 10.1162/neco.1997.9.1.1]
[9] Ha M., 2020, VALUE ITERATION BASE
[10] Stability Analysis of Optimal Adaptive Control Under Value Iteration Using a Stabilizing Initial Policy
Heydari, Ali
[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (09) : 4522 - 4527
