Reinforcement Learning for Partially Observable Linear Gaussian Systems Using Batch Dynamics of Noisy Observations

Cited by: 2
Authors
Yaghmaie, Farnaz Adib [1 ]
Modares, Hamidreza [2 ]
Gustafsson, Fredrik [1 ]
Affiliations
[1] Linkoping Univ, Fac Elect Engn, S-58183 Linkoping, Sweden
[2] Michigan State Univ, Coll Engn, E Lansing, MI 48824 USA
Funding
Swedish Research Council; U.S. National Science Foundation;
Keywords
Costs; History; Noise; Dynamical systems; Noise measurement; Heuristic algorithms; Data models; Linear quadratic Gaussian; partially observable dynamical systems; reinforcement learning;
DOI
10.1109/TAC.2024.3385680
Chinese Library Classification (CLC): TP [Automation Technology, Computer Technology];
Discipline Classification Code: 0812;
Abstract
Reinforcement learning algorithms are commonly used to control dynamical systems whose state variables are measurable. If the dynamical system is only partially observable, reinforcement learning algorithms must be modified to compensate for the effect of partial observability. One common approach is to feed the controller a finite history of input-output data instead of the state variable. In this article, we study and quantify the effect of this approach in linear Gaussian systems with quadratic costs. We coin the concept of L-Extra-Sampled-dynamics to formalize the idea of using a finite history of input-output data in place of the state, and we show that this approach increases the average cost.
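To make the history-based idea concrete, below is a minimal sketch (not the authors' code) of controlling a partially observable linear Gaussian system with a stacked finite history of the last L inputs and outputs used as a surrogate state. The system matrices A, B, C, the noise covariances, the history length L, and the zero feedback gain K are illustrative assumptions, not values from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Partially observable linear Gaussian system (illustrative matrices):
#   x_{t+1} = A x_t + B u_t + w_t,   y_t = C x_t + v_t
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])           # only the first state is observed
Qw, Rv = 0.01 * np.eye(2), 0.01 * np.eye(1)

L = 4                                 # history length (assumed)
T = 200                               # simulation horizon
x = np.zeros(2)
y_hist = [np.zeros(1)] * L            # last L observations
u_hist = [np.zeros(1)] * L            # last L inputs
K = np.zeros((1, 2 * L))              # placeholder linear policy on the history

costs = []
for t in range(T):
    # Surrogate "state": stacked finite history of outputs and inputs
    z = np.concatenate(y_hist + u_hist)
    u = -K @ z                        # history-feedback control law

    # Quadratic stage cost x'Qx + u'Ru with Q = I, R = I (assumed)
    costs.append(float(x @ x + u @ u))

    # Propagate the true (unobserved) state and take a noisy measurement
    w = rng.multivariate_normal(np.zeros(2), Qw)
    v = rng.multivariate_normal(np.zeros(1), Rv)
    y = C @ x + v
    x = A @ x + B.flatten() * u.item() + w

    # Slide the history window forward
    y_hist = y_hist[1:] + [y]
    u_hist = u_hist[1:] + [u]

print("average cost:", np.mean(costs))
```

In this setup a reinforcement learning method would tune K from data; the article's result is that, compared with feedback on the true state, any such history-based policy incurs an increased average cost, which the L-Extra-Sampled-dynamics framework quantifies.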
Pages: 6397-6404
Number of pages: 8