Reinforcement Learning for Partially Observable Linear Gaussian Systems Using Batch Dynamics of Noisy Observations

Cited by: 2
Authors
Yaghmaie, Farnaz Adib [1 ]
Modares, Hamidreza [2 ]
Gustafsson, Fredrik [1 ]
Affiliations
[1] Linkoping Univ, Fac Elect Engn, S-58183 Linkoping, Sweden
[2] Michigan State Univ, Coll Engn, E Lansing, MI 48824 USA
Funding
Swedish Research Council; U.S. National Science Foundation;
Keywords
Costs; History; Noise; Dynamical systems; Noise measurement; Heuristic algorithms; Data models; Linear quadratic Gaussian; partially observable dynamical systems; reinforcement learning;
DOI
10.1109/TAC.2024.3385680
Chinese Library Classification (CLC): TP [Automation Technology, Computer Technology];
Discipline Classification Code: 0812;
Abstract
Reinforcement learning algorithms are commonly used to control dynamical systems whose state variables are measurable. If the dynamical system is only partially observable, reinforcement learning algorithms must be modified to compensate for the effect of partial observability. One common approach is to feed the controller a finite history of input-output data instead of the state variable. In this article, we study and quantify the effect of this approach in linear Gaussian systems with quadratic costs. We coin the concept of L-Extra-Sampled-dynamics to formalize the idea of using a finite history of input-output data in place of the state, and we show that this approach increases the average cost.
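To make the history-based idea concrete, below is a minimal sketch (not the authors' code) of controlling a partially observable linear Gaussian system with a stacked finite history of the last L inputs and outputs used as a surrogate state. The system matrices A, B, C, the noise covariances, the history length L, and the zero feedback gain K are illustrative assumptions, not values from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Partially observable linear Gaussian system (illustrative matrices):
#   x_{t+1} = A x_t + B u_t + w_t,   y_t = C x_t + v_t
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])           # only the first state is observed
Qw, Rv = 0.01 * np.eye(2), 0.01 * np.eye(1)

L = 4                                 # history length (assumed)
T = 200                               # simulation horizon
x = np.zeros(2)
y_hist = [np.zeros(1)] * L            # last L observations
u_hist = [np.zeros(1)] * L            # last L inputs
K = np.zeros((1, 2 * L))              # placeholder linear policy on the history

costs = []
for t in range(T):
    # Surrogate "state": stacked finite history of outputs and inputs
    z = np.concatenate(y_hist + u_hist)
    u = -K @ z                        # history-feedback control law

    # Quadratic stage cost x'Qx + u'Ru with Q = I, R = I (assumed)
    costs.append(float(x @ x + u @ u))

    # Propagate the true (unobserved) state and take a noisy measurement
    w = rng.multivariate_normal(np.zeros(2), Qw)
    v = rng.multivariate_normal(np.zeros(1), Rv)
    y = C @ x + v
    x = A @ x + B.flatten() * u.item() + w

    # Slide the history window forward
    y_hist = y_hist[1:] + [y]
    u_hist = u_hist[1:] + [u]

print("average cost:", np.mean(costs))
```

In this setup a reinforcement learning method would tune K from data; the article's result is that, compared with feedback on the true state, any such history-based policy incurs an increased average cost, which the L-Extra-Sampled-dynamics framework quantifies.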
Pages: 6397-6404
Number of pages: 8