Reinforcement Learning for Partially Observable Linear Gaussian Systems Using Batch Dynamics of Noisy Observations

Cited by: 2
Authors
Yaghmaie, Farnaz Adib [1 ]
Modares, Hamidreza [2 ]
Gustafsson, Fredrik [1 ]
Affiliations
[1] Linkoping Univ, Fac Elect Engn, S-58183 Linkoping, Sweden
[2] Michigan State Univ, Coll Engn, E Lansing, MI 48824 USA
Funding
Swedish Research Council; US National Science Foundation;
Keywords
Costs; History; Noise; Dynamical systems; Noise measurement; Heuristic algorithms; Data models; Linear quadratic Gaussian; partially observable dynamical systems; reinforcement learning;
DOI
10.1109/TAC.2024.3385680
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Reinforcement learning algorithms are commonly used to control dynamical systems with measurable state variables. If the dynamical system is partially observable, reinforcement learning algorithms are modified to compensate for the effect of partial observability. One common approach is to feed a finite history of input-output data instead of the state variable. In this article, we study and quantify the effect of this approach in linear Gaussian systems with quadratic costs. We coin the concept of L-Extra-Sampled-dynamics to formalize the idea of using a finite history of input-output data instead of state and show that this approach increases the average cost.
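The "finite history of input-output data" idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a scalar linear Gaussian system with hypothetical parameters `a`, `b`, `c`, a window length `L`, and random exploratory inputs, none of which come from the article. It only shows how the last `L` noisy outputs and inputs might be stacked into a vector that is fed to a learner in place of the unmeasured state; it is not the article's L-Extra-Sampled-dynamics construction.

```python
import random
from collections import deque


def simulate_history_features(a=0.9, b=0.5, c=1.0, L=3, steps=20,
                              process_std=0.1, obs_std=0.1, seed=0):
    """Simulate x[k+1] = a*x[k] + b*u[k] + w[k], y[k] = c*x[k] + v[k]
    and return the stacked last-L outputs and inputs at each step.

    All parameter names and default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = 0.0                      # hidden state, never shown to the learner
    ys = deque(maxlen=L)         # sliding window of the last L noisy outputs
    us = deque(maxlen=L)         # sliding window of the last L inputs
    features = []
    for _ in range(steps):
        u = rng.gauss(0.0, 1.0)                 # exploratory input (assumption)
        y = c * x + rng.gauss(0.0, obs_std)     # noisy measurement of the state
        ys.append(y)
        us.append(u)
        if len(ys) == L:
            # History vector [y_{k-L+1..k}, u_{k-L+1..k}] used instead of x
            features.append(list(ys) + list(us))
        x = a * x + b * u + rng.gauss(0.0, process_std)   # state update
    return features


if __name__ == "__main__":
    feats = simulate_history_features()
    print(len(feats), "history vectors of length", len(feats[0]))
```

Each history vector has length 2L, and consecutive vectors overlap by L - 1 outputs and L - 1 inputs, so the learner sees a sliding window of data rather than the state itself.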
Pages: 6397 - 6404
Number of pages: 8
Related papers
50 in total
  • [1] Reinforcement Learning of Chaotic Systems Control in Partially Observable Environments
    Weissenbacher, Max
    Borovykh, Anastasia
    Rigas, Georgios
    FLOW TURBULENCE AND COMBUSTION, 2025,
  • [2] Deep Reinforcement Learning for Partially Observable Data Poisoning Attack in Crowdsensing Systems
    Li, Mohan
    Sun, Yanbin
    Lu, Hui
    Maharjan, Sabita
    Tian, Zhihong
    IEEE INTERNET OF THINGS JOURNAL, 2020, 7 (07): : 6266 - 6278
  • [3] Modeling and reinforcement learning in partially observable many-agent systems
    He, Keyang
    Doshi, Prashant
    Banerjee, Bikramjit
    AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2024, 38 (01)
  • [4] Learning reward machines: A study in partially observable reinforcement learning 
    Icarte, Rodrigo Toro
    Klassen, Toryn Q.
    Valenzano, Richard
    Castro, Margarita P.
    Waldie, Ethan
    Mcilraith, Sheila A.
    ARTIFICIAL INTELLIGENCE, 2023, 323
  • [5] Partially Observable Reinforcement Learning for Sustainable Active Surveillance
    Chen, Hechang
    Yang, Bo
    Liu, Jiming
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2018, PT II, 2018, 11062 : 425 - 437
  • [6] Selective Decentralization to Improve Reinforcement Learning in Unknown Linear Noisy Systems
    Thanh Nguyen
    Mukhopadhyay, Snehasis
    2017 21ST ASIA PACIFIC SYMPOSIUM ON INTELLIGENT AND EVOLUTIONARY SYSTEMS (IES), 2017, : 77 - 82
  • [7] Gaussian processes meet NeuralODEs: a Bayesian framework for learning the dynamics of partially observed systems from scarce and noisy data
    Bhouri, Mohamed Aziz
    Perdikaris, Paris
    PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, 2022, 380 (2229):
  • [8] Bayesian Nonparametric Methods for Partially-Observable Reinforcement Learning
    Doshi-Velez, Finale
    Pfau, David
    Wood, Frank
    Roy, Nicholas
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2015, 37 (02) : 394 - 407
  • [9] PALO bounds for reinforcement learning in partially observable stochastic games
    Ceren, Roi
    He, Keyang
    Doshi, Prashant
    Banerjee, Bikramjit
    NEUROCOMPUTING, 2021, 420 : 36 - 56
  • [10] Abstraction in Model Based Partially Observable Reinforcement Learning using Extended Sequence Trees
    Cilden, Erkin
    Polat, Faruk
    2012 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY (WI-IAT 2012), VOL 2, 2012, : 348 - 355