Reinforcement Learning for Partially Observable Linear Gaussian Systems Using Batch Dynamics of Noisy Observations

Cited by: 2

Authors:

Yaghmaie, Farnaz Adib [1]

Modares, Hamidreza [2]

Gustafsson, Fredrik [1]

Affiliations:

[1] Linkoping Univ, Fac Elect Engn, S-58183 Linkoping, Sweden

[2] Michigan State Univ, Coll Engn, E Lansing, MI 48824 USA

Source:

IEEE TRANSACTIONS ON AUTOMATIC CONTROL | 2024 / Vol. 69 / No. 09

Funding:

Swedish Research Council; U.S. National Science Foundation;

Keywords:

Costs; History; Noise; Dynamical systems; Noise measurement; Heuristic algorithms; Data models; Linear quadratic Gaussian; partially observable dynamical systems; reinforcement learning;

DOI:

10.1109/TAC.2024.3385680

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Reinforcement learning algorithms are commonly used to control dynamical systems with measurable state variables. If the dynamical system is partially observable, reinforcement learning algorithms are modified to compensate for the effect of partial observability. One common approach is to feed a finite history of input-output data instead of the state variable. In this article, we study and quantify the effect of this approach in linear Gaussian systems with quadratic costs. We coin the concept of L-Extra-Sampled-dynamics to formalize the idea of using a finite history of input-output data instead of state and show that this approach increases the average cost.
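The idea of feeding a finite history of input-output data in place of the state can be illustrated with a minimal sketch. It is not the article's L-Extra-Sampled-dynamics construction: the scalar system, its parameter values, the random exploratory input, and the function name `simulate_history` are all assumptions made for illustration. The sketch simulates a noisy linear Gaussian system and, at each step, stacks the last L outputs and the last L inputs into one vector that a learning algorithm could use instead of the unmeasured state.

```python
import random
from collections import deque


def simulate_history(L, steps, a=0.9, b=1.0, c=1.0,
                     w_std=0.1, v_std=0.1, seed=0):
    """Simulate a hypothetical scalar linear Gaussian system

        x[k+1] = a*x[k] + b*u[k] + w[k],   y[k] = c*x[k] + v[k]

    and return, for every step with a full window, the vector
    [y[k-L+1], ..., y[k], u[k-L+1], ..., u[k]].
    """
    rng = random.Random(seed)
    x = 0.0                      # true state, never exposed to the learner
    ys = deque(maxlen=L)         # sliding window of the last L outputs
    us = deque(maxlen=L)         # sliding window of the last L inputs
    features = []
    for _ in range(steps):
        y = c * x + rng.gauss(0.0, v_std)   # noisy observation
        u = rng.gauss(0.0, 1.0)             # exploratory input (assumed)
        ys.append(y)
        us.append(u)
        if len(ys) == L:
            # Finite input-output history used in place of the state.
            features.append(list(ys) + list(us))
        x = a * x + b * u + rng.gauss(0.0, w_std)  # state update with process noise
    return features


features = simulate_history(L=3, steps=10)
print(len(features), len(features[0]))
```

Each returned vector has length 2L, and consecutive vectors overlap in all but their newest output and input, which is what makes the history behave like a sliding window over the data.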


Pages: 6397-6404

Number of pages: 8

50 items in total

[1] Reinforcement Learning of Chaotic Systems Control in Partially Observable Environments
Weissenbacher, Max
Borovykh, Anastasia
Rigas, Georgios
FLOW TURBULENCE AND COMBUSTION, 2025
[2] Deep Reinforcement Learning for Partially Observable Data Poisoning Attack in Crowdsensing Systems
Li, Mohan
Sun, Yanbin
Lu, Hui
Maharjan, Sabita
Tian, Zhihong
IEEE INTERNET OF THINGS JOURNAL, 2020, 7 (07): 6266-6278
[3] Modeling and reinforcement learning in partially observable many-agent systems
He, Keyang
Doshi, Prashant
Banerjee, Bikramjit
AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2024, 38 (01)
[4] Learning reward machines: A study in partially observable reinforcement learning
Icarte, Rodrigo Toro
Klassen, Toryn Q.
Valenzano, Richard
Castro, Margarita P.
Waldie, Ethan
Mcilraith, Sheila A.
ARTIFICIAL INTELLIGENCE, 2023, 323
[5] Partially Observable Reinforcement Learning for Sustainable Active Surveillance
Chen, Hechang
Yang, Bo
Liu, Jiming
KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2018, PT II, 2018, 11062: 425-437
[6] Selective Decentralization to Improve Reinforcement Learning in Unknown Linear Noisy Systems
Thanh Nguyen
Mukhopadhyay, Snehasis
2017 21ST ASIA PACIFIC SYMPOSIUM ON INTELLIGENT AND EVOLUTIONARY SYSTEMS (IES), 2017: 77-82
[7] Gaussian processes meet NeuralODEs: a Bayesian framework for learning the dynamics of partially observed systems from scarce and noisy data
Bhouri, Mohamed Aziz
Perdikaris, Paris
PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, 2022, 380 (2229)
[8] Bayesian Nonparametric Methods for Partially-Observable Reinforcement Learning
Doshi-Velez, Finale
Pfau, David
Wood, Frank
Roy, Nicholas
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2015, 37 (02): 394-407
[9] PALO bounds for reinforcement learning in partially observable stochastic games
Ceren, Roi
He, Keyang
Doshi, Prashant
Banerjee, Bikramjit
NEUROCOMPUTING, 2021, 420: 36-56
[10] Abstraction in Model Based Partially Observable Reinforcement Learning using Extended Sequence Trees
Cilden, Erkin
Polat, Faruk
2012 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY (WI-IAT 2012), VOL 2, 2012: 348-355
