Unconstrained feedback controller design using Q-learning from noisy data

Cited by: 0

Authors:

Kumar, Pratyush [1]

Rawlings, James B. [1]

Affiliations:

[1] Univ Calif Santa Barbara, Dept Chem Engn, Santa Barbara, CA 93106 USA

Source:

COMPUTERS & CHEMICAL ENGINEERING | 2023 / Volume 177

Keywords:

Reinforcement learning; Q-learning; Least squares policy iteration; System identification; Maximum likelihood estimation; Linear quadratic regulator; MODEL-PREDICTIVE CONTROL; REINFORCEMENT; STABILITY; MPC;

DOI:

10.1016/j.compchemeng.2023.108325

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

This paper develops a novel model-free Q-learning-based approach to estimate linear, unconstrained feedback controllers from noisy process data. The proposed method is based on an extension of an available approach developed to estimate the linear quadratic regulator (LQR) for linear systems with full state measurements driven by Gaussian process noise of known covariance. First, we modify the approach to treat the case of an unknown noise covariance. Then, we use the modified approach to estimate a feedback controller for linear systems with both process and measurement noise and only output measurements. We also present a model-based maximum likelihood estimation (MLE) approach to determine a linear dynamic model and noise covariances from data, which is used to construct a regulator and state estimator for comparisons in simulation studies. The performances of the model-free and model-based controller estimation approaches are compared using an example heating, ventilation, and air-conditioning (HVAC) system. We show that the proposed Q-learning approach estimates a reasonably accurate feedback controller from 24 h of noisy data. The controllers estimated using the model-free and model-based approaches provide similar closed-loop performances, with 3.5% and 2.7% losses, respectively, compared to a perfect controller that uses the true dynamic model and noise covariances of the HVAC system. Finally, we give future work directions for the model-free controller design approaches by discussing some remaining advantages of the model-based approaches.

Pages: 13

50 items in total

[41] On Improving the Properties of Random Walk on Graph using Q-learning
Matsuo, Ryotaro
Miyashita, Tomoyuki
Suzuki, Taisei
Ohsaki, Hiroyuki
IEICE COMMUNICATIONS EXPRESS, 2023, 12 (01): : 36 - 41
[42] Multi Target Tracking using a Compact Q-Learning with a Teacher
Saad, E. M.
Awadalla, M. H.
Hamdy, A. M.
Ali, H. I.
ICCES: 2008 INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING & SYSTEMS, 2007, : 173 - 178
[43] Experimental Results of a Disturbance Compensating Q-learning Controller for HVAC Systems
Rizvi, Syed Ali Asad
Pertzborn, Amanda J.
2022 AMERICAN CONTROL CONFERENCE, ACC, 2022, : 3353 - 3353
[44] Design of Transmission Lengths for IR-HARQ Scheme Using Q-Learning
Mueadkhunthod, Krittiyaporn
Phakphisut, Watid
2022 37TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC 2022), 2022, : 880 - 883
[45] Q-learning with recurrent neural networks as a controller for the inverted pendulum problem
Onat, A
Kita, H
Nishikawa, Y
ICONIP'98: THE FIFTH INTERNATIONAL CONFERENCE ON NEURAL INFORMATION PROCESSING JOINTLY WITH JNNS'98: THE 1998 ANNUAL CONFERENCE OF THE JAPANESE NEURAL NETWORK SOCIETY - PROCEEDINGS, VOLS 1-3, 1998, : 837 - 840
[46] Learning to Navigate in 3D Virtual Environment Using Q-Learning
Sani, Nurulhidayati Haji Mohd
Phon-Amnuaisuk, Somnuk
Au, Thien Wan
Tan, Ee Leng
COMPUTATIONAL INTELLIGENCE IN INFORMATION SYSTEMS (CIIS 2018), 2019, 888 : 191 - 202
[47] An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems
Alsalti, Mohammad
Lopez, Victor G.
Mueller, Matthias A.
6TH ANNUAL LEARNING FOR DYNAMICS & CONTROL CONFERENCE, 2024, 242 : 312 - 323
[48] Application of self-improving Q-learning controller for a class of dynamical processes: Implementation aspects
Musial, Jakub
Stebel, Krzysztof
Czeczot, Jacek
Nowak, Pawel
Gabrys, Bogdan
APPLIED SOFT COMPUTING, 2024, 152
[49] Q-Learning approach for minutiae extraction from fingerprint image
Tiwari, Sandeep
Sharma, Neha
2ND INTERNATIONAL CONFERENCE ON COMMUNICATION, COMPUTING & SECURITY [ICCCS-2012], 2012, 1 : 82 - 89
[50] Intelligent transportation system using Q-learning
Park, MS
Kim, PJ
Choi, JY
2003 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOLS 1-5, CONFERENCE PROCEEDINGS, 2003, : 4684 - 4687