Unconstrained feedback controller design using Q-learning from noisy data

Cited by: 0
Authors
Kumar, Pratyush [1 ]
Rawlings, James B. [1 ]
Affiliations
[1] Univ Calif Santa Barbara, Dept Chem Engn, Santa Barbara, CA 93106 USA
Keywords
Reinforcement learning; Q-learning; Least squares policy iteration; System identification; Maximum likelihood estimation; Linear quadratic regulator; MODEL-PREDICTIVE CONTROL; REINFORCEMENT; STABILITY; MPC;
DOI
10.1016/j.compchemeng.2023.108325
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
This paper develops a novel model-free, Q-learning-based approach to estimate linear, unconstrained feedback controllers from noisy process data. The proposed method extends an available approach developed to estimate the linear quadratic regulator (LQR) for linear systems with full state measurements driven by Gaussian process noise of known covariance. First, we modify the approach to treat the case of an unknown noise covariance. Then, we use the modified approach to estimate a feedback controller for linear systems with both process and measurement noise and only output measurements. We also present a model-based maximum likelihood estimation (MLE) approach to determine a linear dynamic model and noise covariances from data, which is used to construct a regulator and state estimator for comparisons in simulation studies. The performances of the model-free and model-based controller estimation approaches are compared on an example heating, ventilation, and air-conditioning (HVAC) system. We show that the proposed Q-learning approach estimates a reasonably accurate feedback controller from 24 h of noisy data. The controllers estimated using the model-free and model-based approaches provide similar closed-loop performances, with losses of 3.5% and 2.7%, respectively, relative to a perfect controller that uses the true dynamic model and noise covariances of the HVAC system. Finally, we give future work directions for the model-free controller design approaches by discussing some remaining advantages of the model-based approaches.
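For readers unfamiliar with least squares policy iteration for the LQR, the following is a minimal sketch of the basic idea, not the method of the paper. It assumes full state measurements and a noise-free linear system, so it omits the noise covariance and output-feedback treatment that the abstract describes. The function names, the example matrices `A`, `B`, `Qc`, `Rc`, and the sampling scheme are all hypothetical choices made for illustration; the matrices `A` and `B` are used only to simulate data and are never passed to the learning step.

```python
import numpy as np


def quad_features(Z):
    """Features phi(z) such that phi(z) @ theta == z.T @ H @ z for symmetric H,
    where theta holds the upper-triangular entries of H."""
    rows, cols = np.triu_indices(Z.shape[1])
    scale = np.where(rows == cols, 1.0, 2.0)  # off-diagonal terms appear twice
    return Z[:, rows] * Z[:, cols] * scale


def estimate_q_matrix(X, U, X_next, K, Qc, Rc):
    """Policy evaluation: least-squares fit of H in Q_K(x, u) = [x; u]^T H [x; u]
    for the policy u = -K x, using the Bellman equation
    Q_K(x, u) = stage cost + Q_K(x_next, -K x_next)."""
    Z = np.hstack([X, U])
    Z_next = np.hstack([X_next, -X_next @ K.T])
    cost = np.einsum("ti,ij,tj->t", X, Qc, X) + np.einsum("ti,ij,tj->t", U, Rc, U)
    theta, *_ = np.linalg.lstsq(
        quad_features(Z) - quad_features(Z_next), cost, rcond=None
    )
    nz = Z.shape[1]
    rows, cols = np.triu_indices(nz)
    H = np.zeros((nz, nz))
    H[rows, cols] = theta
    H[cols, rows] = theta
    return H


def improve_policy(H, n):
    """Policy improvement: minimize the quadratic Q-function over u."""
    return np.linalg.solve(H[n:, n:], H[n:, :n])


def lspi_lqr(X, U, X_next, Qc, Rc, K0, n_iter=10):
    """Alternate policy evaluation and improvement on a fixed data set."""
    K = K0
    for _ in range(n_iter):
        H = estimate_q_matrix(X, U, X_next, K, Qc, Rc)
        K = improve_policy(H, X.shape[1])
    return K


# Hypothetical example system (assumption; used only to generate data).
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Qc = np.eye(2)
Rc = np.eye(1)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))   # sampled states
U = rng.standard_normal((200, 1))   # exploratory inputs
X_next = X @ A.T + U @ B.T          # noise-free one-step transitions

K0 = np.zeros((1, 2))               # stabilizing here because A is stable
K_hat = lspi_lqr(X, U, X_next, Qc, Rc, K0)
print("Estimated feedback gain K:", K_hat)
```

In this noise-free setting the estimated gain can be checked against the gain obtained from the discrete-time Riccati equation. With process noise, the plain least-squares fit above is no longer sufficient, which is the situation the paper addresses.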
Pages: 13
Related papers
50 in total
  • [41] On Improving the Properties of Random Walk on Graph using Q-learning
    Matsuo, Ryotaro
    Miyashita, Tomoyuki
    Suzuki, Taisei
    Ohsaki, Hiroyuki
    IEICE COMMUNICATIONS EXPRESS, 2023, 12 (01): : 36 - 41
  • [42] Multi Target Tracking using a Compact Q-Learning with a Teacher
    Saad, E. M.
    Awadalla, M. H.
    Hamdy, A. M.
    Ali, H. I.
    ICCES: 2008 INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING & SYSTEMS, 2007, : 173 - 178
  • [43] Experimental Results of a Disturbance Compensating Q-learning Controller for HVAC Systems
    Rizvi, Syed Ali Asad
    Pertzborn, Amanda J.
    2022 AMERICAN CONTROL CONFERENCE, ACC, 2022, : 3353 - 3353
  • [44] Design of Transmission Lengths for IR-HARQ Scheme Using Q-Learning
    Mueadkhunthod, Krittiyaporn
    Phakphisut, Watid
    2022 37TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC 2022), 2022, : 880 - 883
  • [45] Q-learning with recurrent neural networks as a controller for the inverted pendulum problem
    Onat, A
    Kita, H
    Nishikawa, Y
    ICONIP'98: THE FIFTH INTERNATIONAL CONFERENCE ON NEURAL INFORMATION PROCESSING JOINTLY WITH JNNS'98: THE 1998 ANNUAL CONFERENCE OF THE JAPANESE NEURAL NETWORK SOCIETY - PROCEEDINGS, VOLS 1-3, 1998, : 837 - 840
  • [46] Learning to Navigate in 3D Virtual Environment Using Q-Learning
    Sani, Nurulhidayati Haji Mohd
    Phon-Amnuaisuk, Somnuk
    Au, Thien Wan
    Tan, Ee Leng
    COMPUTATIONAL INTELLIGENCE IN INFORMATION SYSTEMS (CIIS 2018), 2019, 888 : 191 - 202
  • [47] An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems
    Alsalti, Mohammad
    Lopez, Victor G.
    Mueller, Matthias A.
    6TH ANNUAL LEARNING FOR DYNAMICS & CONTROL CONFERENCE, 2024, 242 : 312 - 323
  • [48] Application of self-improving Q-learning controller for a class of dynamical processes: Implementation aspects
    Musial, Jakub
    Stebel, Krzysztof
    Czeczot, Jacek
    Nowak, Pawel
    Gabrys, Bogdan
    APPLIED SOFT COMPUTING, 2024, 152
  • [49] Q-Learning approach for minutiae extraction from fingerprint image
    Tiwari, Sandeep
    Sharma, Neha
    2ND INTERNATIONAL CONFERENCE ON COMMUNICATION, COMPUTING & SECURITY [ICCCS-2012], 2012, 1 : 82 - 89
  • [50] Intelligent transportation system using Q-learning
    Park, MS
    Kim, PJ
    Choi, JY
    2003 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOLS 1-5, CONFERENCE PROCEEDINGS, 2003, : 4684 - 4687