Unconstrained feedback controller design using Q-learning from noisy data

Authors
Kumar, Pratyush [1]
Rawlings, James B. [1]
Affiliations
[1] Univ Calif Santa Barbara, Dept Chem Engn, Santa Barbara, CA 93106 USA
Keywords
Reinforcement learning; Q-learning; Least squares policy iteration; System identification; Maximum likelihood estimation; Linear quadratic regulator; Model-predictive control; Reinforcement; Stability; MPC
DOI
10.1016/j.compchemeng.2023.108325
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This paper develops a novel model-free, Q-learning-based approach to estimate linear, unconstrained feedback controllers from noisy process data. The proposed method extends an available approach developed to estimate the linear quadratic regulator (LQR) for linear systems with full state measurements driven by Gaussian process noise of known covariance. First, we modify the approach to treat the case of an unknown noise covariance. Then, we use the modified approach to estimate a feedback controller for linear systems with both process and measurement noise and only output measurements. We also present a model-based maximum likelihood estimation (MLE) approach to determine a linear dynamic model and noise covariances from data, which is used to construct a regulator and state estimator for comparisons in simulation studies. The performances of the model-free and model-based controller estimation approaches are compared on an example heating, ventilation, and air-conditioning (HVAC) system. We show that the proposed Q-learning approach estimates a reasonably accurate feedback controller from 24 h of noisy data. The controllers estimated using the model-free and model-based approaches provide similar closed-loop performances, with 3.5% and 2.7% losses, respectively, compared to a perfect controller that uses the true dynamic model and noise covariances of the HVAC system. Finally, we outline directions for future work on model-free controller design by discussing some remaining advantages of the model-based approaches.
Pages: 13