Unconstrained feedback controller design using Q-learning from noisy data

Cited by: 0

Authors:

Kumar, Pratyush [1]

Rawlings, James B. [1]

Affiliations:

[1] Univ Calif Santa Barbara, Dept Chem Engn, Santa Barbara, CA 93106 USA

Source:

COMPUTERS & CHEMICAL ENGINEERING | 2023 / Vol. 177

Keywords:

Reinforcement learning; Q-learning; Least squares policy iteration; System identification; Maximum likelihood estimation; Linear quadratic regulator; MODEL-PREDICTIVE CONTROL; REINFORCEMENT; STABILITY; MPC;

DOI:

10.1016/j.compchemeng.2023.108325

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Subject classification codes:

081203; 0835

Abstract:

This paper develops a novel model-free, Q-learning-based approach to estimate linear, unconstrained feedback controllers from noisy process data. The proposed method extends an available approach developed to estimate the linear quadratic regulator (LQR) for linear systems with full state measurements driven by Gaussian process noise of known covariance. First, we modify the approach to treat the case of an unknown noise covariance. Then, we use the modified approach to estimate a feedback controller for linear systems with both process and measurement noise and only output measurements. We also present a model-based maximum likelihood estimation (MLE) approach to determine a linear dynamic model and noise covariances from data, which is used to construct a regulator and state estimator for comparisons in simulation studies. The performances of the model-free and model-based controller estimation approaches are compared on an example heating, ventilation, and air-conditioning (HVAC) system. We show that the proposed Q-learning approach estimates a reasonably accurate feedback controller from 24 h of noisy data. The controllers estimated using the model-free and model-based approaches provide similar closed-loop performances, with 3.5% and 2.7% losses, respectively, compared to a perfect controller that uses the true dynamic model and noise covariances of the HVAC system. Finally, we give future work directions for the model-free controller design approaches by discussing some remaining advantages of the model-based approaches.

Pages: 13