Linear Quadratic Control Using Model-Free Reinforcement Learning

Cited by: 23
Authors
Yaghmaie, Farnaz Adib [1]
Gustafsson, Fredrik [1]
Ljung, Lennart [1]
Affiliations
[1] Linkoping Univ, Dept Elect Engn, S-58431 Linkoping, Sweden
Funding
Swedish Research Council;
Keywords
Noise measurement; Costs; Dynamical systems; Adaptation models; Heuristic algorithms; Process control; Optimal control; Linear quadratic (LQ) control; reinforcement learning (RL); SYSTEMS;
DOI
10.1109/TAC.2022.3145632
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
In this article, we consider the linear quadratic (LQ) control problem with process and measurement noise. We analyze the LQ problem in terms of the average cost and the structure of the value function. We assume that the dynamics of the linear system are unknown and only noisy measurements of the state variable are available. Using noisy measurements of the state variable, we propose two model-free iterative algorithms to solve the LQ problem. The proposed algorithms are variants of the policy iteration routine where the policy is greedy with respect to the average of all previous iterations. We rigorously analyze the properties of the proposed algorithms, including stability of the generated controllers and convergence. We analyze the effect of measurement noise on the performance of the proposed algorithms, the classical off-policy, and the classical Q-learning routines. We also investigate a model-building approach, inspired by adaptive control, where a model of the dynamical system is estimated and the optimal control problem is solved assuming that the estimated model is the true model. We use a benchmark to evaluate and compare our proposed algorithms with the classical off-policy, the classical Q-learning, and the policy gradient. We show that our model-building approach performs nearly identically to the analytical solution, and that our proposed policy iteration-based algorithms outperform the classical off-policy and the classical Q-learning algorithms on this benchmark but do not outperform the model-building approach.
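The model-building idea mentioned in the abstract (estimate a model, then solve the control problem as if the estimate were exact) can be illustrated with a minimal scalar sketch. This is not the authors' algorithm or benchmark: the scalar system x[k+1] = a*x[k] + b*u[k] + w[k], the plain least-squares fit, the Riccati iteration, and every name and numeric value below are assumptions chosen for illustration. Note that ordinary least squares on noisy state measurements is generally biased, which the sketch does not correct.

```python
import random


def simulate(a, b, steps, process_std, meas_std, rng):
    """Roll out x[k+1] = a*x[k] + b*u[k] + w[k] with random exploratory inputs.

    Returns noisy measurements y[k] = x[k] + v[k] (steps + 1 values)
    and the applied inputs u[k] (steps values).
    """
    x, ys, us = 0.0, [], []
    for _ in range(steps):
        u = rng.gauss(0.0, 1.0)                    # exploratory input
        ys.append(x + rng.gauss(0.0, meas_std))    # noisy state measurement
        us.append(u)
        x = a * x + b * u + rng.gauss(0.0, process_std)
    ys.append(x + rng.gauss(0.0, meas_std))
    return ys, us


def estimate_model(ys, us):
    """Least-squares fit of y[k+1] ~ a*y[k] + b*u[k] via the 2x2 normal equations."""
    syy = sum(y * y for y in ys[:-1])
    syu = sum(y * u for y, u in zip(ys, us))
    suu = sum(u * u for u in us)
    syn = sum(y * yn for y, yn in zip(ys, ys[1:]))
    sun = sum(u * yn for u, yn in zip(us, ys[1:]))
    det = syy * suu - syu * syu
    a_hat = (syn * suu - syu * sun) / det
    b_hat = (syy * sun - syu * syn) / det
    return a_hat, b_hat


def lq_gain(a, b, q, r, iters=500):
    """Iterate the scalar discrete-time Riccati equation.

    Returns the gain k of the state-feedback law u = -k*x for the
    stage cost q*x^2 + r*u^2, treating (a, b) as the true model.
    """
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical "true" system, unknown to the estimator.
    ys, us = simulate(a=0.9, b=0.5, steps=2000,
                      process_std=0.1, meas_std=0.1, rng=rng)
    a_hat, b_hat = estimate_model(ys, us)
    k = lq_gain(a_hat, b_hat, q=1.0, r=1.0)
    print("estimated model:", a_hat, b_hat)
    print("certainty-equivalent gain:", k)
```

The gain is computed from the estimated pair only, so its quality depends entirely on how well the fit recovers the dynamics from the noisy data.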
Pages: 737-752
Number of pages: 16
Related papers
50 in total
  • [1] Using Reinforcement Learning for Model-free Linear Quadratic Control with Process and Measurement Noises
    Yaghmaie, Farnaz Adib
    Gustafsson, Fredrik
    2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL (CDC), 2019, : 6510 - 6517
  • [2] Secure Linear Quadratic Regulator Using Sparse Model-Free Reinforcement Learning
    Kiumarsi, Bahare
    Basar, Tamer
    2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL (CDC), 2019, : 3641 - 3647
  • [3] Model-free learning control of neutralization processes using reinforcement learning
    Syafiie, S.
    Tadeo, F.
    Martinez, E.
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2007, 20 (06) : 767 - 782
  • [4] Model-Free Quantum Control with Reinforcement Learning
    Sivak, V. V.
    Eickbusch, A.
    Liu, H.
    Royer, B.
    Tsioutsios, I
    Devoret, M. H.
    PHYSICAL REVIEW X, 2022, 12 (01)
  • [5] Model-free Predictive Optimal Iterative Learning Control using Reinforcement Learning
    Zhang, Yueqing
    Chu, Bing
    Shu, Zhan
    2022 AMERICAN CONTROL CONFERENCE, ACC, 2022, : 3279 - 3284
  • [6] Model-Free Adaptive Control Approach Using Integral Reinforcement Learning
    Abouheaf, Mohammed
    Gueaieb, Wail
    2019 IEEE INTERNATIONAL SYMPOSIUM ON ROBOTIC AND SENSORS ENVIRONMENTS (ROSE 2019), 2019, : 84 - 90
  • [7] Model-free LQ Control for Unmanned Helicopters using Reinforcement Learning
    Lee, Dong Jin
    Bang, Hyochoong
    2011 11TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS), 2011, : 117 - 120
  • [8] DATA-DRIVEN MODEL-FREE ITERATIVE LEARNING CONTROL USING REINFORCEMENT LEARNING
    Song, Bing
    Phan, Minh Q.
    Longman, Richard W.
    ASTRODYNAMICS 2018, PTS I-IV, 2019, 167 : 2579 - 2597
  • [9] Model-free MIMO control tuning of a chiller process using reinforcement learning
    Rosdahl, Christian
    Bernhardsson, B. O.
    Eisenhower, Bryan
    SCIENCE AND TECHNOLOGY FOR THE BUILT ENVIRONMENT, 2023, 29 (08) : 782 - 794
  • [10] Model-free Data-driven Predictive Control Using Reinforcement Learning
    Sawant, Shambhuraj
    Reinhardt, Dirk
    Kordabad, Arash Bahari
    Gros, Sebastien
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 4046 - 4052