A reinforcement learning-based scheme for direct adaptive optimal control of linear stochastic systems

Cited by: 14
Authors
Wong, Wee Chin [1]
Lee, Jay H. [1]
Affiliations
[1] Georgia Inst Technol, Sch Chem & Biomol Engn, Atlanta, GA 30332 USA
Keywords
reinforcement learning; linear systems; stochastic adaptive optimal control; identification
DOI
10.1002/oca.915
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Reinforcement learning, in which decision-making agents learn optimal policies through interaction with their environment, is an attractive paradigm for model-free, adaptive controller design. However, results for systems with continuous state and action variables are rare. In this paper, we present convergence results for optimal linear quadratic control of discrete-time linear stochastic systems. This work can be viewed as a generalization of previous work on deterministic linear systems. Key differences between the algorithms for deterministic and stochastic systems are highlighted through examples. The usefulness of the algorithm is demonstrated through a nonlinear chemostat bioreactor case study. Copyright (C) 2009 John Wiley & Sons, Ltd.
Pages: 365-374
Page count: 10
Related papers
19 in total
  • [1] Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control
    Al-Tamimi, Asma
    Lewis, Frank L.
    Abu-Khalaf, Murad
    [J]. AUTOMATICA, 2007, 43 (03) : 473 - 481
  • [2] [Anonymous], 2000, Dynamic programming and optimal control
  • [3] Bellman R. E., 1957, Dynamic programming. Princeton landmarks in mathematics
  • [4] BRADKTE S, 1994, THESIS U MASSACHUSET
  • [5] BRADTKE SJ, 1994, PROCEEDINGS OF THE 1994 AMERICAN CONTROL CONFERENCE, VOLS 1-3, P3475
  • [6] Christopher John Cornish Hellaby Watkins, 1989, Learning from Delayed Rewards
  • [7] Goodwin G. C., 1984, Adaptive filtering prediction and control
  • [8] Effect of process nonlinearity on linear quadratic regulator performance
    Guay, M
    Dier, R
    Hahn, J
    McLellan, PJ
    [J]. JOURNAL OF PROCESS CONTROL, 2005, 15 (01) : 113 - 124
  • [9] HAGEN S, 2001, THESIS U AMSTERDAM
  • [10] Adaptive critic methods for stochastic systems with input-dependent noise
    Herzallah, Randa
    [J]. AUTOMATICA, 2007, 43 (08) : 1355 - 1362