A Q-learning predictive control scheme with guaranteed stability

Cited: 9
Authors
Beckenbach, Lukas [1 ]
Osinenko, Pavel [1 ]
Streif, Stefan [1 ]
Affiliations
[1] Tech Univ Chemnitz, Automat Control & Syst Dynam Lab, D-09107 Chemnitz, Germany
Keywords
Predictive control; Q-Learning; Cost shaping; Nominal stability; RECEDING-HORIZON CONTROL; DISCRETE-TIME-SYSTEMS; NONLINEAR-SYSTEMS; FINITE; PERFORMANCE; MPC; STATE
DOI
10.1016/j.ejcon.2020.03.001
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
Model-based predictive controllers are used to tackle control tasks in which constraints on the state, the input, or both must be satisfied. These controllers commonly optimize a fixed finite-horizon cost that relates to an infinite-horizon (IH) cost profile, whereas the resulting closed loop under the predictive controller yields an IH cost that is, in general, suboptimal. To capture the optimal IH cost and the associated control policy, reinforcement learning methods such as Q-learning, which approximate said cost via a parametric architecture, can be employed. In contrast to predictive control, however, closed-loop stability under the controller associated with such an approximation has rarely been investigated in explicit dependence on the approximation parameters. The aim of this work is to incorporate model-based Q-learning into a predictive control setup so as to guarantee closed-loop stability during online learning, while eventually improving the performance of finite-horizon controllers. The proposed scheme provides nominal asymptotic stability, and it was observed that the suggested learning approach can indeed improve performance over a baseline predictive controller. (c) 2020 European Control Association. Published by Elsevier Ltd. All rights reserved.
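As a rough illustration of the idea sketched in the abstract, and not the authors' exact scheme, the following minimal Python sketch pairs a finite-horizon predictive controller with a parametric quadratic Q-function that serves as the terminal cost and is adapted online by a temporal-difference step along the closed-loop trajectory (SARSA-style, standing in for the full Q-learning minimization). The double-integrator model, horizon, learning rate, and all function names are illustrative assumptions; the stability-enforcing constraints and cost shaping of the actual scheme are omitted.

    # Minimal illustrative sketch (assumed setup, not the paper's scheme):
    # finite-horizon predictive control with a learned quadratic terminal
    # cost Q_theta(x, u) = [x; u]^T W(theta) [x; u].
    import numpy as np
    from scipy.optimize import minimize

    A = np.array([[1.0, 0.1], [0.0, 1.0]])   # assumed double-integrator model
    B = np.array([[0.005], [0.1]])
    Qs, Rs = np.eye(2), 0.1 * np.eye(1)      # stage cost weights
    N = 5                                    # prediction horizon

    def stage_cost(x, u):
        return float(x @ Qs @ x + u @ Rs @ u)

    def q_value(theta, x, u):
        z = np.concatenate([x, u])
        W = theta.reshape(3, 3)
        return float(z @ (0.5 * (W + W.T)) @ z)  # symmetrized quadratic form

    def predicted_cost(u_seq, x0, theta):
        # finite-horizon stage costs plus the learned terminal Q-term
        u_seq, x, J = u_seq.reshape(N, 1), x0.copy(), 0.0
        for k in range(N - 1):
            J += stage_cost(x, u_seq[k])
            x = A @ x + B @ u_seq[k]
        return J + q_value(theta, x, u_seq[-1])

    def mpc_action(x0, theta):
        res = minimize(predicted_cost, np.zeros(N), args=(x0, theta))
        return res.x.reshape(N, 1)[0]        # apply the first input only

    def td_update(theta, x, u, x_next, u_next, lr=1e-3):
        # one-step temporal-difference target: stage cost + successor Q
        target = stage_cost(x, u) + q_value(theta, x_next, u_next)
        z = np.concatenate([x, u])
        grad = np.outer(z, z).ravel()        # gradient of q_value w.r.t. theta
        return theta - lr * (q_value(theta, x, u) - target) * grad

    theta = np.eye(3).ravel()                # initial weight guess
    x = np.array([2.0, 0.0])
    for t in range(30):                      # act, observe, adapt online
        u = mpc_action(x, theta)
        x_next = A @ x + B @ u
        theta = td_update(theta, x, u, x_next, mpc_action(x_next, theta))
        x = x_next
    print("final state:", x)

The point of the construction is the interplay the abstract describes: the predictive controller optimizes over a short horizon, while the learned Q-term stands in for the remaining infinite-horizon cost and is refined from closed-loop data.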
Pages: 167 - 178
Number of pages: 12
Related papers
50 records in total
  • [21] Learning rates for Q-learning
    Even-Dar, E
    Mansour, Y
    JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 5 : 1 - 25
  • [22] A direct adaptive generalized predictive control algorithm with guaranteed stability
    WANG, W
    INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 1994, 8 (03) : 211 - 222
  • [23] Congestion Control in Charging Stations Allocation with Q-Learning
    Zhang, Li
    Gong, Ke
    Xu, Maozeng
    SUSTAINABILITY, 2019, 11 (14)
  • [24] Hierarchical model predictive control strategy based on Q-Learning algorithm for hybrid electric vehicle platoon
    Yin, Yanli
    Huang, Xuejiang
    Zhan, Sen
    Zhang, Xinxin
    Wang, Fuzhen
    PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART D-JOURNAL OF AUTOMOBILE ENGINEERING, 2024, 238 (2-3) : 385 - 402
  • [25] Ramp Metering Control Based on the Q-Learning Algorithm
    Ivanjko, Edouard
    Necoska, Daniela Koltovska
    Greguric, Martin
    Vujic, Miroslav
    Jurkovic, Goran
    Mandzuka, Sadko
    CYBERNETICS AND INFORMATION TECHNOLOGIES, 2015, 15 (05) : 88 - 97
  • [26] On-policy Q-learning for Adaptive Optimal Control
    Jha, Sumit Kumar
    Bhasin, Shubhendu
    2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2014: 301 - 306
  • [28] Input-Decoupled Q-Learning for Optimal Control
    Phan, Minh Q.
    Azad, Seyed Mahdi B.
    JOURNAL OF THE ASTRONAUTICAL SCIENCES, 2020, 67 (02) : 630 - 656
  • [29] Switching control of morphing aircraft based on Q-learning
    Gong, Ligang
    Wang, Qing
    Hu, Changhua
    Liu, Chen
    CHINESE JOURNAL OF AERONAUTICS, 2020, 33 (02) : 672 - 687
  • [30] Balance Control of Robot With CMAC Based Q-learning
    Li, Ming-ai
    Jiao, Li-fang
    Qiao, Jun-fei
    Ruan, Xiao-gang
    2008 CHINESE CONTROL AND DECISION CONFERENCE, VOLS 1-11, 2008, : 2668 - 2672