A Q-learning predictive control scheme with guaranteed stability

Cited: 9
Authors
Beckenbach, Lukas [1 ]
Osinenko, Pavel [1 ]
Streif, Stefan [1 ]
Affiliations
[1] Tech Univ Chemnitz, Automat Control & Syst Dynam Lab, D-09107 Chemnitz, Germany
Keywords
Predictive control; Q-Learning; Cost shaping; Nominal stability; RECEDING-HORIZON CONTROL; DISCRETE-TIME-SYSTEMS; NONLINEAR-SYSTEMS; FINITE; PERFORMANCE; MPC; STATE
DOI
10.1016/j.ejcon.2020.03.001
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Model-based predictive controllers are used to tackle control tasks in which constraints on the state, the input, or both must be satisfied. These controllers commonly optimize a fixed finite-horizon cost, which relates to an infinite-horizon (IH) cost profile, while the resulting closed loop under the predictive controller yields an in general suboptimal IH cost. To capture the optimal IH cost and the associated control policy, reinforcement learning methods such as Q-learning, which approximate said cost via a parametric architecture, can be employed. In contrast to predictive control, however, closed-loop stability under the controller associated with the approximation has rarely been investigated in explicit dependence on these parameters. The aim of this work is to incorporate model-based Q-learning into a predictive control setup so as to provide closed-loop stability during online learning, while eventually improving the performance of finite-horizon controllers. The proposed scheme provides nominal asymptotic stability, and it was observed that the suggested learning approach can in fact improve performance relative to a baseline predictive controller. (c) 2020 European Control Association. Published by Elsevier Ltd. All rights reserved.
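The scheme itself is developed in the paper; as a rough illustration of the general idea only, the sketch below combines a one-step predictive controller with a Q-learning estimate of the infinite-horizon tail cost on a toy linear-quadratic problem. Every concrete quantity here (the matrices A and B, the weights Qc and Rc, the update schedule, and all function names) is an assumption made for this example, and the stability constraints on the learned cost that are central to the paper are omitted.

```python
# Illustrative sketch only (not the authors' algorithm): a one-step
# predictive controller whose terminal cost is a Q-learning estimate of
# the infinite-horizon tail, on a toy linear-quadratic problem.
import numpy as np

n, m = 2, 1                                  # state / input dimensions
A = np.array([[1.0, 0.1], [0.0, 1.0]])       # assumed model x+ = A x + B u
B = np.array([[0.0], [0.1]])
Qc, Rc = np.eye(n), 0.1 * np.eye(m)          # assumed stage-cost weights

def stage_cost(x, u):
    return float(x @ Qc @ x + u @ Rc @ u)

# Parametric Q-function approximation: Q(x, u) = z^T P z with z = [x; u].
P = np.eye(n + m)

def value_matrix(P):
    # V(x) = min_u Q(x, u) = x^T M x; M is the Schur complement of the
    # uu-block of P (closed form for quadratic Q-functions).
    Pxx, Pxu, Puu = P[:n, :n], P[:n, n:], P[n:, n:]
    return Pxx - Pxu @ np.linalg.solve(Puu, Pxu.T)

def predictive_input(x, P):
    # One-step predictive control with the learned value as terminal cost:
    # u* = argmin_u  stage_cost(x, u) + V(A x + B u), in closed form.
    # The paper's stability-certifying constraints are omitted here.
    M = value_matrix(P)
    return -np.linalg.solve(Rc + B.T @ M @ B, B.T @ M @ A @ x)

def features(x, u):
    # Quadratic monomials of z = [x; u]; Q(x, u) = features(x, u) @ P.ravel().
    z = np.concatenate([x, u])
    return np.outer(z, z).ravel()

rng = np.random.default_rng(0)
x = np.array([1.0, 0.0])
Phi, y = [], []
for k in range(200):
    u = predictive_input(x, P) + 0.1 * rng.standard_normal(m)  # exploration
    xn = A @ x + B @ u
    # Bellman target: stage cost plus current estimate of the tail cost.
    y.append(stage_cost(x, u) + float(xn @ value_matrix(P) @ xn))
    Phi.append(features(x, u))
    if (k + 1) % 50 == 0:
        # Batch least-squares refit of P (fitted-Q-iteration flavor).
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        P = 0.5 * (theta.reshape(n + m, n + m)
                   + theta.reshape(n + m, n + m).T)  # symmetrize
        P += 1e-6 * np.eye(n + m)                    # keep Puu invertible
        Phi, y = [], []
    x = xn

print("learned Q-matrix:\n", np.round(P, 3))
```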
Pages: 167-178
Number of pages: 12