A Q-learning predictive control scheme with guaranteed stability

Cited: 9
Authors
Beckenbach, Lukas [1 ]
Osinenko, Pavel [1 ]
Streif, Stefan [1 ]
Affiliations
[1] Tech Univ Chemnitz, Automat Control & Syst Dynam Lab, D-09107 Chemnitz, Germany
Keywords
Predictive control; Q-Learning; Cost shaping; Nominal stability; RECEDING-HORIZON CONTROL; DISCRETE-TIME-SYSTEMS; NONLINEAR-SYSTEMS; FINITE; PERFORMANCE; MPC; STATE
DOI
10.1016/j.ejcon.2020.03.001
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Model-based predictive controllers are used to tackle control tasks in which constraints on the state, the input, or both must be satisfied. These controllers commonly optimize a fixed finite-horizon cost, which relates to an infinite-horizon (IH) cost profile, while the resulting closed loop under the predictive controller yields an in general suboptimal IH cost. To capture the optimal IH cost and the associated control policy, reinforcement learning methods such as Q-learning, which approximate said cost via a parametric architecture, can be employed. In contrast to predictive control, however, closed-loop stability under the controller associated with the approximation has rarely been investigated in explicit dependence on these parameters. The aim of this work is to incorporate model-based Q-learning into a predictive control setup so as to provide closed-loop stability during online learning, while eventually improving the performance of finite-horizon controllers. The proposed scheme provides nominal asymptotic stability, and it was observed that the suggested learning approach can in fact improve performance relative to a baseline predictive controller. (c) 2020 European Control Association. Published by Elsevier Ltd. All rights reserved.
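The scheme itself is developed in the paper; as a rough illustration of the general idea only, the sketch below combines a one-step predictive controller with a Q-learning estimate of the infinite-horizon tail cost on a toy linear-quadratic problem. Every concrete quantity here (the matrices A and B, the weights Qc and Rc, the update schedule, and all function names) is an assumption made for this example, and the stability constraints on the learned cost that are central to the paper are omitted.

```python
# Illustrative sketch only (not the authors' algorithm): a one-step
# predictive controller whose terminal cost is a Q-learning estimate of
# the infinite-horizon tail, on a toy linear-quadratic problem.
import numpy as np

n, m = 2, 1                                  # state / input dimensions
A = np.array([[1.0, 0.1], [0.0, 1.0]])       # assumed model x+ = A x + B u
B = np.array([[0.0], [0.1]])
Qc, Rc = np.eye(n), 0.1 * np.eye(m)          # assumed stage-cost weights

def stage_cost(x, u):
    return float(x @ Qc @ x + u @ Rc @ u)

# Parametric Q-function approximation: Q(x, u) = z^T P z with z = [x; u].
P = np.eye(n + m)

def value_matrix(P):
    # V(x) = min_u Q(x, u) = x^T M x; M is the Schur complement of the
    # uu-block of P (closed form for quadratic Q-functions).
    Pxx, Pxu, Puu = P[:n, :n], P[:n, n:], P[n:, n:]
    return Pxx - Pxu @ np.linalg.solve(Puu, Pxu.T)

def predictive_input(x, P):
    # One-step predictive control with the learned value as terminal cost:
    # u* = argmin_u  stage_cost(x, u) + V(A x + B u), in closed form.
    # The paper's stability-certifying constraints are omitted here.
    M = value_matrix(P)
    return -np.linalg.solve(Rc + B.T @ M @ B, B.T @ M @ A @ x)

def features(x, u):
    # Quadratic monomials of z = [x; u]; Q(x, u) = features(x, u) @ P.ravel().
    z = np.concatenate([x, u])
    return np.outer(z, z).ravel()

rng = np.random.default_rng(0)
x = np.array([1.0, 0.0])
Phi, y = [], []
for k in range(200):
    u = predictive_input(x, P) + 0.1 * rng.standard_normal(m)  # exploration
    xn = A @ x + B @ u
    # Bellman target: stage cost plus current estimate of the tail cost.
    y.append(stage_cost(x, u) + float(xn @ value_matrix(P) @ xn))
    Phi.append(features(x, u))
    if (k + 1) % 50 == 0:
        # Batch least-squares refit of P (fitted-Q-iteration flavor).
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        P = 0.5 * (theta.reshape(n + m, n + m)
                   + theta.reshape(n + m, n + m).T)  # symmetrize
        P += 1e-6 * np.eye(n + m)                    # keep Puu invertible
        Phi, y = [], []
    x = xn

print("learned Q-matrix:\n", np.round(P, 3))
```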
Pages: 167-178
Number of pages: 12