Explainable data-driven Q-learning control for a class of discrete-time linear autonomous systems

Cited by: 0
Authors
Perrusquia, Adolfo [1 ]
Zou, Mengbang [1 ]
Guo, Weisi [1 ]
Affiliations
[1] Cranfield Univ, Sch Aerosp Transport & Mfg, Bedford MK43 0AL, England
Keywords
Q-learning; State-transition function; Explainable Q-learning (XQL); Control policy; REINFORCEMENT; IDENTIFICATION;
DOI
10.1016/j.ins.2024.121283
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Explaining what a reinforcement learning (RL) control agent learns plays a crucial role in safety-critical control. Most state-of-the-art approaches focus on imitation learning methods that uncover the hidden reward function of a given control policy. However, these approaches do not reveal what the RL agent actually learns from the agent-environment interaction. The policy learned by the RL agent depends on how well the state-transition mapping is inferred from data; a wrongly inferred state-transition mapping means the RL agent is not learning properly, which can compromise the safety of the surrounding environment and of the agent itself. In this paper, we aim to uncover the elements learned by data-driven RL control agents in a special class of discrete-time linear autonomous systems. The approach adds a new explainability dimension to data-driven control methods to increase their trustworthiness and enable safe deployment. We focus on the classical data-driven Q-learning algorithm and propose an explainable Q-learning (XQL) algorithm that can be extended to other data-driven RL control agents. Simulation experiments under different scenarios, using several discrete-time models of autonomous platforms, demonstrate the effectiveness of the proposed approach.
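For orientation, below is a minimal sketch of the classical data-driven Q-learning baseline the abstract refers to, assuming the standard linear-quadratic setting x_{k+1} = A x_k + B u_k with quadratic stage cost: the Q-function of the current policy is estimated from data by least squares on the Bellman equation, and the policy is improved greedily from the learned Q-matrix. The system matrices, the random state-action sampling, and the function names (q_learning_lqr, quad_basis, vech_to_sym) are illustrative assumptions, not the paper's method; the XQL algorithm adds an explainability layer on top of this kind of baseline, which this sketch does not implement.

```python
import numpy as np

def quad_basis(z):
    """Quadratic features: upper-triangular entries of z z^T, with
    off-diagonal terms doubled so that quad_basis(z) @ vech_H == z @ H @ z."""
    outer = np.outer(z, z)
    iu = np.triu_indices(len(z))
    scale = np.where(iu[0] == iu[1], 1.0, 2.0)
    return scale * outer[iu]

def vech_to_sym(v, n):
    """Rebuild the symmetric Q-matrix H from its upper-triangular parameters."""
    H = np.zeros((n, n))
    H[np.triu_indices(n)] = v
    return H + np.triu(H, 1).T

def q_learning_lqr(A, B, Qc, R, K0, n_iter=8, n_samples=300, seed=0):
    """Least-squares Q-learning (policy iteration) for x_{k+1} = A x_k + B u_k
    with stage cost x'Qc x + u'R u. K0 must stabilise the closed loop."""
    rng = np.random.default_rng(seed)
    n, m = B.shape
    K = K0.copy()
    for _ in range(n_iter):
        Phi, y = [], []
        for _ in range(n_samples):
            # Off-policy samples: random state/action pairs excite all features.
            x = rng.standard_normal(n)
            u = rng.standard_normal(m)
            x_next = A @ x + B @ u
            u_next = K @ x_next                     # current policy's next action
            z, z_next = np.concatenate([x, u]), np.concatenate([x_next, u_next])
            # Bellman identity: z'H z - z_next'H z_next = stage cost.
            Phi.append(quad_basis(z) - quad_basis(z_next))
            y.append(x @ Qc @ x + u @ R @ u)
        vech_H, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        H = vech_to_sym(vech_H, n + m)
        Huu, Hux = H[n:, n:], H[n:, :n]
        K = -np.linalg.solve(Huu, Hux)              # greedy policy improvement
    return K, H

# Hypothetical two-state platform model; K0 = 0 is stabilising since A is stable.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.1]])
K, H = q_learning_lqr(A, B, Qc=np.eye(2), R=np.eye(1), K0=np.zeros((1, 2)))
print("learned feedback gain K =", K)
```

Note that the learned matrix H implicitly encodes model information (for a stabilising policy in this setting, H has the block structure [[Qc + A'PA, A'PB], [B'PA, R + B'PB]]), which is the kind of hidden structure an explainability layer such as XQL can aim to surface; how the paper extracts it is not reproduced here.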
Pages: 15
Related Papers
50 records in total
  • [41] Data-Driven Optimal Controller Design for Maglev Train: Q-Learning Method
    Xin, Liang
    Jiang, Hongwei
    Wen, Tao
    Long, Zhiqiang
    2022 34TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2022, : 1289 - 1294
  • [42] Data Driven Q-Learning for Commercial HVAC Control
    Faddel, Samy
    Tian, Guanyu
    Zhou, Qun
    Aburub, Haneen
    IEEE SOUTHEASTCON 2020, 2020,
  • [43] Safe Q-Learning for Data-Driven Nonlinear Optimal Control with Asymmetric State Constraints
    Zhao, Mingming
    Wang, Ding
    Song, Shijie
    Qiao, Junfei
    IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2024, 11 (12) : 2408 - 2422
  • [44] Learning from Neural Control for a Class of Discrete-Time Nonlinear Systems
    Chen, Tianrui
    Wang, Cong
    PROCEEDINGS OF THE 48TH IEEE CONFERENCE ON DECISION AND CONTROL, 2009 HELD JOINTLY WITH THE 2009 28TH CHINESE CONTROL CONFERENCE (CDC/CCC 2009), 2009, : 6732 - 6737
  • [45] Data-Driven Control of Linear Time-Varying Systems
    Nortmann, Benita
    Mylvaganam, Thulasi
    2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 3939 - 3944
  • [46] Q-learning algorithm in solving consensusability problem of discrete-time multi-agent systems
    Feng, Tao
    Zhang, Jilie
    Tong, Yin
    Zhang, Huaguang
    AUTOMATICA, 2021, 128
  • [47] Optimal control for unknown mean-field discrete-time system based on Q-Learning
    Ge, Yingying
    Liu, Xikui
    Li, Yan
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2021, 52 (15) : 3335 - 3349
  • [48] A New Discrete-Time Iterative Adaptive Dynamic Programming Algorithm Based on Q-Learning
    Wei, Qinglai
    Liu, Derong
    ADVANCES IN NEURAL NETWORKS - ISNN 2015, 2015, 9377 : 43 - 52
  • [49] Output feedback Q-learning for discrete-time linear zero-sum games with application to the H-infinity control
    Rizvi, Syed Ali Asad
    Lin, Zongli
    AUTOMATICA, 2018, 95 : 213 - 221
  • [50] Efficient Off-Policy Q-Learning for Data-Based Discrete-Time LQR Problems
    Lopez, Victor G.
    Alsalti, Mohammad
    Mueller, Matthias A.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (05) : 2922 - 2933