Explainable data-driven Q-learning control for a class of discrete-time linear autonomous systems

Cited by: 0
Authors
Perrusquia, Adolfo [1 ]
Zou, Mengbang [1 ]
Guo, Weisi [1 ]
Affiliation
[1] Cranfield Univ, Sch Aerosp Transport & Mfg, Bedford MK43 0AL, England
Keywords
Q-learning; State-transition function; Explainable Q-learning (XQL); Control policy; REINFORCEMENT; IDENTIFICATION;
DOI
10.1016/j.ins.2024.121283
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Explaining what a reinforcement learning (RL) control agent learns plays a crucial role in the safety-critical control domain. Most state-of-the-art approaches focus on imitation learning methods that uncover the hidden reward function of a given control policy. However, these approaches do not uncover what the RL agent actually learns from the agent-environment interaction. The policy learned by the RL agent depends on how well the state-transition mapping is inferred from the data; a wrongly inferred state-transition mapping implies that the RL agent is not learning properly, which can compromise the safety of the surrounding environment and of the agent itself. In this paper, we aim to uncover the elements learned by data-driven RL control agents in a special class of discrete-time linear autonomous systems. The approach adds a new explainable dimension to data-driven control methods to increase their trustworthiness and safe deployment. We focus on the classical data-driven Q-learning algorithm and propose an explainable Q-learning (XQL) algorithm that can be further extended to other data-driven RL control agents. Simulation experiments under different scenarios, using several discrete-time models of autonomous platforms, demonstrate the effectiveness of the proposed approach.
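The classical data-driven Q-learning scheme the abstract refers to can be sketched, for a generic discrete-time linear-quadratic regulation problem, as least-squares policy iteration on a quadratic Q-function learned purely from state-input data. The plant matrices, gains, and hyper-parameters below are illustrative assumptions for the sketch, not values taken from the paper:

```python
import numpy as np

# Illustrative discrete-time linear plant x_{k+1} = A x_k + B u_k
# (matrices are assumptions for this sketch, not from the paper).
A = np.array([[0.95, 0.10],
              [0.00, 0.90]])
B = np.array([[0.0],
              [0.1]])
Qc, Rc = np.eye(2), np.eye(1)        # quadratic stage-cost weights
n, m = 2, 1

def phi(x, u):
    """Quadratic basis: upper-triangular entries of z z^T with z = [x; u],
    off-diagonal terms weighted by 2 so that theta @ phi = z^T H z."""
    z = np.concatenate([x, u])
    i, j = np.triu_indices(n + m)
    return np.where(i == j, 1.0, 2.0) * np.outer(z, z)[i, j]

rng = np.random.default_rng(0)
K = np.zeros((m, n))                 # initial admissible gain (A is stable)
for _ in range(20):                  # Q-learning policy iteration
    Phi, y = [], []
    for _ in range(10):              # short exploratory rollouts
        x = rng.standard_normal(n)
        for _ in range(20):
            u = -K @ x + 0.3 * rng.standard_normal(m)  # probing noise
            xn = A @ x + B @ u
            r = x @ Qc @ x + u @ Rc @ u                # stage cost
            # Bellman identity for the current policy: Q(x,u) - Q(xn, -K xn) = r
            Phi.append(phi(x, u) - phi(xn, -K @ xn))
            y.append(r)
            x = xn
    # Policy evaluation: fit the Q-function kernel H from data only
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    H = np.zeros((n + m, n + m))
    H[np.triu_indices(n + m)] = theta
    H = H + np.triu(H, 1).T          # rebuild symmetric kernel
    # Policy improvement: u = -K x minimises z^T H z over u
    K = np.linalg.solve(H[n:, n:], H[n:, :n])

# Sanity check against the LQR gain from the discrete Riccati equation
P = np.eye(n)
for _ in range(500):
    S = Rc + B.T @ P @ B
    P = Qc + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(S, B.T @ P @ A)
K_star = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
print(np.max(np.abs(K - K_star)))   # learned gain should closely match LQR
```

Because the system and cost are deterministic, the temporal-difference regression is exact for the current policy's kernel, so the loop reduces to model-free policy iteration; no model of (A, B) is ever used inside the learning loop, which is exactly the setting where the paper asks what the agent has implicitly learned about the state-transition mapping.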
Pages: 15