Explainable data-driven Q-learning control for a class of discrete-time linear autonomous systems

Cited by: 0

Authors:

Perrusquia, Adolfo [1]

Zou, Mengbang [1]

Guo, Weisi [1]

Affiliations:

[1] Cranfield Univ, Sch Aerosp Transport & Mfg, Bedford MK43 0AL, England

Source:

INFORMATION SCIENCES | 2024 / Vol. 682

Keywords:

Q-learning; State-transition function; Explainable Q-learning (XQL); Control policy; REINFORCEMENT; IDENTIFICATION

DOI:

10.1016/j.ins.2024.121283

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Explaining what a reinforcement learning (RL) control agent learns plays a crucial role in the safety-critical control domain. Most state-of-the-art approaches focus on imitation learning methods that uncover the hidden reward function of a given control policy. However, these approaches do not uncover what the RL agent effectively learns from the agent-environment interaction. The policy learned by the RL agent depends on how well the state-transition mapping is inferred from the data. A wrongly inferred state-transition mapping implies that the RL agent is not learning properly, which can compromise the safety of the surrounding environment and of the agent itself. In this paper, we aim to uncover the elements learned by data-driven RL control agents in a special class of discrete-time linear autonomous systems. The approach aims to add a new explainable dimension to data-driven control approaches to increase trust in them and support their safe deployment. We focus on the classical data-driven Q-learning algorithm and propose an explainable Q-learning (XQL) algorithm that can be further extended to other data-driven RL control agents. Simulation experiments are conducted to assess the effectiveness of the proposed approach under different scenarios using several discrete-time models of autonomous platforms.


Pages: 15

50 entries in total

[1] Improved Q-Learning Method for Linear Discrete-Time Systems
Chen, Jian
Wang, Jinhua
Huang, Jie
PROCESSES, 2020, 8 (03)
[2] Data-Driven $H_{∞}$ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning
Zhang, Li
Fan, Jialu
Xue, Wenqian
Lopez, Victor G.
Li, Jinna
Chai, Tianyou
Lewis, Frank L.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (07) : 3553 - 3567
[3] Q-Learning Methods for LQR Control of Completely Unknown Discrete-Time Linear Systems
Fan, Wenwu
Xiong, Junlin
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2025, 22 : 5933 - 5943
[4] Q-Learning for Continuous-Time Linear Systems: A Data-Driven Implementation of the Kleinman Algorithm
Possieri, Corrado
Sassano, Mario
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2022, 52 (10) : 6487 - 6497
[5] H∞ Tracking Control for Linear Discrete-Time Systems: Model-Free Q-Learning Designs
Yang, Yunjie
Wan, Yan
Zhu, Jihong
Lewis, Frank L.
IEEE CONTROL SYSTEMS LETTERS, 2021, 5 (01) : 175 - 180
[6] H∞ Tracking Control of Unknown Discrete-Time Linear Systems via Output-Data-Driven Off-policy Q-learning Algorithm
Zhang, Kun
Liu, Xuantong
Zhang, Lei
Chen, Qian
Peng, Yunjian
2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022 : 2350 - 2356
[7] Model-Free Q-Learning for the Tracking Problem of Linear Discrete-Time Systems
Li, Chun
Ding, Jinliang
Lewis, Frank L.
Chai, Tianyou
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (03) : 3191 - 3201
[8] Output Feedback Q-Learning Control for the Discrete-Time Linear Quadratic Regulator Problem
Rizvi, Syed Ali Asad
Lin, Zongli
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (05) : 1523 - 1536
[9] An Optimal Tracking Control Method with Q-learning for Discrete-time Linear Switched System
Zhao, Shangwei
Wang, Jingcheng
Wang, Hongyuan
Xu, Haotian
PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020 : 1414 - 1419
[10] The Adaptive Optimal Output Feedback Tracking Control of Unknown Discrete-Time Linear Systems Using a Multistep Q-Learning Approach
Dong, Xunde
Lin, Yuxin
Suo, Xudong
Wang, Xihao
Sun, Weijie
MATHEMATICS, 2024, 12 (04)
