Explainable data-driven Q-learning control for a class of discrete-time linear autonomous systems

Cited by: 0
Authors
Perrusquia, Adolfo [1 ]
Zou, Mengbang [1 ]
Guo, Weisi [1 ]
Affiliations
[1] Cranfield Univ, Sch Aerosp Transport & Mfg, Bedford MK43 0AL, England
Keywords
Q-learning; State-transition function; Explainable Q-learning (XQL); Control policy; REINFORCEMENT; IDENTIFICATION;
DOI
10.1016/j.ins.2024.121283
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Explaining what a reinforcement learning (RL) control agent learns plays a crucial role in safety-critical control. Most state-of-the-art approaches focus on imitation learning methods that uncover the hidden reward function of a given control policy. However, these approaches do not reveal what the RL agent actually learns from the agent-environment interaction. The policy learned by the RL agent depends on how well the state-transition mapping is inferred from data; a wrongly inferred state-transition mapping means the RL agent is not learning properly, which can compromise the safety of the surrounding environment and of the agent itself. In this paper, we aim to uncover the elements learned by data-driven RL control agents in a special class of discrete-time linear autonomous systems. The approach adds a new explainability dimension to data-driven control methods to increase their trustworthiness and enable safe deployment. We focus on the classical data-driven Q-learning algorithm and propose an explainable Q-learning (XQL) algorithm that can be extended to other data-driven RL control agents. Simulation experiments under different scenarios, using several discrete-time models of autonomous platforms, demonstrate the effectiveness of the proposed approach.
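For orientation, below is a minimal sketch of the classical data-driven Q-learning baseline the abstract refers to, assuming the standard linear-quadratic setting x_{k+1} = A x_k + B u_k with quadratic stage cost: the Q-function of the current policy is estimated from data by least squares on the Bellman equation, and the policy is improved greedily from the learned Q-matrix. The system matrices, the random state-action sampling, and the function names (q_learning_lqr, quad_basis, vech_to_sym) are illustrative assumptions, not the paper's method; the XQL algorithm adds an explainability layer on top of this kind of baseline, which this sketch does not implement.

```python
import numpy as np

def quad_basis(z):
    """Quadratic features: upper-triangular entries of z z^T, with
    off-diagonal terms doubled so that quad_basis(z) @ vech_H == z @ H @ z."""
    outer = np.outer(z, z)
    iu = np.triu_indices(len(z))
    scale = np.where(iu[0] == iu[1], 1.0, 2.0)
    return scale * outer[iu]

def vech_to_sym(v, n):
    """Rebuild the symmetric Q-matrix H from its upper-triangular parameters."""
    H = np.zeros((n, n))
    H[np.triu_indices(n)] = v
    return H + np.triu(H, 1).T

def q_learning_lqr(A, B, Qc, R, K0, n_iter=8, n_samples=300, seed=0):
    """Least-squares Q-learning (policy iteration) for x_{k+1} = A x_k + B u_k
    with stage cost x'Qc x + u'R u. K0 must stabilise the closed loop."""
    rng = np.random.default_rng(seed)
    n, m = B.shape
    K = K0.copy()
    for _ in range(n_iter):
        Phi, y = [], []
        for _ in range(n_samples):
            # Off-policy samples: random state/action pairs excite all features.
            x = rng.standard_normal(n)
            u = rng.standard_normal(m)
            x_next = A @ x + B @ u
            u_next = K @ x_next                     # current policy's next action
            z, z_next = np.concatenate([x, u]), np.concatenate([x_next, u_next])
            # Bellman identity: z'H z - z_next'H z_next = stage cost.
            Phi.append(quad_basis(z) - quad_basis(z_next))
            y.append(x @ Qc @ x + u @ R @ u)
        vech_H, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        H = vech_to_sym(vech_H, n + m)
        Huu, Hux = H[n:, n:], H[n:, :n]
        K = -np.linalg.solve(Huu, Hux)              # greedy policy improvement
    return K, H

# Hypothetical two-state platform model; K0 = 0 is stabilising since A is stable.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.1]])
K, H = q_learning_lqr(A, B, Qc=np.eye(2), R=np.eye(1), K0=np.zeros((1, 2)))
print("learned feedback gain K =", K)
```

Note that the learned matrix H implicitly encodes model information (for a stabilising policy in this setting, H has the block structure [[Qc + A'PA, A'PB], [B'PA, R + B'PB]]), which is the kind of hidden structure an explainability layer such as XQL can aim to surface; how the paper extracts it is not reproduced here.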
Pages: 15
Related Papers
50 records in total
  • [41] Data-Driven Optimal Controller Design for Maglev Train: Q-Learning Method
    Xin, Liang
    Jiang, Hongwei
    Wen, Tao
    Long, Zhiqiang
    2022 34TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2022, : 1289 - 1294
  • [42] Data Driven Q-Learning for Commercial HVAC Control
    Faddel, Samy
    Tian, Guanyu
    Zhou, Qun
    Aburub, Haneen
    IEEE SOUTHEASTCON 2020, 2020,
  • [43] Safe Q-Learning for Data-Driven Nonlinear Optimal Control with Asymmetric State Constraints
    Zhao, Mingming
    Wang, Ding
    Song, Shijie
    Qiao, Junfei
    IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2024, 11 (12) : 2408 - 2422
  • [44] Learning from Neural Control for a Class of Discrete-Time Nonlinear Systems
    Chen, Tianrui
    Wang, Cong
    PROCEEDINGS OF THE 48TH IEEE CONFERENCE ON DECISION AND CONTROL, 2009 HELD JOINTLY WITH THE 2009 28TH CHINESE CONTROL CONFERENCE (CDC/CCC 2009), 2009, : 6732 - 6737
  • [45] Data-Driven Control of Linear Time-Varying Systems
    Nortmann, Benita
    Mylvaganam, Thulasi
    2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 3939 - 3944
  • [46] Q-learning algorithm in solving consensusability problem of discrete-time multi-agent systems
    Feng, Tao
    Zhang, Jilie
    Tong, Yin
    Zhang, Huaguang
    AUTOMATICA, 2021, 128
  • [47] Optimal control for unknown mean-field discrete-time system based on Q-Learning
    Ge, Yingying
    Liu, Xikui
    Li, Yan
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2021, 52 (15) : 3335 - 3349
  • [48] A New Discrete-Time Iterative Adaptive Dynamic Programming Algorithm Based on Q-Learning
    Wei, Qinglai
    Liu, Derong
    ADVANCES IN NEURAL NETWORKS - ISNN 2015, 2015, 9377 : 43 - 52
  • [49] Output feedback Q-learning for discrete-time linear zero-sum games with application to the H-infinity control
    Rizvi, Syed Ali Asad
    Lin, Zongli
    AUTOMATICA, 2018, 95 : 213 - 221
  • [50] Efficient Off-Policy Q-Learning for Data-Based Discrete-Time LQR Problems
    Lopez, Victor G.
    Alsalti, Mohammad
    Mueller, Matthias A.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (05) : 2922 - 2933