Safety-Critical Optimal Control of Discrete-Time Non-Linear Systems via Policy Iteration-Based Q-Learning

Cited by: 0
Authors
Long, Lijun [1 ,2 ]
Liu, Xiaomei [1 ,2 ]
Huang, Xiaomin [1 ,2 ]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang, Peoples R China
[2] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang, Peoples R China
Keywords
control barrier functions; discrete-time systems; neural networks; Q-learning; safety-critical control;
DOI
10.1002/rnc.7809
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
This paper investigates the safety-critical optimal control problem for discrete-time non-linear systems. A safety-critical control algorithm is developed based on Q-learning and an iterative adaptive dynamic programming scheme, namely policy iteration. Discrete-time control barrier functions (CBFs) are introduced into the utility function to guarantee safety, and a novel definition of the safe set and its boundary in terms of multiple discrete-time CBFs is given. By using multiple discrete-time CBFs, the safety-critical optimal control problem with multiple safety objectives is addressed. The safety, convergence, and stability of the developed algorithm are rigorously established, and an effective method for obtaining an initial safety-admissible control law is provided. The algorithm is implemented with an actor-critic structure built from neural networks, and its effectiveness is illustrated by three simulation examples.
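As a rough illustration of the mechanism the abstract describes, the sketch below runs tabular policy-iteration Q-learning on a toy 1-D system with a barrier term, built from a discrete-time CBF, folded into the utility function. Everything here is an assumption made for illustration: the dynamics f, the CBF candidate h, the reciprocal-barrier penalty, the discount factor, and all weights are hypothetical, and the paper's actual formulation (its multiple-CBF safe-set definition, its initial safety-admissible law construction, and its actor-critic neural-network implementation) is not reproduced.

```python
import numpy as np

def f(x, u):
    """Hypothetical discrete-time nonlinear dynamics x_{k+1} = f(x_k, u_k)."""
    return 0.8 * np.sin(x) + u

def h(x):
    """Assumed discrete-time CBF candidate: safe set {x : h(x) >= 0}, i.e. |x| <= 1."""
    return 1.0 - x ** 2

def utility(x, u, x_next, lam=0.1):
    """Quadratic stage cost plus a barrier that grows as x_{k+1} nears the boundary."""
    barrier = lam / h(x_next) if h(x_next) > 1e-3 else 1e3  # crude reciprocal barrier
    return x ** 2 + 0.1 * u ** 2 + barrier

X = np.linspace(-0.99, 0.99, 41)          # discretized states inside the safe set
U = np.linspace(-1.0, 1.0, 21)            # discretized inputs
nearest = lambda grid, v: int(np.argmin(np.abs(grid - v)))

gamma = 0.95                              # discount (the paper may use an undiscounted cost)
Q = np.zeros((len(X), len(U)))
pi = np.full(len(X), nearest(U, 0.0))     # crude stand-in for an initial safe-admissible law

for _ in range(50):                       # outer policy-iteration loop
    # Policy evaluation: iterate Q(x,u) = U(x,u,x') + gamma * Q(x', pi(x')) to a fixed point.
    for _ in range(200):
        Q_new = np.empty_like(Q)
        for i, x in enumerate(X):
            for j, u in enumerate(U):
                xn = f(x, u)
                k = nearest(X, np.clip(xn, X[0], X[-1]))
                Q_new[i, j] = utility(x, u, xn) + gamma * Q[k, pi[k]]
        done = np.max(np.abs(Q_new - Q)) < 1e-6
        Q = Q_new
        if done:
            break
    # Policy improvement: greedy (cost-minimizing) update of the control law.
    pi_new = Q.argmin(axis=1)
    if np.array_equal(pi_new, pi):
        break
    pi = pi_new

print("learned control at x = 0.5:", U[pi[nearest(X, 0.5)]])
```

Because the barrier term makes boundary-approaching successor states expensive, the greedy improvement step tends to keep the iterated policies inside the safe set; the paper's contribution is to prove safety, convergence, and stability of this kind of iteration rigorously, which the toy sketch does not attempt.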
Pages: 19
Related Papers
50 records in total
  • [21] Neural-network-based accelerated safe Q-learning for optimal control of discrete-time nonlinear systems with state constraints
    Zhao, Mingming
    Wang, Ding
    Qiao, Junfei
    NEURAL NETWORKS, 2025, 186
  • [22] Model-free H∞ control design for unknown linear discrete-time systems via Q-learning with LMI
    Kim, J. -H.
    Lewis, F. L.
    AUTOMATICA, 2010, 46 (08) : 1320 - 1326
  • [23] Reinforcement Q-Learning and Non-Zero-Sum Games Optimal Tracking Control for Discrete-Time Linear Multi-Input Systems
    Zhao, Jin-Gang
    2023 IEEE 12TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE, DDCLS, 2023, : 277 - 282
  • [24] Optimal control for unknown mean-field discrete-time system based on Q-Learning
    Ge, Yingying
    Liu, Xikui
    Li, Yan
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2021, 52 (15) : 3335 - 3349
  • [25] Optimal Learning Control for Discrete-Time Nonlinear Systems Using Generalized Policy Iteration Based Adaptive Dynamic Programming
    Wei, Qinglai
    Liu, Derong
    2014 11TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2014, : 1781 - 1786
  • [26] H∞ Tracking Control of Unknown Discrete-Time Linear Systems via Output-Data-Driven Off-policy Q-learning Algorithm
    Zhang, Kun
    Liu, Xuantong
    Zhang, Lei
    Chen, Qian
    Peng, Yunjian
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 2350 - 2356
  • [27] Explainable data-driven Q-learning control for a class of discrete-time linear autonomous systems
    Perrusquia, Adolfo
    Zou, Mengbang
    Guo, Weisi
    INFORMATION SCIENCES, 2024, 682
  • [28] H∞ Tracking Control for Linear Discrete-Time Systems: Model-Free Q-Learning Designs
    Yang, Yunjie
    Wan, Yan
    Zhu, Jihong
    Lewis, Frank L.
    IEEE CONTROL SYSTEMS LETTERS, 2021, 5 (01) : 175 - 180
  • [29] The Adaptive Optimal Output Feedback Tracking Control of Unknown Discrete-Time Linear Systems Using a Multistep Q-Learning Approach
    Dong, Xunde
    Lin, Yuxin
    Suo, Xudong
    Wang, Xihao
    Sun, Weijie
    MATHEMATICS, 2024, 12 (04)
  • [30] Off-policy inverse Q-learning for discrete-time antagonistic unknown systems
    Lian, Bosen
    Xue, Wenqian
    Xie, Yijing
    Lewis, Frank L.
    Davoudi, Ali
    AUTOMATICA, 2023, 155