Safety-Critical Optimal Control of Discrete-Time Non-Linear Systems via Policy Iteration-Based Q-Learning

Cited by: 0

Authors:

Long, Lijun [1,2]

Liu, Xiaomei [1,2]

Huang, Xiaomin [1,2]

Affiliations:

[1] Northeastern University, College of Information Science and Engineering, Shenyang, People's Republic of China

[2] Northeastern University, State Key Laboratory of Synthetical Automation for Process Industries, Shenyang, People's Republic of China

Source:

INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL | 2025

Keywords:

control barrier functions; discrete-time systems; neural networks; Q-learning; safety-critical control

DOI:

10.1002/rnc.7809

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This paper investigates safety-critical optimal control for discrete-time non-linear systems. A safety-critical control algorithm is developed based on Q-learning and an iterative adaptive dynamic programming method, namely policy iteration. Discrete-time control barrier functions (CBFs) are introduced into the utility function to guarantee safety, and a novel definition of the safe set and its boundary in terms of multiple discrete-time CBFs is given. Using multiple discrete-time CBFs, the safety-critical optimal control problem with multiple safety objectives is also addressed for discrete-time systems. The safety, convergence, and stability of the developed algorithm are rigorously demonstrated. An effective method for obtaining an initial safety-admissible control law is established, and the algorithm is implemented with an actor-critic structure built from neural networks. Finally, the effectiveness of the proposed algorithm is illustrated by three simulation examples.
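To make the idea in the abstract concrete, the following is a minimal tabular sketch of policy iteration-based Q-learning with a barrier term in the utility function. It is not the paper's formulation: the toy saturated dynamics `f`, the barrier `h`, the penalty form in `utility`, the constants `GAMMA` and `MU`, and every function name are assumptions chosen for illustration, and a lookup table stands in for the actor-critic neural networks mentioned above. The penalty is active whenever the commonly used discrete-time CBF condition h(x_{k+1}) >= (1 - GAMMA) h(x_k) is violated.

```python
import numpy as np

STATES = np.arange(-5, 6)              # toy integer state grid (assumption)
ACTIONS = np.array([-2, -1, 0, 1, 2])  # admissible controls (assumption)
GAMMA = 0.5                            # CBF decay rate in (0, 1] (assumption)
MU = 100.0                             # barrier penalty weight (assumption)


def f(x, u):
    """Toy saturated dynamics x_{k+1} = clip(x_k + u_k) on the grid."""
    return int(np.clip(x + u, STATES[0], STATES[-1]))


def h(x):
    """Barrier function; the safe set is {x : h(x) >= 0}, i.e. |x| <= 3."""
    return 9.0 - float(x) ** 2


def utility(x, u):
    """Quadratic cost plus a penalty when the discrete-time CBF condition
    h(x_{k+1}) >= (1 - GAMMA) * h(x_k) is violated (hypothetical form)."""
    violation = max(0.0, (1.0 - GAMMA) * h(x) - h(f(x, u)))
    return float(x) ** 2 + float(u) ** 2 + MU * violation


def evaluate(policy, sweeps=100):
    """Policy evaluation: iterate Q(x,u) = U(x,u) + Q(x', policy(x'))."""
    Q = np.zeros((len(STATES), len(ACTIONS)))
    for _ in range(sweeps):
        Q_new = np.empty_like(Q)
        for i, x in enumerate(STATES):
            for j, u in enumerate(ACTIONS):
                k = f(x, u) - STATES[0]          # index of the next state
                Q_new[i, j] = utility(x, u) + Q[k, policy[k]]
        converged = np.allclose(Q_new, Q)
        Q = Q_new
        if converged:
            break
    return Q


def policy_iteration(policy, iters=20):
    """Alternate policy evaluation and greedy policy improvement."""
    history = []
    for _ in range(iters):
        Q = evaluate(policy)
        # Cost of following the current policy from each state.
        history.append(Q[np.arange(len(STATES)), policy].copy())
        new_policy = Q.argmin(axis=1)            # policy improvement
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, Q, history


def rollout(policy, x0, steps=10):
    """Simulate the closed loop from x0 under a tabular policy."""
    traj = [x0]
    for _ in range(steps):
        x = traj[-1]
        traj.append(f(x, ACTIONS[policy[x - STATES[0]]]))
    return traj


# Initial admissible policy: step one unit toward the origin.
policy0 = (-np.sign(STATES) + 2).astype(int)     # value a has index a + 2
policy, Q, history = policy_iteration(policy0)
traj = rollout(policy, 3)
```

In this toy problem the cheapest path never approaches the boundary of the safe set, so the penalty mainly raises the Q-values of outward moves such as `utility(2, 1)`. The initial policy must already drive every state to the origin; otherwise the undiscounted evaluation step does not converge, which is why an initial admissible control law is needed.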


Pages: 19

