Optimal control for unknown mean-field discrete-time system based on Q-Learning

Cited by: 6

Authors:

Ge, Yingying [1]

Liu, Xikui [2]

Li, Yan [2]

Affiliations:

[1] Shandong Univ Sci & Technol, Coll Comp Sci & Engn, Qingdao, Peoples R China

[2] Qufu Normal Univ, Coll Comp Sci, Rizhao, Peoples R China

Source:

INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE | 2021 / Vol. 52 / No. 15

Funding:

National Natural Science Foundation of China

Keywords:

Stochastic mean-field system; Q-learning; optimal control; discrete-time system; QUADRATIC OPTIMAL-CONTROL; MARKOV JUMP; STOCHASTIC-SYSTEMS; LINEAR-SYSTEMS; GAME; STABILIZATION; REGULATOR; EQUATION; STATE;

DOI:

10.1080/00207721.2021.1929554

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Solving the optimal mean-field control problem usually requires complete system information. In this paper, a Q-learning algorithm is discussed for solving the optimal control problem of an unknown mean-field discrete-time stochastic system. First, through a corresponding transformation, the stochastic mean-field control problem is turned into a deterministic problem. Second, the H matrix is obtained from the Q-function, and the control strategy relies only on the H matrix. Therefore, solving for the H matrix is equivalent to solving the mean-field optimal control problem. The proposed Q-learning method iteratively solves for the H matrix and the gain matrix from input and system state information, without requiring knowledge of the system parameters. Next, it is proved that the control matrix sequence obtained by Q-learning converges to the optimal control, which shows the theoretical feasibility of the Q-learning method. Finally, two simulation cases verify the effectiveness of the Q-learning algorithm.
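The claim that the control strategy relies only on the H matrix can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a scalar, deterministic linear-quadratic problem (dynamics x_next = a*x + b*u, stage cost q*x**2 + r*u**2) rather than the mean-field stochastic system, and it uses the known values of a and b to update H, whereas the paper's method works from input and state data without system parameters. All names and numeric values below are hypothetical.

```python
# Illustrative only: scalar deterministic LQR, not the paper's mean-field system.
# Dynamics x_next = a*x + b*u, stage cost q*x**2 + r*u**2 (all values hypothetical).
# The Q-function is quadratic: Q(x, u) = h_xx*x**2 + 2*h_xu*x*u + h_uu*u**2.


def gain_from_h(h_xu, h_uu):
    """Feedback gain k in u = k*x, found by minimizing Q(x, u) over u."""
    return -h_xu / h_uu


def value_from_h(h_xx, h_xu, h_uu):
    """Value weight p with V(x) = p*x**2, i.e. Q evaluated at u = k*x."""
    return h_xx - h_xu ** 2 / h_uu


def iterate_h(a, b, q, r, iterations=200):
    """Value iteration on the entries of the symmetric 2x2 H matrix.

    This model-based update is an assumption made to keep the sketch short.
    """
    h_xx, h_xu, h_uu = q, 0.0, r  # equivalent to starting from p = 0
    for _ in range(iterations):
        p = value_from_h(h_xx, h_xu, h_uu)
        h_xx = q + a * a * p
        h_xu = a * b * p
        h_uu = r + b * b * p
    return h_xx, h_xu, h_uu


h_xx, h_xu, h_uu = iterate_h(a=1.1, b=1.0, q=1.0, r=1.0)
k = gain_from_h(h_xu, h_uu)           # gain computed from H alone
p = value_from_h(h_xx, h_xu, h_uu)    # value weight computed from H alone
```

Once the entries of H are available, both the gain and the value weight follow from H without further reference to a or b, which is the property that makes a data-driven estimate of H sufficient in the scalar case sketched here.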


Pages: 3335-3349

Page count: 15

