Optimal control for unknown mean-field discrete-time system based on Q-Learning

Cited by: 6
Authors
Ge, Yingying [1]
Liu, Xikui [2]
Li, Yan [2]
Affiliations
[1] Shandong Univ Sci & Technol, Coll Comp Sci & Engn, Qingdao, Peoples R China
[2] Qufu Normal Univ, Coll Comp Sci, Rizhao, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Stochastic mean-field system; Q-learning; optimal control; discrete-time system; QUADRATIC OPTIMAL-CONTROL; MARKOV JUMP; STOCHASTIC-SYSTEMS; LINEAR-SYSTEMS; GAME; STABILIZATION; REGULATOR; EQUATION; STATE;
DOI
10.1080/00207721.2021.1929554
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Solving the optimal mean-field control problem usually requires complete system information. This paper presents a Q-learning algorithm for the optimal control problem of an unknown mean-field discrete-time stochastic system. First, a suitable transformation converts the stochastic mean-field control problem into a deterministic one. Second, the H matrix is obtained from the Q-function, and the control strategy depends only on the H matrix, so solving for the H matrix is equivalent to solving the mean-field optimal control problem. The proposed Q-learning method iteratively solves for the H matrix and the gain matrix using system state information as input, without requiring knowledge of the system parameters. Next, the sequence of control matrices produced by Q-learning is proved to converge to the optimal control, which establishes the theoretical feasibility of the method. Finally, two simulation cases verify the effectiveness of the Q-learning algorithm.
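The role of the H matrix can be illustrated with a much simpler setting than the one in the paper. The sketch below is not the authors' algorithm: it assumes a deterministic scalar linear system with quadratic cost (no mean-field or noise terms) and uses a policy-iteration form of Q-learning, since the abstract does not state the exact update rule. All names (`solve3`, `features`, `q_learning_lqr`, `step`) and the numeric values are hypothetical. The learner only calls `step` to observe the next state; it never reads the system parameters.

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def features(x, u):
    # Q(x, u) = hxx*x^2 + 2*hxu*x*u + huu*u^2, i.e. [x u] H [x u]^T
    return [x * x, 2.0 * x * u, u * u]


def q_learning_lqr(step, q, r, k0, samples, iterations=20):
    """Estimate H = [[hxx, hxu], [hxu, huu]] and the gain k (u = -k*x).

    Assumes k0 is a stabilizing gain and that `samples` contains three
    (x, u) pairs whose feature vectors are linearly independent.
    """
    k = k0
    for _ in range(iterations):
        A, b = [], []
        for x, u in samples:
            x_next = step(x, u)        # observed data, model not used
            u_next = -k * x_next       # action of the current policy
            f0, f1 = features(x, u), features(x_next, u_next)
            # Bellman equation: Q(x, u) - Q(x_next, u_next) = stage cost
            A.append([a0 - a1 for a0, a1 in zip(f0, f1)])
            b.append(q * x * x + r * u * u)
        hxx, hxu, huu = solve3(A, b)   # policy evaluation
        k = hxu / huu                  # policy improvement from H alone
    return (hxx, hxu, huu), k


# Hypothetical plant x_next = 1.2*x + u, hidden from the learner behind step().
step = lambda x, u: 1.2 * x + 1.0 * u
H, K = q_learning_lqr(step, q=1.0, r=1.0, k0=1.0,
                      samples=[(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)])
```

In this toy case the gain is read directly off the learned H entries (`k = hxu / huu`), which mirrors the abstract's statement that the control strategy depends only on the H matrix.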
Pages: 3335-3349
Number of pages: 15
Related papers
50 in total
  • [1] Optimal Control for Mean-field System: Discrete-time Case
    Zhang, Huanshui
    Qi, Qingyuan
    2016 IEEE 55TH CONFERENCE ON DECISION AND CONTROL (CDC), 2016, : 4474 - 4480
  • [2] Discrete-Time Optimal Control Scheme Based on Q-Learning Algorithm
    Wei, Qinglai
    Liu, Derong
    Song, Ruizhuo
    2016 SEVENTH INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL AND INFORMATION PROCESSING (ICICIP), 2016, : 125 - 130
  • [3] Continuous Time q-Learning for Mean-Field Control Problems
    Wei, Xiaoli
    Yu, Xiang
    APPLIED MATHEMATICS AND OPTIMIZATION, 2025, 91 (01):
  • [4] Reinforcement Q-learning for optimal tracking control of linear discrete-time systems with unknown dynamics
    Kiumarsi, Bahare
    Lewis, Frank L.
    Modares, Hamidreza
    Karimpour, Ali
    Naghibi-Sistani, Mohammad-Bagher
    AUTOMATICA, 2014, 50 (04) : 1167 - 1175
  • [5] An Optimal Tracking Control Method with Q-learning for Discrete-time Linear Switched System
    Zhao, Shangwei
    Wang, Jingcheng
    Wang, Hongyuan
    Xu, Haotian
    PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020, : 1414 - 1419
  • [6] Optimal Stabilization Control for Discrete-Time Mean-Field Stochastic Systems
    Zhang, Huanshui
    Qi, Qingyuan
    Fu, Minyue
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2019, 64 (03) : 1125 - 1136
  • [7] An ADDHP-based Q-learning algorithm for optimal tracking control of linear discrete-time systems with unknown dynamics
    Mu, Chaoxu
    Zhao, Qian
    Sun, Changyin
    Gao, Zhongke
    APPLIED SOFT COMPUTING, 2019, 82
  • [8] Reinforcement Q-learning and Optimal Tracking Control of Unknown Discrete-time Multi-player Systems Based on Game Theory
    Zhao, Jin-Gang
    INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2024, 22 (05) : 1751 - 1759
  • [9] Q-Learning Methods for LQR Control of Completely Unknown Discrete-Time Linear Systems
    Fan, Wenwu
    Xiong, Junlin
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2025, 22 : 5933 - 5943
  • [10] A DISCRETE-TIME SWITCHING SYSTEM ANALYSIS OF Q-LEARNING
    Lee, Donghwan
    Hu, Jianghai
    He, Niao
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2023, 61 (03) : 1861 - 1880