Modified λ-Policy Iteration Based Adaptive Dynamic Programming for Unknown Discrete-Time Linear Systems

Citations: 6
Authors
Jiang, Huaiyuan [1]
Zhou, Bin [1]
Duan, Guang-Ren [1]
Affiliations
[1] Harbin Inst Technol, Ctr Control Theory & Guidance Technol, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Adaptive dynamic programming (ADP); data-driven control; discrete-time systems; modified λ-policy iteration (λ-PI); policy iteration; unknown systems; STABILIZATION
DOI
10.1109/TNNLS.2023.3244934
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, the λ-policy iteration (λ-PI) method for the optimal control problem of discrete-time linear systems is reconsidered and restated from a novel aspect. First, the traditional λ-PI method is recalled, and some new properties of it are established. Based on these properties, a modified λ-PI algorithm is introduced and its convergence is proven. Compared with existing results, the initial condition is further relaxed. A data-driven implementation is then constructed, together with a new matrix rank condition for verifying its feasibility. A simulation example verifies the effectiveness of the proposed method.
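For orientation only, the following is a minimal sketch of the textbook (model-based) λ-PI recursion for a scalar discrete-time LQ problem. It is not the modified or data-driven algorithm of the article. It assumes the fixed-point form of the λ-PI evaluation step attributed to Bertsekas and Ioffe, in which the next value is the fixed point of the map J -> (1 - λ) T_μ J_k + λ T_μ J. The function name `lambda_pi_scalar`, the system x+ = a x + b u, the stage cost q x^2 + r u^2, and all numerical values are illustrative assumptions.

```python
def lambda_pi_scalar(a, b, q, r, lam, p0=0.0, iters=200):
    """Textbook lambda-PI sketch for x+ = a*x + b*u with cost q*x^2 + r*u^2.

    Hypothetical helper for illustration; assumes the model (a, b) is known.
    lam = 0 reduces to value iteration, lam = 1 to policy iteration.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    p = p0
    for _ in range(iters):
        # Policy improvement: greedy feedback gain u = -k*x for the current p.
        k = a * b * p / (r + b * b * p)
        ac2 = (a - b * k) ** 2  # squared closed-loop pole
        if lam * ac2 >= 1.0:
            raise ValueError("evaluation step is not well defined: lam*ac^2 >= 1")
        stage = q + r * k * k
        # Evaluation: solve p_new = stage + ac2*((1 - lam)*p + lam*p_new) for p_new.
        p = (stage + (1.0 - lam) * ac2 * p) / (1.0 - lam * ac2)
    k = a * b * p / (r + b * b * p)
    return p, k


if __name__ == "__main__":
    # Open-loop unstable example (a = 1.2) started from p0 = 0, i.e. zero initial gain.
    p, k = lambda_pi_scalar(a=1.2, b=1.0, q=1.0, r=1.0, lam=0.5)
    print(p, k)
```

In this scalar setting the evaluation step only requires λ(a - bk)^2 < 1, so with λ < 1 the sketch can start from a gain that is not stabilizing, which is one way to read the role of the initial condition in λ-PI.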
Pages: 3291-3301
Number of pages: 11
Related papers
52 in total
  • [1] Adda J, 2003, DYNAMIC ECONOMICS: QUANTITATIVE METHODS AND APPLICATIONS, P1
  • [2] The Boundedness Conditions for Model-Free HDP(lambda)
    Al-Dabooni, Seaar
    Wunsch, Donald
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (07) : 1928 - 1942
  • [3] DYNAMIC PROGRAMMING
    BELLMAN, R
    [J]. SCIENCE, 1966, 153 (3731) : 34 - &
  • [4] Bertsekas D.P., 2005, Dynamic Programming and Optimal Control, VI
  • [5] Bertsekas Dimitri P, 1996, Report LIDSP-2349, P14
  • [6] Adaptive Dynamic Programming for Stochastic Systems With State and Control Dependent Noise
    Bian, Tao
    Jiang, Yu
    Jiang, Zhong-Ping
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2016, 61 (12) : 4170 - 4175
  • [7] Chakrabarty A, 2019, 2019 18TH EUROPEAN CONTROL CONFERENCE (ECC), P524, DOI 10.23919/ECC.2019.8795815
  • [8] Homotopic policy iteration-based learning design for unknown linear continuous-time systems
    Chen, Ci
    Lewis, Frank L.
    Li, Bo
    [J]. AUTOMATICA, 2022, 138
  • [9] Off-policy learning for adaptive optimal output synchronization of heterogeneous multi-agent systems
    Chen, Ci
    Lewis, Frank L.
    Xie, Kan
    Xie, Shengli
    Liu, Yilu
    [J]. AUTOMATICA, 2020, 119
  • [10] Stability and monotone convergence of generalised policy iteration for discrete-time linear quadratic regulations
    Chun, Tae Yoon
    Lee, Jae Young
    Park, Jin Bae
    Choi, Yoon Ho
    [J]. INTERNATIONAL JOURNAL OF CONTROL, 2016, 89 (03) : 437 - 450