Sum-of-squares-based policy iteration for suboptimal control of polynomial time-varying systems

Cited by: 7

Authors:

Pakkhesal, Sajjad [1]

Shamaghdari, Saeed [1]

Affiliations:

[1] Iran Univ Sci & Technol, Dept Elect Engn, Tehran 1684613114, Iran

Source:

ASIAN JOURNAL OF CONTROL | 2022 / Vol. 24 / Issue 06

Keywords:

adaptive dynamic programming (ADP); optimal control; policy iteration; sum-of-squares (SOS) programming; time-varying systems; NONLINEAR-SYSTEMS;

DOI:

10.1002/asjc.2689

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

This paper investigates suboptimal control design for polynomial time-varying systems. It is known that the solution to this problem relies on the solution of the Hamilton-Jacobi-Bellman (HJB) equation, which is a nonlinear partial differential equation (PDE). A policy iteration (PI) algorithm is developed to solve the HJB equation. The policy evaluation step of this algorithm consists of a sum-of-squares (SOS) program, which is computationally tractable. This algorithm differs from previously known SOS-based adaptive dynamic programming (ADP) algorithms in that it is developed for time-varying systems. The convergence of the iterative algorithm and the global stability of the closed-loop system are proved. Finally, the effectiveness of the proposed algorithm is illustrated through two simulation examples.
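
For orientation, the HJB equation and policy iteration steps mentioned in the abstract can be sketched in their standard textbook form. Everything below is an assumption made for illustration and is not taken from the paper: the input-affine dynamics $\dot{x} = f(x,t) + g(x,t)u$, the cost integrand $q(x,t) + u^{\top} R u$ with $R \succ 0$, and the symbols $V$, $V_i$, $u_i$. The paper's exact formulation and SOS program may differ.

```latex
% Assumed setting (not from the paper): \dot{x} = f(x,t) + g(x,t)\,u,
% cost J = \int_{t}^{\infty} \big( q(x,\tau) + u^{\top} R\, u \big)\, d\tau, with R \succ 0.
\begin{align}
  % HJB equation for the optimal value function V(x,t)
  0 &= \frac{\partial V}{\partial t} + \nabla_x V^{\top} f + q
       - \tfrac{1}{4}\, \nabla_x V^{\top} g\, R^{-1} g^{\top} \nabla_x V, \\
  % Corresponding optimal feedback
  u^{*} &= -\tfrac{1}{2}\, R^{-1} g^{\top} \nabla_x V, \\
  % Policy evaluation for a given policy u_i (linear in V_i)
  0 &= \frac{\partial V_i}{\partial t} + \nabla_x V_i^{\top} \big( f + g\, u_i \big)
       + q + u_i^{\top} R\, u_i, \\
  % Policy improvement
  u_{i+1} &= -\tfrac{1}{2}\, R^{-1} g^{\top} \nabla_x V_i .
\end{align}
% One common SOS relaxation (an assumption here) replaces the policy
% evaluation equality by an inequality certified by a sum of squares:
%   -\Big( \frac{\partial V_i}{\partial t} + \nabla_x V_i^{\top}(f + g u_i)
%          + q + u_i^{\top} R u_i \Big) \ \text{is SOS},
% which yields an upper bound on the cost of u_i, hence a suboptimal design.
```

The policy evaluation step is linear in the unknown $V_i$, which is what makes a polynomial parameterization of $V_i$ amenable to a convex SOS program.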


Pages: 3022-3031

Number of pages: 10
