Undiscounted control policy generation for continuous-valued optimal control by approximate dynamic programming

Cited by: 2

Authors:

Lock, Jonathan [1]

McKelvey, Tomas [1]

Affiliations:

[1] Chalmers Univ Technol, Dept Elect Engn, S-41296 Gothenburg, Sweden

Source:

INTERNATIONAL JOURNAL OF CONTROL | 2022 / Vol. 95 / No. 10

Keywords:

Approximate dynamic programming; control policy; undiscounted infinite-horizon; optimal control;

DOI:

10.1080/00207179.2021.1939892

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

We present a numerical method for generating the state-feedback control policy associated with general undiscounted, constant-setpoint, infinite-horizon, nonlinear optimal control problems with continuous state variables. The method is based on approximate dynamic programming, and is closely related to approximate policy iteration. Existing methods typically terminate based on the convergence of the control policy and either require a discounted problem formulation or demand that the cost function lie in a specific subclass of functions. The presented method extends existing termination criteria by requiring both the control policy and the resulting system state to converge, allowing for use with undiscounted cost functions that are bounded and continuous. This paper defines the numerical method, derives the relevant underlying mathematical properties, and validates the numerical method with representative examples. A MATLAB implementation with the shown examples is freely available.

Pages: 2854-2864

Number of pages: 11

References (18 in total):

[1] THE THEORY OF DYNAMIC PROGRAMMING
BELLMAN, R
[J]. BULLETIN OF THE AMERICAN MATHEMATICAL SOCIETY, 1954, 60 (06) : 503 - 515
[2] Bertsekas D., 2017, DYNAMIC PROGRAMMING, VI
[3] Approximate policy iteration: A survey and some new methods
Bertsekas D.P.
[J]. Journal of Control Theory and Applications, 2011, 9 (3) : 310 - 335
[4] Bertsekas D. P., 2012, Dynamic Programming and Optimal Control, VII
[5] EXISTENCE OF OPTIMAL STATIONARY POLICIES IN DETERMINISTIC OPTIMAL-CONTROL
BERTSEKAS, DP
SHREVE, SE
[J]. JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 1979, 69 (02) : 607 - 620
[6] Policy Approximation in Policy Iteration Approximate Dynamic Programming for Discrete-Time Nonlinear Systems
Guo, Wentao
Si, Jennie
Liu, Feng
Mei, Shengwei
[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (07) : 2794 - 2807
[7] Variable resolution discretization in optimal control
Munos, R
Moore, A
[J]. MACHINE LEARNING, 2002, 49 (2-3) : 291 - 323
[8] Önnheim, 2016, INTRO CONTINUOUS OPT
[9] What You Should Know About Approximate Dynamic Programming
Powell, Warren B.
[J]. NAVAL RESEARCH LOGISTICS, 2009, 56 (03) : 239 - 249
[10] Puterman M. L., 1979, Mathematics of Operations Research, V4, P60, DOI 10.1287/moor.4.1.60
